Closed andreasabel closed 5 years ago
Any connection to #77?
Looking at the docs, it seems that bnfc does implement its specification:
But, if the token t following of is not an opening curly bracket, a bracket is inserted, and the start column of t is remembered as the position at which the elements of the layout list must begin. Semicolons are inserted at those positions. When a token is eventually encountered left of the position of t (or an end-of-file), a closing bracket is inserted at that point.
However, it seems that the specification does not match intuition. I somehow miss a requirement that the token following the layout keyword be indented more, which would give sensible behavior (and avoid this issue).
I believe the layout mechanism was originally added to implement Haskell-ish layout syntax, where the following things are acceptable:
do
putStrLn "Hello World"
and even
do
putStrLn "Hello World"
(I'm not saying one should do that, but it is valid Haskell...)
So, even if I agree that it is kind of counter-intuitive, it seems that it is the intended behavior.
But Haskell rejects
{-# LANGUAGE NondecreasingIndentation #-}
test = do
do
putStrLn "Hello, World!"
whereas bnfc's layout mechanism accepts it. Considering the grammar
layout "do";
Decl. Decl ::= Ident "=" Exp;
Var. Exp ::= Ident;
Do. Exp ::= "do" "{" [Exp] "}" ;
separator Exp ";" ;
and the test file
test = do
do
test
the generated parser responds:
Parse Successful!
[Abstract Syntax]
Decl (Ident "test") (Do [Do [Var (Ident "test")]])
[Linearized tree]
test = do {
  do {
    test
  }
}
Maybe instead of "indented more" I should have said "not indented less", which corresponds to Haskell's NondecreasingIndentation.
I am fixing this towards the following rule: a new layout block needs to be indented strictly more than its enclosing layout block. This means that the next token after the layout keyword only determines the new indentation column if it is strictly more indented than the previous indentation column. Otherwise, the next token immediately closes the new layout block (and continues in the previous block, or even closes more blocks).
Given this grammar,
Modl. Module ::= "module" Ident "where" "{" [Module] "}" ;
separator Module ";" ;
layout "where" ;
the example
module Top where
module A where
module B where
module B1 where
module B1A where
module B2 where
module C where
module C1 where
parses as intended as
module Top where {
  module A where {
  } ;
  module B where {
    module B1 where {
      module B1A where {
      }
    } ;
    module B2 where {
    }
  } ;
  module C where {
    module C1 where {
    }
  }
}
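The strict-indentation rule described above can be sketched in a few lines. This is only an illustration, not bnfc's actual layout resolver: the function name `resolve_layout`, the `strict` flag, and the whitespace-based tokenization are all assumptions made for the sketch, and it ignores explicit braces, comments, and string literals. The indentation used in the two sample inputs is likewise an assumption. With `strict=False` the block is opened unconditionally at the column of the next token, as in the documentation text quoted earlier in the thread.

```python
import re


def resolve_layout(source, layout_words, strict=True):
    """Insert '{', ';' and '}' based on indentation (hypothetical sketch).

    strict=True : a block opens only if the token after the layout keyword
                  is indented strictly more than the enclosing block;
                  otherwise an empty block '{ }' is emitted at once.
    strict=False: a block always opens at the column of that token.
    """
    out = []
    stack = []       # start columns of the currently open implicit blocks
    pending = False  # True if the previous token was a layout keyword

    for line in source.splitlines():
        for i, m in enumerate(re.finditer(r"\S+", line)):
            text, col = m.group(), m.start()
            opened = False
            if pending:
                pending = False
                enclosing = stack[-1] if stack else -1
                if not strict or col > enclosing:
                    out.append("{")
                    stack.append(col)
                    opened = True
                else:
                    out += ["{", "}"]  # empty block, closed immediately
            if i == 0 and not opened:
                # First token on a line: close blocks it is left of,
                # then separate from the previous element if aligned.
                while stack and col < stack[-1]:
                    stack.pop()
                    out.append("}")
                if stack and col == stack[-1]:
                    out.append(";")
            out.append(text)
            pending = text in layout_words

    if pending:                    # layout keyword was the last token
        out += ["{", "}"]
    out += ["}"] * len(stack)      # end of file closes all open blocks
    return " ".join(out)


# Sample inputs; the indentation here is assumed for illustration.
MODULES = """\
module Top where
module A where
module B where
  module B1 where
    module B1A where
  module B2 where
module C where
  module C1 where
"""

NESTED_DO = "test = do\n  do\n  test\n"
```

Under these assumptions, `resolve_layout(MODULES, {"where"})` produces the same bracket structure as the intended parse above, printed on one line. For `NESTED_DO` with layout keyword `do`, `strict=False` gives `test = do { do { test } }`, while `strict=True` gives `test = do { do { } ; test }`, since the inner `do` is followed by a token that is not indented more than the enclosing block.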
Consider the grammar file Tree.cf:
and a sample file tree.txt:
I would have expected the AST
printed e.g. as
but the bnfc-generated parser produces
printed as
I think something is wrong with the handling of empty blocks following a layout keyword.