Closed zhaar closed 5 years ago
Are these identifiers defined beforehand, or do you want them to be customisable by the user?
On Mon, 18 Sep 2017 at 00:48 Zephyz Zhaar notifications@github.com wrote:
Hi,
I'm trying to define a grammar for a language that makes use of spaces around identifiers, in order to accept both of these strings unambiguously:
lhs > rhs
lhs> rhs
where the first line is two identifiers with an infix operator in the middle, and the second one is a postfix expression (lhs>) followed by another identifier.
Similarly with prefix:
lhs . rhs
lhs .rhs
Is it possible? If so, how? And if not, why not?
defined beforehand
Edit: I misread your reply. I meant that the identifiers are defined by the user; the operators are defined beforehand.
Sorry for the long reply. You have probably already solved this, but for the sake of completeness I'll post my answer.
I don't think it's possible to do this purely in BNFC. In this case I'd suggest modifying the lexer so that it does not split words on some subset of identifiers. How to do that depends on the backend you use.
With Haskell, I didn't change the generated lexer but wrote a wrapper around it. Take a look at the code I wrote some time ago to split an SQL script into a list of commands.
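module Compile.Parse (parseMany) where

import Data.List.Split (splitWhen)      -- from the "split" package
-- The Grammar.* modules are generated by BNFC; the exact module names
-- (Abs, ErrM, Lex) are assumed here and may differ between BNFC versions.
import Grammar.Abs (Program)
import Grammar.ErrM (Err)
import Grammar.Lex (Tok, Token (..))
import Grammar.Par (pProgram, myLexer)

-- The token the generated lexer produces for ";".
commandDelim :: Tok
commandDelim = let [PT _ token] = myLexer ";" in token

isCommandDelim :: Token -> Bool
isCommandDelim (PT _ token) = token == commandDelim
isCommandDelim _ = False

-- Lex the whole file, then split the token stream at every ";".
-- (Originally written with an OCaml-style |> pipe, which is not in base.)
lexMany :: String -> [[Token]]
lexMany = splitWhen isCommandDelim . myLexer

-- Parse each command separately with the BNFC-generated parser.
parseMany :: String -> [Err Program]
parseMany = map pProgram . lexMany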
What it does is apply the lexer to the content of the file and then modify the tokens, without modifying the lexer itself. As I understand your problem, you could replace every space with some reserved symbol that is then passed to the lexer (the problem with default BNFC is that it is not whitespace-sensitive). After that you could traverse the [Token] list and, on coming across any of your predefined operators, look at the surrounding spaces to decide what kind of operator it is (prefix, infix, or postfix).
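To make the idea concrete, here is a minimal standalone sketch of the classification step. It does not use the BNFC-generated lexer: the types `SpacedTok` and `Fixity`, the functions `tokenize` and `classify`, and the set of operator characters are all made up for illustration. It also assumes that an operator with no space on either side (`lhs>rhs`) should count as infix, which may not be what you want.

```haskell
import Data.Char (isAlphaNum, isSpace)

-- Hypothetical types for this sketch; not the ones BNFC generates.
data Fixity = Prefix | Infix | Postfix
  deriving (Eq, Show)

data SpacedTok
  = Ident String
  | Op Fixity String
  deriving (Eq, Show)

-- Operator characters assumed for the example.
isOpChar :: Char -> Bool
isOpChar c = c `elem` "<>.+-*/"

-- Decide the fixity from the whitespace before and after the operator.
classify :: Bool -> Bool -> Fixity
classify True False = Prefix   -- space before only, as in "lhs .rhs"
classify False True = Postfix  -- space after only, as in "lhs> rhs"
classify _ _ = Infix           -- spaces on both sides or on neither

-- Scan the input, remembering whether whitespace preceded the current
-- position. The start and end of the input count as whitespace.
tokenize :: String -> [SpacedTok]
tokenize = go True
  where
    go _ [] = []
    go spaceBefore s@(c : cs)
      | isSpace c = go True cs
      | isAlphaNum c =
          let (name, rest) = span isAlphaNum s
           in Ident name : go False rest
      | isOpChar c =
          let (op, rest) = span isOpChar s
              spaceAfter = case rest of
                [] -> True
                (r : _) -> isSpace r
           in Op (classify spaceBefore spaceAfter) op : go False rest
      | otherwise = error ("unexpected character: " ++ [c])
```

With these definitions, `tokenize "lhs > rhs"` gives `[Ident "lhs",Op Infix ">",Ident "rhs"]`, while `tokenize "lhs> rhs"` gives `[Ident "lhs",Op Postfix ">",Ident "rhs"]` and `tokenize "lhs .rhs"` gives `[Ident "lhs",Op Prefix ".",Ident "rhs"]`. In a real setup the classified operators would then have to be mapped onto distinct terminals that the grammar can tell apart, which is the part that depends on your backend.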
This feature is out of scope for BNFC.