BNFC / bnfc

BNF Converter
http://bnfc.digitalgrammars.com/

Space awareness for operators #208

Closed zhaar closed 5 years ago

zhaar commented 7 years ago

Hi,

I'm trying to define a grammar for a language that uses the spaces around identifiers to accept both of the following strings unambiguously:

lhs > rhs

lhs> rhs

where the first line is two identifiers with an infix operator in the middle, and the second is a postfix expression (lhs>) followed by another identifier.

Similarly with prefix operators:

lhs . rhs 

lhs .rhs

Is this possible? If so, how? If not, why not?

SzymonPajzert commented 7 years ago

Are these identifiers defined beforehand, or do you want them to be customisable by the user?

zhaar commented 7 years ago

defined beforehand

Edit: I misread your reply. I meant that the identifiers are defined by the user; the operators are defined beforehand.

SzymonPajzert commented 7 years ago

Sorry for the late reply. You have probably already solved this, but for the sake of completeness I'll post my answer.

I don't think it's possible to do this purely in BNFC. In this case I'd suggest modifying the lexer so that it does not split words on some subset of identifiers. How to do that depends on the backend you use.

SzymonPajzert commented 7 years ago

Using Haskell, I didn't change the generated lexer but wrote a wrapper around it. Take a look at the code I wrote some time ago to split an SQL script into a list of commands.

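module Compile.Parse (parseMany) where

-- Generated by BNFC; exact module names depend on the BNFC version and flags
import Grammar.Par (pProgram, myLexer)
import Grammar.Lex (Token(..), Tok)
import Grammar.Abs (Program)
import Grammar.ErrM (Err)
import Data.List.Split (splitWhen) -- from the "split" package

-- I like this OCaml operator; it is not in the Prelude, so define it here
(|>) :: a -> (a -> b) -> b
x |> f = f x

-- The token that the generated lexer produces for ";"
commandDelim :: Tok
commandDelim = let [PT _ token] = myLexer ";" in token

isCommandDelim :: Token -> Bool
isCommandDelim (PT _ token) = token == commandDelim
isCommandDelim _            = False

-- Lex the whole file, then split the token stream at every ";"
lexMany :: String -> [[Token]]
lexMany file = myLexer file |> splitWhen isCommandDelim

parseMany :: String -> [Err Program]
parseMany = map pProgram . lexMany -- pProgram comes from BNFC as well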

What it does is apply the lexer to the contents of the file and then post-process the resulting tokens, without modifying the lexer itself. As I understand your problem, you could do something similar. The problem with default BNFC is that it is not whitespace-sensitive, so you could replace every space with some reserved symbol before passing the input to the lexer, which makes the spaces survive as tokens. After that you could traverse the list of [Token] and, whenever you come across one of your predefined operators, look at the surrounding spaces to decide what kind of operator it is (prefix, infix, or postfix).
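
A minimal, self-contained sketch of that token-traversal idea follows. It does not use BNFC at all: `Tok`, `Fixity`, `Classified`, `tokenize` and `classify` are hypothetical names made up for illustration, and the hand-written `tokenize` merely stands in for a generated lexer that has been made to keep spaces as tokens. It also assumes single-character operators and treats the start and end of the input as whitespace.

```haskell
import Data.Char (isAlphaNum, isSpace)

-- Hypothetical token type standing in for BNFC's generated Token.
data Tok = Ident String | Op String | Space
  deriving (Eq, Show)

data Fixity = Prefix | Infix | Postfix
  deriving (Eq, Show)

-- Output tokens: whitespace is dropped, operators carry an inferred fixity.
data Classified = CIdent String | COp Fixity String
  deriving (Eq, Show)

-- Whitespace-preserving lexer: a run of spaces becomes one Space token,
-- a run of alphanumerics an Ident, any other character a one-character Op.
tokenize :: String -> [Tok]
tokenize [] = []
tokenize s@(c:cs)
  | isSpace c    = Space : tokenize (dropWhile isSpace cs)
  | isAlphaNum c = let (w, rest) = span isAlphaNum s in Ident w : tokenize rest
  | otherwise    = Op [c] : tokenize cs

-- Walk the token list, remembering whether the previous token was a space,
-- and classify each operator by the whitespace on either side of it.
classify :: [Tok] -> [Classified]
classify = go True              -- start of input counts as whitespace
  where
    go _ []             = []
    go _ (Space : ts)   = go True ts
    go _ (Ident s : ts) = CIdent s : go False ts
    go spaceBefore (Op s : ts) = COp fx s : go False ts
      where
        -- end of input counts as whitespace too
        spaceAfter = take 1 ts `elem` [[], [Space]]
        fx | not spaceBefore && spaceAfter = Postfix  -- "lhs> rhs"
           | spaceBefore && not spaceAfter = Prefix   -- "lhs .rhs"
           | otherwise                     = Infix    -- "lhs > rhs", "lhs>rhs"
```

With this sketch, `classify (tokenize "lhs> rhs")` tags `>` as `Postfix`, while `classify (tokenize "lhs > rhs")` tags it as `Infix`. The classified list could then be mapped to distinct operator tokens (one per fixity) before being handed to the parser, so the grammar itself stays unambiguous.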

andreasabel commented 5 years ago

This feature is out of scope for BNFC.