kaby76 / AntlrExamples

Some examples of parsing with Antlr

Question about 'EDIT' statement #2

Closed Heidernlee closed 4 years ago

Heidernlee commented 4 years ago

Many thanks for providing this Antlr4 grammar. I came here from this issue: https://github.com/antlr/grammars-v4/issues/1752. It's so hard to find a good PL1 parser in 2020...

This G4 file parsed 80% of my example PL1 source code successfully, thanks again. Here's my question about the G4 file. The EBNF for "EDIT" looks like this: ["EDIT" { "(" data_list ")" "(" format_list ")" }+]. There's a [+] in it, which I think means the data_list/format_list pair appears one or more times. But in your G4 file the rule is [editdataspec : '(' datalist ')' '(' editformatlist ')'], with no [+].

Is there any reason the "EDIT" grammar is defined like that?

kaby76 commented 4 years ago

@Heidernlee Thanks for the note. I haven't had much time to continue updating that grammar for PL1, but I will at some point. Yes, that looks like a bug. Can you tell me where the EBNF for PL1 is? That would be very useful. The grammar in this Git repo was imported from the Yacc grammar of an open-source SourceForge program. It could be that the import was bogus, or that I ran a transform over the imported grammar that wasn't working. At the time, I was doing things through VS2019, and the transforms were buggy. I now have a command-line tool that can generate reproducible results from the open-source Yacc grammar. Or, I can just redo the grammar from scratch from the EBNF you mentioned.

Heidernlee commented 4 years ago

@kaby76 Thanks for your response. The EBNF for PL1 is from here: https://www.cs.vu.nl/grammarware/browsable/os-pli-v2r3/ (it may be the only PL1 grammar on the Internet in 2020...). I'm just getting started with grammar parsing and Antlr4, and the meaning of LL/LR/LALR has always confused me. I noticed that other Antlr4 parsers (G4 files) usually use forms like [XX]+ or [XX]*, which produce a tree of a parent with its children. But your parser uses a lot of recursive rules instead of [+] or [*]. Does that mean your G4 is not LL but LALR or something else?

kaby76 commented 4 years ago

Thanks for the link.

I just checked the editdataspec rule again, and I think it's okay. The rule came from converting the Yacc grammar into Antlr. I just hadn't transformed the grammar to EBNF. The EBNF syntax for grammars (e.g., "symbol+" or "symbol*") not only helps to make the grammar easier to read, but results in a faster parse and a flatter parse tree, which you mention.

This grammar is LR because it was derived directly from a Yacc/Bison grammar. Strictly speaking, it's not LL because it has left recursion. I did have to rewrite the grammar to remove indirect left recursion, but I left direct left recursion in place where Antlr could handle it. Internally, Antlr rewrites direct left recursion in a rule into an EBNF-style loop before generating an LL parser (editdataspec : '(' datalist ')' '(' editformatlist ')' | editdataspec '(' datalist ')' '(' editformatlist ')' ; => editdataspec : ('(' datalist ')' '(' editformatlist ')')+ ;). It's possible to convert LR to LL and vice versa, with the caveat that the parse trees may not be the same.
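As a rough illustration of that caveat, the sketch below builds the tree shape each form of the rule would naturally describe. It is not Antlr code and does not model Antlr's internals: `pair` is a hypothetical stand-in for one `'(' datalist ')' '(' editformatlist ')'` group, and the function names are made up for this example.

```python
# Hypothetical illustration only: "pair" stands in for one
# '(' datalist ')' '(' editformatlist ')' group.

def tree_left_recursive(pairs):
    """Tree shape for the left-recursive form: spec : pair | spec pair ;"""
    if not pairs:
        raise ValueError("at least one pair is required")
    tree = ("spec", pairs[0])
    for pair in pairs[1:]:
        # Each further pair wraps the tree built so far (spec -> spec pair).
        tree = ("spec", tree, pair)
    return tree


def tree_ebnf(pairs):
    """Tree shape for the EBNF form: spec : pair+ ;"""
    if not pairs:
        raise ValueError("at least one pair is required")
    # All pairs become direct children of a single node.
    return ("spec", *pairs)


def depth(tree):
    """Number of nested 'spec' nodes on the deepest path."""
    if not isinstance(tree, tuple):
        return 0
    return 1 + max((depth(child) for child in tree[1:]), default=0)


def leaves(tree):
    """Leaves in left-to-right order, i.e. the input that was matched."""
    if not isinstance(tree, tuple):
        return [tree]
    result = []
    for child in tree[1:]:
        result.extend(leaves(child))
    return result


if __name__ == "__main__":
    pairs = ["p1", "p2", "p3"]
    print(tree_left_recursive(pairs))  # nested to the left
    print(tree_ebnf(pairs))            # flat
```

Both trees cover the same pairs in the same order; only the nesting differs, so code that walks the tree has to know which form the grammar uses.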

I think my plan for a PL1 grammar will be to scrape the Lämmel/Verhoef grammar directly from the website, update the current grammar in Antlr with EBNF, then compare this with the imported Yacc grammar. I will likely also go to the Language Reference doc by IBM and try to scrape a grammar from that with a tool. The link to Tom Everett's version for PL1 is gone, but I have enough versions here to get a good PL1 grammar. I believe in scraping grammars from elsewhere rather than typing in grammars from scratch, in order to reduce errors.

It's good to learn about the various parsing algorithms. The Dragon book is a great book, so definitely pick up a copy.

Heidernlee commented 4 years ago

@kaby76 Thanks for your Response. Now I finally get the difference between LL and LR.....

After testing with many PL1 programs, I found something weird: parsing is OK when I use the G4 file directly, but an exception is thrown when I build the parse tree from Java. For example:

A very simple piece of PL1 source code:

PUT FILE (SYSPRINT)
EDIT ('AS','AB') (2(B,A,COL(119),A))
(C,C) (2(B,A(120))) ;
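This statement is a handy test case for the `+` discussed earlier, because the text after EDIT holds two (data_list)(format_list) pairs. The sketch below is only a rough check of that, assuming single-quoted string literals and balanced parentheses. It is not a PL1 lexer, and the helper names `top_level_groups` and `edit_pairs` are hypothetical.

```python
def top_level_groups(text):
    """Return the top-level parenthesised groups in text, in order."""
    groups = []
    depth = 0
    start = None
    in_string = False
    for i, ch in enumerate(text):
        if ch == "'":
            # Assumption: string literals are delimited by single quotes.
            in_string = not in_string
        elif in_string:
            continue  # ignore parentheses inside string literals
        elif ch == "(":
            if depth == 0:
                start = i
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                raise ValueError("unbalanced ')'")
            if depth == 0:
                groups.append(text[start:i + 1])
    if depth != 0 or in_string:
        raise ValueError("unbalanced parentheses or unterminated string")
    return groups


def edit_pairs(edit_tail):
    """Group the text after EDIT into (data_list, format_list) pairs."""
    groups = top_level_groups(edit_tail)
    if not groups or len(groups) % 2 != 0:
        raise ValueError("expected one or more (data_list)(format_list) pairs")
    return list(zip(groups[0::2], groups[1::2]))


if __name__ == "__main__":
    tail = "('AS','AB') (2(B,A,COL(119),A))\n(C,C) (2(B,A(120)))"
    for data_list, format_list in edit_pairs(tail):
        print(data_list, "->", format_list)
```

For the statement above this yields two pairs, so a rule that allows only a single pair would not be expected to cover the whole EDIT clause.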

Using the G4 file directly gives a perfect parse tree: [image]

But when I tried to read it with Java: [image]

the following exception was thrown: [image]

Is the same exception thrown in your environment?

BTW, I've already bought the Dragon Book from Rakuten Marketplace. Many thanks.

Heidernlee commented 4 years ago

@kaby76 Sorry, I fixed it. It was caused by the wrong dependency version of a Java package. I'll keep working on this PL1 parser, and I hope one day it can be in EBNF format ^ ^

Mainframe modernization has been a very well-paid job in Japan these past few years. My next step is an RPGII parser... I hope my boss gives me a break ~