lutaml / expressir

Ruby parser for the ISO EXPRESS language

Implement Ruby-only parser for Expressir #156

Open ronaldtse opened 5 months ago

ronaldtse commented 5 months ago

EXPRESS is a data modeling language that provides schemas and entities. It has a well-defined EBNF grammar.

The grammar of EXPRESS is provided at this repository in the ANTLR grammar language:

“expressir” is an EXPRESS parser written in Ruby:

Expressir uses a C++ extension to do the parsing inside Ruby, through the ANTLR grammar provided above.

However, the C++ ANTLR processor has proven to be problematic. We need to develop a new front-end parsing engine in plain Ruby to replace the existing code.

This task is to:

ronaldtse commented 1 month ago

This means we can get rid of the "rice" interface which requires C++ compilation.

This task is necessary for parsing the STEPdev library of EXPRESS schemas and performing manipulation on them.

NOTE: It was a terrible (i.e. very terrible) mistake to use ANTLR to generate the parser, which has caused us much agony and costs in maintaining the parser.

chrim05 commented 4 weeks ago

Hi, I'm https://www.upwork.com/freelancers/~0135fea7d5f7083798 on Upwork (I didn't apply for the job because I don't have enough connects), but here is my approach:

First, we need to remove ANTLR from the project completely. Then we can start building a top-down recursive descent parser as the front end, which is easier to maintain. Alongside the parser we also need a tokenizer that recognizes the tokens, so that the parser can perform syntax checks. The parser will generate a node-based tree, which will then be converted to the corresponding Ruby classes, as stated in the task ("convert every EXPRESS grammar object into its corresponding Ruby class, as already provided in the Expressir Ruby source"). Let me know, thanks.
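A minimal sketch of the tokenizer plus recursive descent approach described above, covering only a tiny subset of EXPRESS (a schema containing entities with explicit attributes). The `Schema`, `Entity`, and `Attribute` structs, the `tokenize` helper, and the `Parser` class are hypothetical stand-ins for illustration, not Expressir's actual classes or API; built-in types such as `REAL` are treated as plain identifiers here to keep the example short.

```ruby
require "strscan"

# Hypothetical stand-ins for Expressir's model classes (assumed names).
Schema    = Struct.new(:id, :entities)
Entity    = Struct.new(:id, :attributes)
Attribute = Struct.new(:id, :type)
Token     = Struct.new(:kind, :text)

KEYWORDS = %w[SCHEMA END_SCHEMA ENTITY END_ENTITY].freeze

# Tokenizer: turns source text into a flat list of tokens.
# Keywords are matched case-insensitively and normalized to upper case.
def tokenize(source)
  scanner = StringScanner.new(source)
  tokens = []
  until scanner.eos?
    if scanner.skip(/\s+/)
      next
    elsif (word = scanner.scan(/[A-Za-z][A-Za-z0-9_]*/))
      keyword = KEYWORDS.include?(word.upcase)
      tokens << Token.new(keyword ? :keyword : :identifier, keyword ? word.upcase : word)
    elsif (symbol = scanner.scan(/[;:]/))
      tokens << Token.new(:symbol, symbol)
    else
      raise ArgumentError, "unexpected character #{scanner.peek(1).inspect} at offset #{scanner.pos}"
    end
  end
  tokens
end

# Recursive descent parser: one method per grammar rule.
class Parser
  def initialize(tokens)
    @tokens = tokens
    @pos = 0
  end

  # schema = "SCHEMA" id ";" { entity } "END_SCHEMA" ";"
  def parse_schema
    expect(:keyword, "SCHEMA")
    id = expect(:identifier).text
    expect(:symbol, ";")
    entities = []
    entities << parse_entity while peek?(:keyword, "ENTITY")
    expect(:keyword, "END_SCHEMA")
    expect(:symbol, ";")
    Schema.new(id, entities)
  end

  private

  # entity = "ENTITY" id ";" { attribute } "END_ENTITY" ";"
  def parse_entity
    expect(:keyword, "ENTITY")
    id = expect(:identifier).text
    expect(:symbol, ";")
    attributes = []
    attributes << parse_attribute while peek?(:identifier)
    expect(:keyword, "END_ENTITY")
    expect(:symbol, ";")
    Entity.new(id, attributes)
  end

  # attribute = id ":" type_id ";"
  def parse_attribute
    id = expect(:identifier).text
    expect(:symbol, ":")
    type = expect(:identifier).text
    expect(:symbol, ";")
    Attribute.new(id, type)
  end

  # True if the current token has the given kind (and text, when given).
  def peek?(kind, text = nil)
    token = @tokens[@pos]
    !token.nil? && token.kind == kind && (text.nil? || token.text == text)
  end

  # Consumes and returns the current token, or raises a syntax error.
  def expect(kind, text = nil)
    unless peek?(kind, text)
      found = @tokens[@pos]&.text || "end of input"
      raise ArgumentError, "expected #{text || kind}, found #{found}"
    end
    token = @tokens[@pos]
    @pos += 1
    token
  end
end

source = "SCHEMA example; ENTITY point; x : REAL; y : REAL; END_ENTITY; END_SCHEMA;"
schema = Parser.new(tokenize(source)).parse_schema
puts schema.entities.first.attributes.map(&:id).inspect
```

Because each grammar rule maps to one method, new EXPRESS constructs could be added rule by rule, and the structs could be swapped for the existing Expressir model classes once the mapping is known.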

adilmahmoodc commented 4 weeks ago

Hi, my Upwork profile is https://www.upwork.com/fl/adilm8. Here is my approach: I'll review the existing ANTLR grammar and the Expressir structure. Then I will convert the EXPRESS grammar from ANTLR to parslet, defining the grammar rules in Ruby to match the existing setup. Using parslet, a Parsing Expression Grammar (PEG) library, I will check that the grammar is accurately translated. The parsed data will be mapped to the existing Expressir Ruby classes to maintain full compatibility. Finally, I'll run all RSpec tests, adding new ones where needed, to confirm that the new parser works correctly.

micwonder commented 4 weeks ago

Hello, I hope this message finds you well. I came across the GitHub issue regarding the development of a Ruby-based parser to replace the ANTLR parser in the Expressir gem, and I would like to express my interest in contributing to this project. Having reviewed the requirements, I propose the following approach:

  1. Parser Development: I will utilize the parslet gem to create a Ruby parsing front-end for the EXPRESS language. This will involve defining grammar rules that closely match the existing ANTLR grammar while ensuring maintainability and performance.
  2. Object Mapping: Each parsed object will be mapped to its corresponding Ruby class as defined in the Expressir library. This will ensure compatibility with existing structures and functionalities.
  3. Testing: I will run all existing RSpec tests to confirm that the new parser functions correctly and meets performance benchmarks comparable to the previous implementation. Additionally, I will develop a benchmark suite to demonstrate that performance is at least on par with the ANTLR-based parser.
  4. Documentation and PR: To demonstrate my approach, I plan to create a pull request once I have a working prototype. This will include detailed documentation to facilitate review and integration.

I am enthusiastic about this opportunity and believe my skills align well with the project requirements. Please let me know if you would like to discuss this further or if there are any specific guidelines you would like me to follow. Thank you for considering my proposal. Looking forward to hearing from you soon!
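The testing step in the proposal above (item 3) could be sketched as a small harness that first checks that two parsers produce identical output and then times each one with Ruby's standard `Benchmark` module. The `compare_parsers` method and the two lambdas are hypothetical placeholders, not Expressir APIs; in practice the callables would wrap the ANTLR-based parser and the new Ruby parser.

```ruby
require "benchmark"

# Parsers are passed in as callables so the harness does not depend on any
# real Expressir API. Raises if the candidate's output ever differs from the
# reference's, then returns the wall-clock seconds spent by each parser.
def compare_parsers(reference:, candidate:, sources:, iterations: 20)
  sources.each do |source|
    expected = reference.call(source)
    actual = candidate.call(source)
    raise "output mismatch for #{source.inspect}" unless expected == actual
  end

  time = lambda do |parser|
    Benchmark.realtime do
      iterations.times { sources.each { |source| parser.call(source) } }
    end
  end

  { reference: time.call(reference), candidate: time.call(candidate) }
end

# Placeholder "parsers" (assumptions for illustration): both split the input
# into words, by different means.
reference = ->(source) { source.scan(/\w+/) }
candidate = ->(source) { source.split(/\W+/).reject(&:empty?) }

timings = compare_parsers(
  reference: reference,
  candidate: candidate,
  sources: ["SCHEMA example; END_SCHEMA;"]
)
printf("reference: %.6fs, candidate: %.6fs\n", timings[:reference], timings[:candidate])
```

Checking equality before timing keeps a fast but incorrect parser from looking like an improvement.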