Open Foadsf opened 3 years ago
I partly agree. The fact that so many parsers exist is basically a result of the long history. Historically there were no libraries to support this, so we wrote a parser every time a new file format was needed. Often the parser had to be reverse engineered because many formats are not documented anywhere. This, and the fact that formats may even change, adds to the challenge. Currently there is no major activity to write new parsers, hence there are no resources in use that could be moved to this new strategy. It would be great to see an all-to-all-formats auxiliary program arise from the open source community. Right now there seem to be parsers for -to-Elmer, -to-foam etc., but no generic tool.
Thanks Peter @raback for the response. You touched upon some important issues. Indeed, many of the file formats in use have changed, and keeping the parsers up to date is not easy. Using off-the-shelf parser generators, we could also read other formats such as ANSYS's APDL and ABAQUS/CalculiX input files.
We don't have to do this overnight or all at once. We may start with ElmerSolver's .sif files, as they, in my experience, cause the most headaches at the moment, and then move on to the others. I will help if you support the idea, and I believe it will lead to a much more stable and maintainable parser.
P.S. Here in this repository you may see a collection of lexer/parser grammar files that we can start with.
The more I dig into the Elmer code base and the more I use the tool, the more I'm surprised that the development team is dealing with massive issues that are taking a lot of their time and effort. For example, the Elmer team is building a huge number of parsers:
ElmerGrid:
ElmerSolver:
General
This is really impressive, but as a result it takes a very complicated effort to maintain all of these manually, while the developers should actually be focusing on the core technology: numerically solving complex systems of PDEs. Elmer's parsers are also very fragile. A very small typo can cause a segmentation fault without any further error message, and small syntax mistakes might take hours to debug.
My proposal to solve the issue is to use parser generators. For example, ANTLR4 seems to be one of the most widely used options at the moment. What we need to do:
.g4 lexer and parser grammar files for the above file formats / languages
.gitmodules for the time being
A nice and easy tutorial for ANTLR can be seen here.
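To make the goal a bit more concrete, here is a minimal sketch of the behavior I would expect from any replacement parser: a malformed input should produce an error that names the offending line instead of a crash. Please note the assumptions: this is plain hand-written Python, not ANTLR output, and it only handles a simplified, assumed subset of a .sif-like syntax (a section name on its own line, `Key = Value` lines, a closing `End`, and `!` starting a comment). The names `parse_sif` and `SifSyntaxError` are hypothetical and do not exist in Elmer.

```python
class SifSyntaxError(Exception):
    """Hypothetical error type: carries a line number instead of crashing."""


def parse_sif(text):
    """Parse an assumed, simplified subset of a .sif-like file.

    Returns a list of (section_name, {key: value}) tuples.
    Raises SifSyntaxError with the offending line number on malformed input.
    """
    sections = []
    current = None  # (name, entries, start_line) of the open section, if any

    for lineno, raw in enumerate(text.splitlines(), start=1):
        # Assumption: '!' starts a comment (quoted strings are not handled here).
        line = raw.split("!", 1)[0].strip()
        if not line:
            continue

        if current is None:
            # Outside a section, only a section header is acceptable.
            if line.lower() == "end":
                raise SifSyntaxError(f"line {lineno}: 'End' without an open section")
            if "=" in line:
                raise SifSyntaxError(
                    f"line {lineno}: assignment outside a section: {line!r}"
                )
            current = (line, {}, lineno)
        elif line.lower() == "end":
            sections.append((current[0], current[1]))
            current = None
        else:
            key, sep, value = line.partition("=")
            if not sep or not key.strip() or not value.strip():
                raise SifSyntaxError(
                    f"line {lineno}: expected 'Key = Value', got {line!r}"
                )
            current[1][key.strip()] = value.strip()

    if current is not None:
        raise SifSyntaxError(
            f"line {current[2]}: section {current[0]!r} is missing 'End'"
        )
    return sections


if __name__ == "__main__":
    good = "Simulation\n  Max Output Level = 5\nEnd\n"
    print(parse_sif(good))

    bad = "Simulation\n  Max Output Level 5\nEnd\n"  # typo: '=' is missing
    try:
        parse_sif(bad)
    except SifSyntaxError as err:
        print("syntax error:", err)
```

A generated parser would get the same kind of line-aware diagnostics from the grammar itself, without us maintaining code like the above by hand for every format.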