Open nightlark opened 3 years ago
So my understanding, having played with the ANTLR files a bit myself, is:
Both SLEIGH compilers are actively maintained and perform the same function: converting .slaspec files to the .sla format. They may give slightly different error messages but should produce identical .sla files from the same .slaspec input. See this test.
The yacc version is built on the decompiler C++ code, and the decompiler can be built to read in .sla files using this infrastructure. In the main build, however, the decompiler doesn't include this infrastructure but gets all its p-code from the SLEIGH engine in the Java-based part of Ghidra. This engine doesn't really care which SLEIGH compiler produced the .sla file it's using; however, it considers .sla files to be ephemeral and builds them automatically using the Java/ANTLR compiler if they're not already present.
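To make the "ephemeral .sla" idea concrete, here is a rough sketch of build-if-missing logic. It is not Ghidra's actual code: `ensure_sla` and the `compile_fn` callback are hypothetical names, and the callback stands in for whichever compiler you invoke.

```python
from pathlib import Path
from typing import Callable


def ensure_sla(slaspec: Path, compile_fn: Callable[[Path, Path], None]) -> Path:
    """Return the .sla path next to `slaspec`, compiling it only if it is absent.

    `compile_fn(src, dst)` is a hypothetical stand-in for a SLEIGH compiler
    invocation (yacc-based or ANTLR-based); it is not a real Ghidra API.
    """
    sla = slaspec.with_suffix(".sla")
    if not sla.exists():
        # The .sla is treated as a disposable build product: regenerate on demand.
        compile_fn(slaspec, sla)
    return sla
```

Under this model, deleting a .sla file is harmless, since it just gets rebuilt the next time it's needed.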
For tools that work with SLEIGH source, you can use whichever compiler code base is more convenient: yacc for native code or ANTLR for Java-based. Any proposed change to SLEIGH itself would need to be implemented in both code bases.
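Since both compilers are expected to produce identical output, one way to sanity-check a change made in both code bases is to compile the same .slaspec with each and compare the results byte for byte. A minimal sketch, assuming you already have the two .sla files on disk (the function name is made up for illustration):

```python
import filecmp
from pathlib import Path


def sla_outputs_match(yacc_sla: Path, antlr_sla: Path) -> bool:
    """Return True if the two compiled .sla files are byte-for-byte identical."""
    # shallow=False forces a content comparison instead of trusting file metadata.
    return filecmp.cmp(yacc_sla, antlr_sla, shallow=False)
```

Usage would look like `sla_outputs_match(Path("native/x86.sla"), Path("java/x86.sla"))`, with those paths being placeholders for wherever each compiler wrote its output.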
I noticed that there are two types of grammar files in the repository: ones using ANTLRv3 that seem to be used by the Java part of Ghidra, and yacc grammar files that are part of the C++ decompiler. I'm trying to figure out what the difference between them is.
I basically don't know anything about this part of the code, and the documentation on them (particularly the yacc files in the decompiler) is kinda lacking. What I could really use is someone to ELI5 how the various parsers/grammars in Ghidra fit together.