Open azreika opened 4 years ago
We had actually a semantic checker in the first refactoring phase of RAM - however, we never really implemented a check and removed it at a later point in time.
It would be great to implement one which really checks the semantics of RAM.
There are various aspects that can be checked:
Souffle would benefit with a RAM semantic checker that ensures that the produced RAM code is valid, particularly just after AST to RAM translation.
Adding a semantic checker would mean that issues that arise from the translation process (or an intermediate RAM transformation) rather than the actual Datalog code could be caught easily. More worrisome is that incorrect RAM transformations might not produce an error in the final output, but instead just silently produce incorrect output on some input datasets.
As a particular case study, see issue #1499. When the interpreter was run, no error was thrown - the incorrect output was just produced on a test. Hours (many many long hours) were spent trying to debug the Datalog code to find the "flaw" in the input program by tracing back data flow, until I found out that compilation was crashing. After looking into the RAM program, a very clear semantic flaw was found, where variables were being used before being defined. A RAM semantic checker would have found the bug instantly, without user input, saving hours of intense analysis.
A RAM semantic checker would also be necessary in the future if support for users providing RAM programs directly to Souffle was added.