Use ANTLR4 for specification validation

dlmarion commented 1 year ago

I think there is more to do here in the tests, but wanted to get something up sooner rather than later.

dlmarion commented 1 year ago

@ctubbsii - I think @keith-turner might be looking for your input / changes on the pom overall.

ctubbsii commented 1 year ago

@ctubbsii - I think @keith-turner might be looking for your input / changes on the pom overall.

I will definitely go spelunking in the POM and GitHub Actions to polish up anything prior to release... but for now, I don't think it's a particularly high priority while the effort is in its initial rapid prototyping and development. If you think it's holding something up, I can definitely prioritize it, though.

dlmarion commented 1 year ago

In c43e905 I modified the ANTLR grammar to modify the parse tree such that all of the tokens in an OR and AND expression were below that node in the tree. Added test that performs all of the evaluations in testdata.json. At this point the ANTLR expression parsing tests and AccessExpression evaluation tests are complete and working.

dlmarion commented 1 year ago

It feels very weird to write the specification in ABNF, and then to have a test to verify that the ABNF is properly written using an imported G4 specification for ABNF (which is imported to our package namespace, which doesn't need to be).

I wanted to verify that our ABNF was correct and I used ANTLR to do it since it already has a grammar written for it. Our ABNF in the main branch is not correct and there are updates to it in this PR.

And then, on top of that, have an entirely separate G4 specification that is basically redundant with our ABNF spec. Why not just write the specification in G4 directly and drop all the ABNF stuff? We don't need to define the specification in two separate languages. We should just have one authoritative one.

I think we can delay the decision to use ABNF or ANTLR as the specification grammar until we determine what we are going to do with the ANTLR implementation (whether the ANTLR implementation is suitable to be our runtime implementation or just an example that others can leverage).

dlmarion commented 1 year ago

Results from running benchmark from 2971c0d:

Benchmark                                                    Mode  Cnt  Score   Error   Units
AccessExpressionAntlrBenchmark.measureBytesParsing          thrpt   12  0.829 ± 0.012  ops/us
AccessExpressionAntlrBenchmark.measureEvaluation            thrpt   12  1.995 ± 0.013  ops/us
AccessExpressionAntlrBenchmark.measureEvaluationAndParsing  thrpt   12  0.519 ± 0.022  ops/us
AccessExpressionAntlrBenchmark.measureStringParsing         thrpt   12  0.851 ± 0.025  ops/us

Certainly not as good as the results listed in #7 with the old or new implementation.

apache / accumulo-access

Use ANTLR4 for specification validation #4