delph-in / matrix

The Grammar Matrix
https://matrix.ling.washington.edu/index.html
Other
12 stars 6 forks source link

Regression tests miss grammars LKB won't load #319

Open goodmami opened 4 years ago

goodmami commented 4 years ago

Migrated from Trac:

The regression tests use ACE, which seems to be more tolerant of lexrule redundancy. This has the result that a regression test can pass while the corresponding grammar will not load in LKB. (The most obvious way this shows up is if lex-rule is added as a supertype to something that already inherits from it.)

goodmami commented 4 years ago

@curtosis can you clarify what is the expected resolution here? That the regression tests load something in the LKB, that we expand validation checks to model LKB requirements, that we just fix the matrix code to not output redundant lexrules, or something else?

curtosis commented 4 years ago

For the example I cited, the "correct" solution would be to fix the Matrix code to be smarter about building type hierarchies, but that should be split off into a different issue.

I think my main concern here is that it's surprising when the regression tests pass with something that is invalid for the LKB -- which seems like a kind of actual regression that the tests are supposed to catch.

I don't think it's reasonable to expect the regression tests to be responsible for making sure things work in all DELPH-IN engines; that way lies madness. On the other hand, the LKB isn't just another engine, so some expectations there seem justified. The "easy" resolution -- running regression tests in both ACE and LKB -- is obviously not workable from a performance perspective, even with just the two engines.

I don't have a sense of whether the set of things ACE allows that the LKB doesn't is easily enumerable, but it might be a workable partial solution to add a post-validation step that checks the ones we know about. (I say "post-validation" since the main point is to catch things while you're doing Matrix/customization development, which can include validation.)

dantiston commented 4 years ago

This is an example from several years ago, but I recently said it in front of Woodley and he didn’t take issue with it, so I suspect it still holds true. Woodley develops ACE with the expectation that the grammar is basically correct. The LKB does a lot more checking for validity, e.g. lexrule redundancy. Performance would take a huge hit if we always loaded grammars with the LKB at test time, so I agree that’s problematic. Also, as a developer that primarily uses macOS, it’s nice that matrix development can be done in a 100% macOS ecosystem (though with LKB-FOS that’s soon to be not as big of an issue).

I wonder if these sorts of checks for hierarchy validity are something PyDelphin could do?

goodmami commented 4 years ago

Thanks. I agree that we should try to ensure that grammars load with the LKB.

The "easy" resolution -- running regression tests in both ACE and LKB -- is obviously not workable from a performance perspective, even with just the two engines.

Agreed. I think we could have a --lkb option for rtest.py that uses the LKB, assuming we can script it. If all we care about, for now, is loadability and not equivalent performance/coverage, then we just need it to load a grammar and print out any errors. Does anyone know how to do this from the command line using LKB-FOS? Before we piped lisp commands to a logon script, but I'd like to do this using only the LKB if possible.

I wonder if these sorts of checks for hierarchy validity are something PyDelphin could do?

Partially. PyDelphin pretty much only reports errors in TDL syntax. You can write functions that perform additional checks (that all supertypes are defined somewhere, maybe even type hierarchy compatibility checks, etc.) but these will be shallow checks, like a linter, because PyDelphin does not compute GLBs or do unification. The lex-rule-redudancy example could probably be detected with a function.