Open ahankinson opened 3 months ago
Not sure why it's failing -- the tests all pass and work on my end.
It occurs to me that a big part of the reason for the speed-up is the use of the functools lru_cache, which will cache the result of the same input string and skip the computation of that value, instead returning only the cached value. While this is useful in real-world contexts, it's probably not so useful in benchmark runs!
@aweakley does this PR look OK?
I've done a bit of cleanup, but the parser_classes file is still quite buggy. Unfortunately I don't quite know how to trigger the bugs, but I'm at least confident that I didn't make it much worse.
Thank you. I'll check it out.
@ahankinson The tests are passing for me, are there particular bugs you were worried about in your comment above?
Yeah the tests pass but I don’t think they exercise the parser classes that much. I have the ruff checker running on that and it still highlights a bunch of stuff
It's great to see that this project has some life in it now. I have been using it for a while and maintaining my own fork with some performance enhancements.
Primarily, the main enhancements have been to enable packratting (which seems to now be enabled) and pre-declaring and compiling the regexes in the natural language processor.
If the current benchmarks are to be believed, this leads to a significant speedup. All the tests pass, so I guess it's OK?
https://rism-digital.github.io/python-edtf/dev/bench/
I've gone through and tried to integrate the latest changes, but that was a big PR so another set of eyes would be helpful.