languagetool-org / languagetool

Style and Grammar Checker for 25+ Languages
https://languagetool.org
GNU Lesser General Public License v2.1
12.31k stars 1.39k forks source link

min, max parameters not included in the disambiguator #17

Closed milekpl closed 10 years ago

milekpl commented 11 years ago

Daniel, you forgot to add new functionality to the DisambiguatorPatternRule. I consider this a bug because the format of both files is kept in sync...

I guess skipMaxTokens should go to AbstractPatternRulePerformer?

danielnaber commented 11 years ago

The min/max stuff is still buggy, see the ignored tests in PatternRuleMatcherTest. So instead of porting the buggy change to disambiguation, I'm still trying to find a clean solution based on a finite state machine. That work happens in my personal fork at https://github.com/danielnaber/languagetool/commits/new_engine

milekpl commented 11 years ago

OK, changing the flag on this...

milekpl commented 10 years ago

Daniel, as you can see in the repo, I thought I had a cleaner implementation. But I don't, and I didn't solve the "min="2" problem.

However, after thinking a bit, we may use a similar strategy as we use with < or > and phrases: simply create n rules in memory for min=n, by repeating token n times in the rule, and appropriately renumbering any match elements and references? This is basically the strategy used by DFA/NFA algorithms when compiling {n,m} to automata, and it seems to be the easiest. I fixed the reason for failing tests in the PatternRuleMatcherTest (it was a very easy bug), so that's the only remaining problem, right?

danielnaber commented 10 years ago

Yes, that seems to be the easiest way. It would be good if you can give it a try.

milekpl commented 10 years ago

OK, will do.

milekpl commented 10 years ago

Seems to work fine right now, and I don't see any remaining bugs (more tests are now in demo rules, as we deal with "min" via XML loader). One thing that I do not understand is why we have multiple rule matches for max > 1. Is that on purpose?

danielnaber commented 10 years ago

I don't remember... maybe the fact that I couldn't get that working was one of the reasons I limited it to 1 in the first place. But it seems wrong.

milekpl commented 10 years ago

Well, you did not limit "max"... I will look into that then.

milekpl commented 10 years ago

OK, it works, but as we could have some GSoC students, I'll simply add issues for these fixes (they are quite trivial).