redpen-cc / redpen

RedPen is an open source proofreading tool to check if your technical documents meet the writing standard. RedPen supports various markup text formats (Markdown, Textile, AsciiDoc, Re:VIEW, reStructuredText and LaTeX).
https://redpen.cc
Apache License 2.0

Improve Okurigana feature #853

Closed: Toshihiko-Yamazaki closed this 5 years ago

Toshihiko-Yamazaki commented 5 years ago

Hello, I'm proposing an improvement to the Okurigana feature. Please confirm whether the revised code is acceptable.

[Improved feature] Improve Okurigana feature

[Issue] If a sentence contains the verb "消費する", it is unexpectedly reported as 不正な送り仮名 (invalid okurigana), even though its okurigana is correct.

[Measure] These false positives are no longer reported when the existing code is replaced with the revised code.

The existing code

static {
    invalidOkurigana = new HashSet<>();
    invalidOkurigana.add("費さ");
    invalidOkurigana.add("費し");
    invalidOkurigana.add("費す");
    invalidOkurigana.add("費せ");
    invalidOkurigana.add("費そ");
}

The revised code

static {
    invalidOkuriganaTokens.add(new ExpressionRule().addElement(new TokenElement("費さ", asList("動詞", "自立"), 0)));
    invalidOkuriganaTokens.add(new ExpressionRule().addElement(new TokenElement("費し", asList("動詞", "自立"), 0)));
    invalidOkuriganaTokens.add(new ExpressionRule().addElement(new TokenElement("費す", asList("動詞", "自立"), 0)));
    invalidOkuriganaTokens.add(new ExpressionRule().addElement(new TokenElement("費せ", asList("動詞", "自立"), 0)));
    invalidOkuriganaTokens.add(new ExpressionRule().addElement(new TokenElement("費そ", asList("動詞", "自立"), 0)));
}

With the revised code applied, the behavior is as follows.

Detected as issues: 費した, 費す
Not detected as issues: 消費する, 費やす
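The difference can be illustrated with a minimal, self-contained sketch. It is not RedPen's implementation: it assumes the existing check behaves like a plain substring search (not shown in this thread) and that the revised check compares whole tokens by surface form and part-of-speech tags. The `Token` class, the `OkuriganaSketch` class, and the hand-written tokenizations are hypothetical stand-ins for what a morphological analyzer might return.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class OkuriganaSketch {
    /** Hypothetical token: surface form plus part-of-speech tags. */
    static final class Token {
        final String surface;
        final List<String> tags;

        Token(String surface, String... tags) {
            this.surface = surface;
            this.tags = Arrays.asList(tags);
        }
    }

    // The five invalid forms listed in this pull request.
    static final Set<String> INVALID =
            new HashSet<>(Arrays.asList("費さ", "費し", "費す", "費せ", "費そ"));

    // Tags a token must carry to count as a match (independent verb).
    static final List<String> REQUIRED_TAGS = Arrays.asList("動詞", "自立");

    /** Assumed old behavior: flag the sentence if any invalid form occurs as a substring. */
    static boolean substringMatch(String sentence) {
        for (String form : INVALID) {
            if (sentence.contains(form)) {
                return true;
            }
        }
        return false;
    }

    /** Assumed new behavior: flag only a whole token with a matching surface and tags. */
    static boolean tokenMatch(List<Token> tokens) {
        for (Token token : tokens) {
            if (INVALID.contains(token.surface) && token.tags.containsAll(REQUIRED_TAGS)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Hand-written tokenizations (assumption, not analyzer output).
        List<Token> shouhiSuru = Arrays.asList(
                new Token("消費", "名詞", "サ変接続"),
                new Token("する", "動詞", "自立"));
        List<Token> tsuiyashita = Arrays.asList(
                new Token("費し", "動詞", "自立"),
                new Token("た", "助動詞"));

        System.out.println(substringMatch("消費する")); // true: "費す" spans two tokens
        System.out.println(tokenMatch(shouhiSuru));     // false: no single token is "費す"
        System.out.println(tokenMatch(tsuiyashita));    // true: the token "費し" is a verb
    }
}
```

Under these assumptions, "消費する" is a false positive only for the substring check, because "費す" straddles the boundary between the noun "消費" and the verb "する".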

I think there are 26 similar patterns like the one above. Applying the same change to those 26 patterns would improve the results.

coveralls commented 5 years ago


Coverage increased (+0.09%) to 91.403% when pulling d8171f88019372058fe035d0b9e9a0dd54352033 on Toshihiko-Yamazaki:master into 57bbef7c7cd986c7a0d54b83c654698537e9ce47 on redpen-cc:master.

takahi-i commented 5 years ago

Thank you very much for the valuable contributions!! @Toshihiko-Yamazaki

Toshihiko-Yamazaki commented 5 years ago

Thank you very much for confirming and merging my code. I have been using RedPen in my own work, and I’ll continue to consider how to use it efficiently.