English whole-word match setting has no effect

hangker1997 commented 3 months ago

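My configuration class is below. I control sensitive words with allow and deny lists, but only the deny list is actually used, and it contains only English entries. After enabling the whole-word match setting, it still does not take effect. For example, the deny list contains cp; cpm should be valid, but it is still flagged. I can't tell why. I'm on version 0.14.0.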
@Bean
public SensitiveWordBs sensitiveWordBs() {
    return SensitiveWordBs.newInstance()
            .wordAllow(WordAllows.chains(WordAllows.defaults(), myDdWordAllow))
            .wordDeny(myDdWordDeny)
            // English whole-word matching
            .wordResultCondition(WordResultConditions.englishWordMatch())
            // various other settings
            // do not ignore full-width vs. half-width Chinese/English parentheses
            .ignoreWidth(false)
            .init();
}

hangker1997 commented 3 months ago

Resolved. The cause is that whole-word matching does not work when exactly one character follows the word: with cp as a sensitive word, cpm is still flagged, but cp followed by two or more characters is not. I fixed it by overriding the doMatch method of AbstractWordResultCondition:

import com.github.houbb.heaven.util.lang.CharUtil;
import com.github.houbb.sensitive.word.api.IWordContext;
import com.github.houbb.sensitive.word.api.IWordResult;
import com.github.houbb.sensitive.word.constant.enums.WordValidModeEnum;
import com.github.houbb.sensitive.word.support.resultcondition.AbstractWordResultCondition;

public class EnglishWordMatch extends AbstractWordResultCondition {

@Override
protected boolean doMatch(IWordResult wordResult, String text, WordValidModeEnum modeEnum, IWordContext context) {
    final int startIndex = wordResult.startIndex();
    final int endIndex = wordResult.endIndex();

    // If the preceding character is an English letter, this is not a whole-word match
    if(startIndex > 0) {
        char preC = text.charAt(startIndex - 1);
        if(CharUtil.isEnglish(preC)) {
            return false;
        }
    }

    // Likewise if the following character is an English letter
    if(endIndex < text.length()) {
        char afterC = text.charAt(endIndex);
        if(CharUtil.isEnglish(afterC)) {
            return false;
        }
    }

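    // Check whether the matched text itself is an English word.
    // Note: the method returns true either way, so this loop does not change the result.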
    for(int i = startIndex; i < endIndex; i++) {
        char c = text.charAt(i);
        if(!CharUtil.isEnglish(c)) {
            return true;
        }
    }

    return true;
}

}
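
To make the boundary logic easier to try without the library, here is a minimal standalone sketch of the same check. It assumes the end index is exclusive (as the class above treats it) and uses a plain ASCII letter test in place of CharUtil.isEnglish. The names EnglishBoundary, isWholeWord and isWholeWordSkippingNeighbor are hypothetical and are not part of sensitive-word. The second method is only an illustration of one off-by-one that would produce the symptom described above; it is not confirmed to be what 0.14.0 actually did.

```java
// Standalone sketch; not part of the sensitive-word library.
final class EnglishBoundary {

    // ASCII letters only. Stands in for CharUtil.isEnglish in this sketch (assumption).
    static boolean isEnglishLetter(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }

    // start is inclusive, endExclusive is exclusive (assumed convention).
    // Returns true when neither neighbor of the match is an English letter.
    static boolean isWholeWord(String text, int start, int endExclusive) {
        if (start > 0 && isEnglishLetter(text.charAt(start - 1))) {
            return false;
        }
        if (endExclusive < text.length() && isEnglishLetter(text.charAt(endExclusive))) {
            return false;
        }
        return true;
    }

    // Hypothetical off-by-one variant: it inspects the character one position
    // past the real right-hand neighbor, so a single trailing letter goes unnoticed.
    static boolean isWholeWordSkippingNeighbor(String text, int start, int endExclusive) {
        if (start > 0 && isEnglishLetter(text.charAt(start - 1))) {
            return false;
        }
        if (endExclusive < text.length() - 1 && isEnglishLetter(text.charAt(endExclusive + 1))) {
            return false;
        }
        return true;
    }
}
```

With this sketch, isWholeWord("cpm", 0, 2) rejects the match because the letter m directly follows cp, while the off-by-one variant accepts it and only starts rejecting once a second trailing letter is present, as in "cpmx".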

houbb commented 2 months ago

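Thanks for the heads-up. This has been fixed in v0.19.1. For future improvements like this, feel free to open a PR and I will merge them together.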

houbb / sensitive-word

English whole-word match setting has no effect #69