robert-bor / aho-corasick

Java implementation of the Aho-Corasick algorithm for efficient string matching
Apache License 2.0
890 stars, 348 forks

Order of builder calls produces different scan results #95

Open Alien2150 opened 2 years ago

Alien2150 commented 2 years ago

Calling "addKeywords" at different points in the builder chain produces different results. Here is a JUnit test class that reproduces this with 0.6.3. The two tests differ only in where "addKeywords" is called, and the second test fails:

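// Imports assumed for this test: JUnit 5 and aho-corasick.
import org.ahocorasick.trie.Trie
import org.junit.jupiter.api.Assertions.assertEquals
import org.junit.jupiter.api.Assertions.assertFalse
import org.junit.jupiter.api.Test

class TrieTest {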
    @Test
    fun `works`() {
        val trie = Trie.builder()
                .ignoreCase()
                .stopOnHit()
                .onlyWholeWords()
                .addKeywords(listOf("WhatsApp", "Ban me please"))
                .build()

        assertFalse(trie.parseText("Ban me please. I appreciated it").isEmpty(), "failed 1")
        assertEquals(1, trie.parseText("Call me on WhatsApp. Here is my number").size, "failed 2")
    }

    @Test
    fun `bug-report`() {
        val trie = Trie.builder()
                .addKeywords(listOf("WhatsApp", "Ban me please"))
                .ignoreCase()
                .stopOnHit()
                .onlyWholeWords()
                .build()

        assertFalse(trie.parseText("Ban me please. I appreciated it").isEmpty(), "failed 1")
        assertEquals(1, trie.parseText("Call me on WhatsApp. Here is my number").size, "failed 2")
    }
}

Alien2150 commented 2 years ago

This did not happen on 0.4.x.
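
One possible explanation, offered as an assumption and not a confirmed diagnosis: if the builder normalizes each keyword at the moment it is added, using only the options set so far, then calling ignoreCase() after addKeywords() would leave the keywords stored in their original case while the scanned text is lowercased. The toy builder below is hypothetical (ToyMatcherBuilder and ToyMatcher are made-up names, not the library's code, and it uses plain substring search instead of a trie) and only illustrates how that kind of order dependence can arise:

```kotlin
// Hypothetical toy matcher, NOT the aho-corasick implementation.
// Assumption: keywords are normalized when added, using the options set so far.
class ToyMatcherBuilder {
    private var ignoreCase = false
    private val keywords = mutableListOf<String>()

    fun ignoreCase(): ToyMatcherBuilder {
        ignoreCase = true
        return this
    }

    fun addKeywords(words: Collection<String>): ToyMatcherBuilder {
        // The flag is read here, at add time, so later option calls come too late.
        words.mapTo(keywords) { if (ignoreCase) it.lowercase() else it }
        return this
    }

    fun build(): ToyMatcher = ToyMatcher(keywords.toList(), ignoreCase)
}

class ToyMatcher(private val keywords: List<String>, private val ignoreCase: Boolean) {
    // Returns every stored keyword that occurs in the (possibly lowercased) text.
    fun parseText(text: String): List<String> {
        val haystack = if (ignoreCase) text.lowercase() else text
        return keywords.filter { haystack.contains(it) }
    }
}

fun main() {
    val words = listOf("WhatsApp", "Ban me please")
    val text = "Call me on WhatsApp. Here is my number"

    // Options first: keywords are lowercased on insertion and match the lowercased text.
    val optionsFirst = ToyMatcherBuilder().ignoreCase().addKeywords(words).build()
    // Keywords first: keywords keep their original case and miss the lowercased text.
    val keywordsFirst = ToyMatcherBuilder().addKeywords(words).ignoreCase().build()

    println(optionsFirst.parseText(text).size)
    println(keywordsFirst.parseText(text).size)
}
```

If something like this is what happens in 0.6.3, a workaround would be to set all options before adding keywords, as in the first test above.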