alicia-ziying-yang / conTEXT-explorer

ConTEXT Explorer is an open Web-based system for exploring and visualizing concepts (combinations of occurring words and phrases) over time in text documents.
Apache License 2.0

[wish list] Fuzzy matching #5

Open baileythegreen opened 3 years ago

baileythegreen commented 3 years ago

A more complicated feature that might be useful is implementing fuzzy matching. (Note: this might be really difficult to do; I'm not sure, and it probably depends on the structure of the software.)

The idea would basically be: if I'm interested in occurrences of the word 'marriage', but a document has this misspelled in at least one place (as, say, 'mariage'), the current search won't find that, so the statistics might be affected. Fuzzy matching could potentially solve the problem.
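
To make the idea concrete, here is a minimal sketch using only Python's standard `difflib` module. It is not how ConTEXT Explorer searches today; the function name `fuzzy_count`, the sample sentence, and the 0.9 similarity cutoff are all assumptions made up for illustration, and the cutoff in particular would need tuning on a real corpus.

```python
from difflib import get_close_matches


def fuzzy_count(target, tokens, cutoff=0.9):
    """Count tokens equal to `target` or similar enough to be a likely misspelling.

    Hypothetical helper for illustration only. `cutoff` is a similarity
    ratio in [0, 1]; higher values accept fewer variants.
    """
    vocabulary = set(tokens)
    # Rank every distinct token by its similarity to the target and keep
    # those at or above the cutoff (an exact match scores 1.0).
    variants = set(
        get_close_matches(target, vocabulary, n=max(len(vocabulary), 1), cutoff=cutoff)
    )
    count = sum(1 for token in tokens if token in variants)
    return count, variants


# Made-up sample text containing one misspelling and one unrelated near-neighbour.
tokens = "the marriage was recorded but the mariage entry mentions a carriage".split()
count, variants = fuzzy_count("marriage", tokens)
print(count, sorted(variants))
```

With this cutoff, 'mariage' is counted together with 'marriage', while 'carriage' falls just below the threshold. That also shows the main risk: a looser cutoff would start counting unrelated words, which would distort the statistics in the other direction.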

This is just an idea I thought I would share.

JOSS Reference: openjournals/joss-reviews#3347

alicia-ziying-yang commented 3 years ago

Hi @baileythegreen, we use the Whoosh search engine to index and search the corpus, and it does not support fuzzy matching by itself. Automatically correcting the input words would take a lot of effort if we developed it only for this app, so we may not spend much time on this issue for now.

Thanks for sharing this idea :)