ian-flores closed this issue 4 years ago
That sounds like a good idea, Ian! We should add an extra argument that allows only exact matches, though, so we can detect typos when this is used in combination with the R Markdown YAML term list that Greg envisioned.
I'll go ahead and implement it and send a PR. I used cosine similarity because it seemed to work best when the input had added punctuation marks or odd typos, but I'm open to changing the metric as well.
We need to implement fuzzy matching (or a string-distance metric) so that, when there isn't an exact match, we search for the slug nearest to the requested term.
Right now if we run:
We get:
Because it is expecting the `data_frame` slug and not `data frame`. But the define function should be able to see that this is a very near match, and thus we should present this definition. I do this using cosine similarity in the Python version.
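For illustration, here is a minimal sketch of how a nearest-slug lookup with cosine similarity could work. It is not the code from the Python version mentioned above: the function names (`char_ngrams`, `cosine_similarity`, `nearest_slug`), the `exact` and `threshold` parameters, the use of character bigrams, and every slug except `data_frame` are assumptions made for the example.

```python
import math
from collections import Counter


def char_ngrams(text, n=2):
    """Count overlapping character n-grams of a lowercased, space-padded string."""
    padded = f" {text.lower()} "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def cosine_similarity(a, b):
    """Cosine similarity between two n-gram count vectors (Counters)."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest_slug(query, slugs, exact=False, threshold=0.5):
    """Return the matching slug, or None if nothing is close enough.

    exact=True (hypothetical flag) disables fuzzy matching so typos surface.
    threshold is an assumed cutoff, not a tuned value.
    """
    if query in slugs:
        return query
    if exact or not slugs:
        return None
    target = char_ngrams(query)
    scored = [(cosine_similarity(target, char_ngrams(s)), s) for s in slugs]
    best_score, best_slug = max(scored)
    return best_slug if best_score >= threshold else None


# Hypothetical glossary slugs; only data_frame comes from this issue.
slugs = ["data_frame", "tibble", "vector"]
fuzzy = nearest_slug("data frame", slugs)
strict = nearest_slug("data frame", slugs, exact=True)
```

With these assumptions, `fuzzy` resolves to the `data_frame` slug, while `strict` is `None`, which is the behaviour an exact-only argument would need in order to flag typos.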