alar-dict / alar.ink

dictmaker site theme for alar.ink (Kannada-English dictionary)
https://alar.ink
MIT License

Fix trailing numbers/period at the end of kannada words in result definitions which create a faulty search query #5

Closed kaushiksk closed 3 years ago

kaushiksk commented 3 years ago

Issue

Some Kannada words in a result's definition end with a period or a number. These trailing characters currently become part of both the hyperlink and the search query, so the results fetched do not match the original word.

Example:

In the results for ಲೊಳ್, some Kannada words in the definitions end with a period or a number: in this case "ಲಾಳ1" and "ಲೋಳಿಸರ.".

These numbers and periods creep into the resulting query, because the link contains the same text. The results for "ಲೋಳಿಸರ." are quite different from those for "ಲೋಳಿಸರ".

The same applies to "ಲಾಳ1" and "ಲಾಳ".

Fix

The current isKannada function actually checks whether the input word contains a Kannada character: it returns true when any one character in the word is Kannada. I have therefore renamed it to hasKannadaChar.
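To illustrate the distinction behind the rename, here is a minimal sketch rather than the code from this PR. It assumes the check is a regex test against the Kannada Unicode block (U+0C80 to U+0CFF). The stricter isAllKannada helper is hypothetical and is shown only for contrast.

```javascript
// Assumption: Kannada characters are those in the Unicode block U+0C80..U+0CFF.
const KANNADA_CHAR = /[\u0C80-\u0CFF]/;
const ALL_KANNADA = /^[\u0C80-\u0CFF]+$/;

// True when at least one character in the word is Kannada.
// This is the behavior the old isKannada name did not convey.
function hasKannadaChar(word) {
  return KANNADA_CHAR.test(word);
}

// Hypothetical stricter check, for contrast only:
// true when every character in the word is Kannada.
function isAllKannada(word) {
  return ALL_KANNADA.test(word);
}
```

With these definitions, hasKannadaChar("ಲಾಳ1") is true while isAllKannada("ಲಾಳ1") is false, which is why a word with a trailing digit still ends up being linked.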

After a definition is split into individual words, for each word that has a Kannada character we remove all non-Kannada characters and use the cleaned word in both the href and the text of the a tag. The trailing non-Kannada characters are added as a text node at the end of the parent span.
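The cleaning step described above can be sketched roughly as follows. This is an illustration under assumptions, not the code in this PR: the function names splitKannadaWord and buildLink and the "/search/" base path are hypothetical, and the sketch returns plain data instead of touching the DOM so that it can be run outside a browser.

```javascript
// Assumption: Kannada characters are those in the Unicode block U+0C80..U+0CFF.
const NON_KANNADA = /[^\u0C80-\u0CFF]/g;
const TRAILING_NON_KANNADA = /[^\u0C80-\u0CFF]+$/;

// Split a word into its Kannada-only form and any trailing
// non-Kannada characters (for example a digit or a period).
function splitKannadaWord(word) {
  const match = word.match(TRAILING_NON_KANNADA);
  const trailing = match ? match[0] : "";
  const cleaned = word.replace(NON_KANNADA, "");
  return { cleaned, trailing };
}

// Build the pieces a renderer would need: the cleaned word is used for both
// the link target and the link text, and the trailing characters are kept
// as separate text to append after the link.
// "/search/" is a hypothetical base path, not the site's real URL scheme.
function buildLink(word, base = "/search/") {
  const { cleaned, trailing } = splitKannadaWord(word);
  return {
    href: base + encodeURIComponent(cleaned),
    text: cleaned,
    trailing,
  };
}
```

For "ಲೋಳಿಸರ." this yields the link text "ಲೋಳಿಸರ" with "." kept as trailing text, and for "ಲಾಳ1" the link text "ಲಾಳ" with "1" trailing. In the browser, the trailing string would be appended to the parent span as a text node, for example with document.createTextNode.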

For now this should fix the issue by separating the trailing number or period from the actual Kannada word. However, this is probably an issue in the dataset itself, where some spaces might be missing, and the dataset probably needs to be cleaned.

I've tested this in the browser, using Edge Dev.

kaushiksk commented 3 years ago

@knadh This fixes an issue introduced as part of this commit: https://github.com/alar-dict/alar.ink/commit/9c90fbaff0b41df6ab94aea54283f4155564a9fa