sanskrit-lexicon / COLOGNE

Development of http://www.sanskrit-lexicon.uni-koeln.de/

Add Ajax AutoComplete #9

Closed Shalu411 closed 7 years ago

Shalu411 commented 10 years ago

Namaste. Encoding is indeed a big issue.
While typing a word to search for, we hit on different spellings (not knowing the exact encoding, which is sometimes difficult for beginners). So why not give "suggestions" while typing the word in the search box? Words that are near the typed word could be offered; if not for each letter, then maybe for the whole word. E.g. "ziva" could be Shiva, Siva, shiva, Shiv, ziv, Siv etc. Though in Devanagari one can be saved from these issues, this is needed in the other encodings. Thank you.
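
A minimal sketch of how such suggestions could work, not anything the Cologne site actually implements: reduce both the typed query and every headword to a loose key by stripping diacritics and collapsing the spellings that get confused, then offer every headword that shares the key. The names `loose_key` and `build_index`, the collapsing rules, and the three-word list are all assumptions made for illustration.

```python
import unicodedata
from collections import defaultdict


def loose_key(word: str) -> str:
    """Return a spelling-insensitive key (illustrative rules only)."""
    # Decompose accented letters, e.g. "ś" -> "s" + combining acute,
    # then drop the combining marks so only the base letters remain.
    decomposed = unicodedata.normalize("NFD", word)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    key = stripped.lower()
    # Assumed collapses for the sibilant confusion in the "ziva" example.
    key = key.replace("sh", "s").replace("z", "s")
    return key


def build_index(headwords):
    """Group headwords by loose key so variants can be looked up together."""
    index = defaultdict(list)
    for w in headwords:
        index[loose_key(w)].append(w)
    return index


# Hypothetical mini word list; a real one would come from the dictionaries.
index = build_index(["śiva", "viṣṇu", "karman"])
print(index[loose_key("Shiva")])
print(index[loose_key("ziva")])
print(index[loose_key("viSNu")])
```

A key like this merges on purpose, so distinct headwords can end up under the same key; the suggestion list would then show all of them and let the user pick.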

funderburkjim commented 10 years ago

How is one saved from these issues in Devanagari? Is it 'विष्णु' or 'विश्नु' (the first one, of course)? Isn't this the same problem as whether it is viSNu or viznu?
Another issue is that there are different spelling conventions among dictionaries; for example, Wilson uses the spelling कर्म्मन् whereas MW uses कर्मन्. It is remarkable how Google can give suggestions; for instance, I typed 'crysan' in a Google search and was given the suggestion 'Chrysanthemum', which was what I was thinking about. Amazing. How can a good suggestion system be devised for Sanskrit? Having access to this would be a boon to anyone using a Sanskrit dictionary, whether novice or experienced. This is one of those non-trivial problems. Maybe you have some specific ideas about the ingredients of a useful Sanskrit suggestion algorithm?

gasyoun commented 10 years ago

Wilson's spellings are outdated; he has 1500 words with double letters where later dictionaries have none. See https://groups.google.com/forum/#!topic/bvparishat/Eyz0lSNDk-s So I would suggest taking the words from PWK or MW, or from my combined list of all words from all Cologne dictionaries, around 270 000 words. This is not the hardest task in the list for sure. A few links:
1) Google suggestions explained: http://searchengineland.com/how-google-instant-autocomplete-suggestions-work-62592
2) Too bad Cologne does not have Java: https://github.com/fmmfonseca/completely
3) http://jqueryui.com/autocomplete/ is what I would go for at Cologne.

Shalu411 commented 10 years ago

"Is it 'विष्णु' or 'विश्नु' (first one, of course) - isn't this the same problem as is it viSNu or viznu ?" People using Devanagari still have some picture of that letter in their eye.. :) It is Roman diacritics, or encodings that use capital and small letters and the letters f, z, x, q, w to represent Devanagari, that really confuse. "It is remarkable how Google can give suggestions;" I have seen that YouTube is better at giving suggestions. Even Indian-language spellings are offered in so many variations that it has made me wonder many a time. E.g. when I search for a word containing "u", it also gives words spelt with "oo". Something similar could be thought of. Problem areas like "s", "d" etc., where diacritical marks come into the picture, need more suggestions. In all, these are: ṁ, ṇ, ṅ, ñ, ś, ṣ, ḍ, ṭ, ū, ī, ā, ḷ, ṛ, ḥ. So these need attention. Hope this helps. :)

gasyoun commented 10 years ago

"oo" is great for Hindi, but one would hardly need it for Sanskrit words. So I think it would be a nice addition, but not a critical one.

drdhaval2785 commented 9 years ago

http://jqueryui.com/autocomplete/ is what I have used for my tiGanta machine to suggest verbs. The drawback is that this approach is fine for a small number of words (e.g. <10000), but trying it with a wordlist of >400000 kills the browser. So, in my opinion, jQuery or any browser-side application is not well suited to this task. A smartly designed AJAX application may be able to help us out.
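
To illustrate the AJAX idea, here is a rough sketch of the server-side half, under the assumption that the full word list stays on the server and only a handful of matches is sent back per request. The function `suggest`, the `limit` default, and the sample words are hypothetical; the lookup is a plain binary search over a sorted list, not an existing Cologne component.

```python
import bisect


def suggest(sorted_words, prefix, limit=10):
    """Return up to `limit` words from `sorted_words` that start with `prefix`."""
    if not prefix:
        return []
    # In a sorted list, every word sharing the prefix sits in one contiguous
    # run that begins at the insertion point of the prefix itself.
    start = bisect.bisect_left(sorted_words, prefix)
    matches = []
    for word in sorted_words[start:start + limit]:
        if not word.startswith(prefix):
            break
        matches.append(word)
    return matches


# Hypothetical sample; the real list would be the >400000 headwords.
words = sorted(["karman", "karma", "kavi", "ziva", "zivA", "viznu"])
print(suggest(words, "kar"))
```

Each lookup costs one binary search plus at most `limit` comparisons, so the size of the list matters little and the browser never has to hold it. A small HTTP endpoint returning this list as JSON could then serve as the `source` URL of the jQuery UI autocomplete widget, which sends the typed text in a `term` query parameter.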

gasyoun commented 9 years ago

AJAX is JS, so it's browser-side. Let me do some research once I'm free.

drdhaval2785 commented 7 years ago

One of the most important ones @funderburkjim and @gasyoun.