Open tshrinivasan opened 3 days ago
We need a tool that finds the root word of any given Tamil word.
For example: கம்பரிடம் -> கம்பர், சென்னையில் -> சென்னை, மரத்தின் -> மரம்
For that, we need the Tamil rules for சேர்த்து எழுதுக (joining words) and பிரித்து எழுதுக (splitting words).
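To make the idea concrete, here is a minimal Python sketch of rule-based suffix stripping that covers only the three example words above. The names `SUFFIX_RULES` and `find_root` are hypothetical, made up for this sketch; they are not taken from the Snowball algorithm or from Open-Tamil, and a real tool would need a far larger rule set.

```python
import unicodedata

# Hypothetical rule table: (suffix to remove, text to put back).
# It only covers the three examples in this issue; it is not a real grammar.
SUFFIX_RULES = [
    # "த்தின்" -> "ம்"   (மரத்தின் -> மரம்)
    ("த்தின்", "ம்"),
    # vowel sign "ி" + "டம்" -> pulli "்"   (கம்பரிடம் -> கம்பர்)
    ("\u0BBF\u0B9F\u0BAE\u0BCD", "\u0BCD"),
    # "யில்" -> nothing   (சென்னையில் -> சென்னை)
    ("யில்", ""),
]


def find_root(word: str) -> str:
    """Apply the first matching suffix rule; return the word unchanged otherwise."""
    # Normalize so that composed and decomposed vowel signs compare equal.
    word = unicodedata.normalize("NFC", word)
    for suffix, replacement in SUFFIX_RULES:
        # Require something to be left over after the suffix is removed.
        if word.endswith(suffix) and len(word) > len(suffix):
            return word[: -len(suffix)] + replacement
    return word


if __name__ == "__main__":
    for w in ["கம்பரிடம்", "சென்னையில்", "மரத்தின்"]:
        print(w, "->", find_root(w))
```

The rules operate on Unicode code points, so a suffix that begins with a vowel sign (as in கம்பரிடம்) has to put the pulli back on the final consonant of the root. Only the first matching rule is applied, so in a larger table the longer suffixes should be listed before the shorter ones.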
@rdamodharan has written a Tamil algorithm for the Snowball stemmer. The Open-Tamil Python library has a Python implementation.
But it is not perfect. Check an online demo here: https://mazko.github.io/jssnowball/
Check the Tamil algorithm (Snowball source, which compiles to C) here: https://github.com/snowballstem/snowball/blob/master/algorithms/tamil.sbl
https://github.com/rdamodharan/tamil-stemmer/blob/master/docs/stemmer.png
https://github.com/rdamodharan/tamil-stemmer/
What do we have to do?
The current Tamil stemmer online demo is here: https://tamilpesu.us/en/stemmer/
It is based on https://github.com/rdamodharan/tamil-stemmer/