Open drdhaval2785 opened 9 years ago
Too bad the links die so fast, so here is the PDF.
I guess even after 6 years we are still not there.
This paper highlights that the minor and rare samAsa types are more amenable to a rule-based tagger, while the major and common samAsa types are more amenable to statistical (for our purpose, ANN-based) taggers. The paper also outlines various pANini rules and their coding implications.
Therefore, we should prepare a preclassifier that weeds out the samAsas which can be tagged deterministically by rules, and remove those samAsas from both the training and the evaluation data. Our ANN would then be trained only on the major (common) types. Classification accuracy may improve by some percentage with this method; this will have to be tested.
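A rough sketch of what such a preclassifier could look like, just to make the idea concrete. Everything here is an assumption for illustration: the names `preclassify` and `split_dataset`, the data layout (a tuple of compound components plus a gold tag), and above all `toy_avyayibhava_rule`, which is a placeholder and not a faithful encoding of any pANini rule. The real rules would come from the paper.

```python
from typing import Callable, List, Optional, Tuple

Parts = Tuple[str, ...]                  # components of one samAsa
Rule = Callable[[Parts], Optional[str]]  # returns a tag, or None if not applicable


def toy_avyayibhava_rule(parts: Parts) -> Optional[str]:
    # Placeholder only: treats a small hypothetical list of first members
    # as decisive. A real rule needs the conditions given in the paper.
    if parts and parts[0] in {"yathA", "prati"}:
        return "avyayIbhAva"
    return None


def preclassify(parts: Parts, rules: List[Rule]) -> Optional[str]:
    """Return the tag of the first rule that fires, else None."""
    for rule in rules:
        tag = rule(parts)
        if tag is not None:
            return tag
    return None


def split_dataset(samples: List[Tuple[Parts, str]], rules: List[Rule]):
    """Separate rule-taggable samAsas from those left for the ANN."""
    rule_tagged, for_ann = [], []
    for parts, gold in samples:
        tag = preclassify(parts, rules)
        if tag is None:
            for_ann.append((parts, gold))    # keep gold label for training
        else:
            rule_tagged.append((parts, tag)) # tagged by rule, kept out of ANN data
    return rule_tagged, for_ann
```

The same `split_dataset` call would be applied to both the training and the evaluation data, so the ANN only ever sees the samAsas that no rule claimed.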