keshav22bansal / BAKSA_IITK

Official implementation of the paper "BAKSA at SemEval-2020 Task 9: Bolstering CNN with Self-Attention for Sentiment Analysis of Code Mixed Text", accepted at the Proceedings of the 14th International Workshop on Semantic Evaluation.

Query on Transliteration #2

Status: Closed (graviraja closed this issue 4 years ago)

graviraja commented 4 years ago

Hi, thank you for the amazing work. I have a small query: how is the transliteration done? I read in your research paper (Bolstering CNN with Self-Attention for Sentiment Analysis of Code Mixed Text) that you used Google's Transliteration tool. I tried the Indic Transliteration module in Python for the conversion, and the results are not as good as the data in the train.txt and test.txt files. How is it done with Google's transliteration? Is it a manual process, or is it automated?

A few examples:

Data in train.txt

Using Indic Transliteration

Thanks for the help!

Harshagarwal19 commented 4 years ago

Hi @graviraja, thanks for your appreciation.

As mentioned in the paper, we used Google's Transliteration API, and it is an automated process. The code for it is in utils/transliterate.py; please refer to it. Documentation for the API is available here.
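For readers who want a feel for what such an automated call might look like, here is a minimal sketch. It is not the code from utils/transliterate.py. It assumes the publicly reachable Google Input Tools request endpoint, the `hi-t-i0-und` input-tool code for Hindi, and a response shaped like `["SUCCESS", [[word, [candidates...], ...]]]`; the helper names `build_url`, `parse_response`, and `transliterate_word` are hypothetical. Check the repository script and the linked API documentation for what the authors actually used.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the Google Input Tools request endpoint. The repository's
# utils/transliterate.py may use a different URL or client library.
ENDPOINT = "https://inputtools.google.com/request"


def build_url(word, lang_code="hi", num=1):
    """Build the request URL for one Roman-script word (hypothetical helper)."""
    params = {
        "text": word,
        "itc": f"{lang_code}-t-i0-und",  # assumed input-tool code format
        "num": num,                      # number of candidates requested
        "ie": "utf-8",
        "oe": "utf-8",
    }
    return ENDPOINT + "?" + urllib.parse.urlencode(params)


def parse_response(payload, fallback):
    """Return the top candidate from a decoded response, else the fallback."""
    try:
        if payload[0] != "SUCCESS":
            return fallback
        candidates = payload[1][0][1]
        return candidates[0] if candidates else fallback
    except (IndexError, KeyError, TypeError):
        # Unexpected response shape: keep the original word.
        return fallback


def transliterate_word(word, lang_code="hi", timeout=10):
    """Fetch a transliteration over the network; returns the word on failure."""
    with urllib.request.urlopen(build_url(word, lang_code), timeout=timeout) as resp:
        payload = json.loads(resp.read().decode("utf-8"))
    return parse_response(payload, fallback=word)
```

Calling `transliterate_word("namaste")` sends one request per word, so a full dataset would need a loop over tokens, ideally with caching and rate limiting. Falling back to the original word keeps English tokens and failed lookups unchanged in code-mixed text.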

graviraja commented 4 years ago

Thank you for the quick response @Harshagarwal19. I will look into it. Keep up the good work :)