priyanshu2103 / Sanskrit-Hindi-Machine-Translation

Machine Translation from Sanskrit to Hindi using Unsupervised and Supervised Learning

Sanskrit-Hindi-Machine-Translation

Analysing different techniques of Sanskrit-Hindi Machine Translation

The project report is attached as NLP_Final_Report_Group_2.pdf

The Sanskrit-English parallel data and the Sanskrit-Hindi parallel (test) data are in the parallel-corpus folder. The parallel data consists of the Ramayan, Rigveda, Bhagvad Gita, etc. The Sanskrit monolingual data is available at https://drive.google.com/file/d/1_qclc7unNLvToiDK8t2scIgj5oxJDEGm/view?usp=sharing
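
A minimal sketch of reading a parallel corpus into sentence pairs. It assumes the corpus is stored as two line-aligned UTF-8 text files (line i of one file translates line i of the other), which is a common convention but is not confirmed for the files in parallel-corpus. The function names `pair_lines` and `read_parallel` are hypothetical and do not come from the notebooks.

```python
def pair_lines(src_lines, tgt_lines):
    """Pair two line-aligned sequences of sentences.

    Raises ValueError if the two sides differ in length, and drops
    pairs where either side is empty after stripping whitespace.
    """
    src = [line.strip() for line in src_lines]
    tgt = [line.strip() for line in tgt_lines]
    if len(src) != len(tgt):
        raise ValueError(
            f"line count mismatch: {len(src)} source vs {len(tgt)} target"
        )
    return [(s, t) for s, t in zip(src, tgt) if s and t]


def read_parallel(src_path, tgt_path):
    """Read two line-aligned UTF-8 files (assumed layout) into pairs."""
    with open(src_path, encoding="utf-8") as fs, \
            open(tgt_path, encoding="utf-8") as ft:
        return pair_lines(fs, ft)
```

Failing loudly on a line-count mismatch is deliberate: a single missing line silently shifts every later pair out of alignment.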

The Sanskrit and Hindi fastText embeddings, trained on the data we collected, are available at https://drive.google.com/file/d/1k5INFw9oaxV7yoWRg0qscmcFrOHVhdzW/view?usp=sharing and https://drive.google.com/file/d/1Md9N7Ux2P9JCky1_9RgL2KjXRGb_lpXj/view?usp=sharing respectively.
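
A rough sketch of loading such embeddings and comparing two words. It assumes the downloaded files are in fastText's text (.vec) format: a header line with the vocabulary size and dimension, then one word per line followed by its vector components. The actual format of the shared files is not stated here, and the names `load_vec` and `cosine` are hypothetical helpers.

```python
import math


def load_vec(lines, limit=None):
    """Parse fastText text-format vectors from an iterable of lines.

    Returns (dim, vectors), where vectors maps each word to a list of
    floats. Lines whose field count does not match the header dimension
    are skipped. `limit` optionally caps the number of words loaded.
    """
    it = iter(lines)
    header = next(it).split()
    dim = int(header[1])  # header is "<vocab_size> <dim>"
    vectors = {}
    for line in it:
        parts = line.rstrip().split(" ")
        if len(parts) != dim + 1:
            continue  # malformed or truncated line
        vectors[parts[0]] = [float(x) for x in parts[1:]]
        if limit is not None and len(vectors) >= limit:
            break
    return dim, vectors


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)
```

To use it on a file, pass an open file handle, for example `load_vec(open("sanskrit.vec", encoding="utf-8"), limit=50000)`, where `sanskrit.vec` is a placeholder file name. If the files are in fastText's binary (.bin) format instead, this parser does not apply.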

Unsupervised MT

Run Unsupervised_MT.ipynb

Code files for this are also available at https://github.com/priyanshu2103/UnsupervisedMT.git

Supervised SMT

Run Supervised_Statistical_MT.ipynb

Supervised NMT

Run Supervised_Neural_MT.ipynb