frozentoad9 / CMST

Prabhupadavani: A Code-mixed Speech Translation Data for 25 languages
Apache License 2.0

Request for an English-Hindi-Sanskrit code-mixed dataset #2

Open RahulSundar opened 5 months ago

RahulSundar commented 5 months ago

Dear Authors, I came across this work while looking for code-mixed datasets for speech translation (ST) workloads. I am specifically looking for English, Sanskrit, and Hindi code-mixed datasets. If such a dataset is not already available, I would like your input on whether it is possible to create one. Regards, Rahul Sundar

Jivnesh commented 5 months ago

Our Prabhupadavani dataset contains English, Sanskrit, and Bengali. I am not aware of any source that also has Hindi with Sanskrit and English.
