deeppavlov / DeepPavlov

An open source library for deep learning end-to-end dialog systems and chatbots.
https://deeppavlov.ai
Apache License 2.0

What are the steps for creating another language support (eg: Korean)? #1117

Closed: Eugen2525 closed this issue 4 years ago

Eugen2525 commented 4 years ago

Hi,

I am considering adding support for the Korean language, and I would like to know the requirements and the steps I need to take.

Could you please elaborate on the steps and also on the feasibility of doing so?

Thanks!

yoptar commented 4 years ago

Hi @Eugen2525, there are at least two problems when building solutions for established tasks in a new language: datasets and tokenization. Which tasks are you interested in? We do have two multi-language models you can take a look at: https://demo.deeppavlov.ai/#/mu/textqa

Eugen2525 commented 4 years ago

Dear @yoptar, thanks for the prompt reply. I want to create a chatbot just like the default one, which runs on the Cambridge restaurant dataset.

So, what steps should I take to make a chatbot for Korean? I know for sure that I need a dataset like the Cambridge restaurant dataset to train the bot, but what else needs to be done?

Could you please elaborate...

yoptar commented 4 years ago

We have two tutorials for that: basic and extended. Almost everything in them should work for Korean. However, the basic config uses English embeddings, so you would need to replace them with Korean ones.
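To illustrate the embedding swap, here is a minimal sketch that works on a config already loaded into a Python dict. It assumes that components are described by a "class_name" key, that embedders carry a "load_path" key, and that the embedder is registered as "glove" or "fasttext"; the toy config, the other component names, and the vector file names are all hypothetical, so check them against the config you actually use.

```python
import copy
from typing import Any

# Assumption: embedder components are identified by these "class_name" values.
EMBEDDER_CLASS_NAMES = {"glove", "fasttext"}


def replace_embedder_paths(node: Any, new_load_path: str) -> int:
    """Recursively set "load_path" on every embedder-like component, in place.

    Returns the number of components that were changed.
    """
    changed = 0
    if isinstance(node, dict):
        class_name = node.get("class_name")
        if isinstance(class_name, str) and class_name in EMBEDDER_CLASS_NAMES:
            node["load_path"] = new_load_path
            changed += 1
        # Embedders may be nested inside other components, so keep descending.
        for value in node.values():
            changed += replace_embedder_paths(value, new_load_path)
    elif isinstance(node, list):
        for item in node:
            changed += replace_embedder_paths(item, new_load_path)
    return changed


# Hypothetical, heavily simplified config; a real one has many more fields.
toy_config = {
    "chainer": {
        "pipe": [
            {"class_name": "hypothetical_tokenizer"},
            {
                "class_name": "hypothetical_bot",
                "embedder": {
                    "class_name": "glove",
                    "load_path": "english_vectors.txt",
                },
            },
        ]
    }
}

# Work on a copy so the original English config stays untouched.
korean_config = copy.deepcopy(toy_config)
n_changed = replace_embedder_paths(korean_config, "korean_vectors.txt")
```

The modified dict can then be saved as a new config file or passed on to training. The Korean vectors also need the dimensionality that the rest of the config expects.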

gasyoun commented 4 years ago

"basic config uses embeddings for English"

Nothing with Russian? What's the quickest way to build a VK and FB bot?

Eugen2525 commented 4 years ago

I am not sure whether you are asking if Russian is supported (I saw that it is), but for Korean, I just translated the datasets and it worked.

For quick prototyping, you can pick any dataset, translate it, and try the approach mentioned above. That is what I did, at least.
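The dataset-translation approach could be sketched roughly as follows. This is an assumption-heavy illustration, not the actual script: it assumes each dialog is a list of turns stored as dicts with a "text" field, and `translate` stands for whatever translation function you supply (the toy glossary below is only a stand-in).

```python
from typing import Callable, Dict, List

Turn = Dict[str, str]


def translate_dialogs(
    dialogs: List[List[Turn]],
    translate: Callable[[str], str],
    text_key: str = "text",
) -> List[List[Turn]]:
    """Return a copy of the dialogs with `text_key` translated in every turn.

    All other fields (for example action labels) are copied unchanged.
    """
    translated = []
    for dialog in dialogs:
        new_dialog = []
        for turn in dialog:
            new_turn = dict(turn)  # shallow copy, the input is not modified
            if text_key in new_turn:
                new_turn[text_key] = translate(new_turn[text_key])
            new_dialog.append(new_turn)
        translated.append(new_dialog)
    return translated


# Stand-in translator: a tiny glossary lookup that leaves unknown text as-is.
# In practice this would call a translation service or use human translations.
GLOSSARY = {"hello": "안녕하세요", "thank you": "감사합니다"}


def toy_translate(text: str) -> str:
    return GLOSSARY.get(text.lower(), text)


# Hypothetical two-turn dialog with made-up field names.
english_dialogs = [
    [
        {"speaker": "user", "text": "Hello"},
        {"speaker": "user", "text": "Thank you", "act": "thanks"},
    ]
]
korean_dialogs = translate_dialogs(english_dialogs, toy_translate)
```

Slot values and any database entries the bot looks up would presumably need to be translated consistently with the utterances, otherwise the bot cannot match them.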