RasaHQ / pokedex-demo

Rasa Demo for a digital assistant for pokemon
Apache License 2.0

Added action_translate_name to enable translations #21

Closed juandes closed 3 years ago

juandes commented 4 years ago

This PR, which relates to issue #8, is a first attempt at translating Pokemon names. For now, only one Pokemon, Snorlax, has its French, German, and Spanish name variants in the database file.

The translation feature uses a custom action. Below is an example. Note that the language name is in lowercase.

Your input ->  What is the french name of Snorlax?
Snorlax's french name is Ronflex.
Your input ->  What is the spanish name of Snorlax?
Snorlax's spanish name is Snorlax.
Your input ->  What is the german name of Snorlax?
Snorlax's german name is Relaxo.
Your input ->  What is the dutch name of Snorlax?
I don't know that translation.

sara-tagger commented 4 years ago

Thanks for submitting a pull request 🚀 @chkoss will take a look at it as soon as possible ✨

koaning commented 4 years ago

@juandes nice. This is certainly in line with how I wanted to handle this but we should perhaps discuss a few caveats.

  1. How hard is it to get the full pokemon name list for all the languages? Snorlax is a good start but we'd preferably also have support for other languages.
  2. How might we handle the situation if the chatbot receives a German name and we'd like to get a Japanese one? Part of me wonders if this is best handled in the does_pokemon_exist action but I'm curious to hear your thoughts on it.
  3. To my knowledge the Dutch never made their own translations and always use the English version. I can also imagine that the number of possible translations is limited so we might be able to have a list of languages in a lookup table.

juandes commented 4 years ago

Hi @koaning, you raise very valid points. Here are my answers.

  1. How hard is it to get the full pokemon name list for all the languages? Snorlax is a good start but we'd preferably also have support for other languages.

Do you mean having all the translated names of each Pokemon? It is possible, but it won't be trivial. I couldn't find any dataset listing all the languages (except for Japanese), so I thought I might have to crawl Bulbapedia (a Pokemon wiki) and build my own. Update: done (see comment below).

  2. How might we handle the situation if the chatbot receives a German name and we'd like to get a Japanese one? Part of me wonders if this is best handled in the does_pokemon_exist action but I'm curious to hear your thoughts on it.

My first thought would be to extend the intent:name_translation entities so that instead of having an entity language we could have a given_language (German in this case), and a target_language (Japanese). This way we could use the action I'm proposing here.
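
A minimal sketch of the two-entity idea, written in plain Python rather than as a Rasa action. The entity names `given_language` and `target_language` come from the proposal above; the `NAMES` table, the `find_pokemon` and `translate_between` helpers, and the "unknown Pokemon" message are hypothetical, and the data is limited to names already mentioned in this thread.

```python
from typing import Dict, Optional

# Hypothetical in-memory stand-in for the JSON database file.
# Keys are canonical English names; values map language -> localized name.
NAMES: Dict[str, Dict[str, str]] = {
    "Snorlax": {"english": "Snorlax", "french": "Ronflex", "german": "Relaxo"},
    "Mewtwo": {"english": "Mewtwo", "german": "Mewtu"},
}


def find_pokemon(name: str, given_language: str) -> Optional[str]:
    """Return the canonical key of the Pokemon called `name` in `given_language`."""
    for canonical, translations in NAMES.items():
        localized = translations.get(given_language)
        if localized is not None and localized.lower() == name.lower():
            return canonical
    return None


def translate_between(name: str, given_language: str, target_language: str) -> str:
    """Translate `name` from `given_language` into `target_language`."""
    given = given_language.lower()
    target = target_language.lower()
    canonical = find_pokemon(name, given)
    if canonical is None:
        # Hypothetical fallback message, not taken from the bot.
        return f"I don't know a Pokemon called {name} in {given}."
    translated = NAMES[canonical].get(target)
    if translated is None:
        return "I don't know that translation."
    return f"{name}'s {target} name is {translated}."


print(translate_between("Relaxo", "German", "French"))
```

In an actual custom action the two language values would presumably be read from the extracted entities or slots instead of being passed as arguments.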

  3. To my knowledge the Dutch never made their own translations and always use the English version. I can also imagine that the number of possible translations is limited so we might be able to have a list of languages in a lookup table.

The lookup table is a cool idea. I'll add it to this PR.
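
One way the lookup check could work, sketched in plain Python. It assumes the lookup table is a plain text file with one language per line; the file content, the `load_languages` and `answer` helpers, and the tiny `NAMES` table are hypothetical stand-ins, not the code in this PR.

```python
from typing import Dict, Set

# Hypothetical content of a language lookup file (one entry per line).
LOOKUP_FILE_CONTENT = """\
french
german
"""

# Hypothetical excerpt of the name database, keyed by lowercase English name.
NAMES: Dict[str, Dict[str, str]] = {
    "snorlax": {"french": "Ronflex", "german": "Relaxo"},
}


def load_languages(text: str) -> Set[str]:
    """Parse lookup-file text into a set of lowercase language names."""
    return {line.strip().lower() for line in text.splitlines() if line.strip()}


SUPPORTED = load_languages(LOOKUP_FILE_CONTENT)


def answer(pokemon: str, language: str) -> str:
    """Check the language against the lookup set before looking up the name."""
    language = language.lower()
    if language not in SUPPORTED:
        return "I do not yet support that language."
    translated = NAMES.get(pokemon.lower(), {}).get(language)
    if translated is None:
        return "I don't know that translation."
    return f"{pokemon}'s {language} name is {translated}."
```

Checking the language first keeps the two failure cases apart: an unsupported language versus a supported language with no entry for that Pokemon.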

juandes commented 4 years ago

EDIT: I found most translations here https://github.com/sindresorhus/pokemon/tree/master/data. I'll re-do the PR :D

EDIT 2: I added the German and French names to the whole dataset. Also, there is now a language lookup file used to check whether the bot supports a language. Here are some new examples:

Your input ->  What is the french name of Snorlax?
Snorlax's french name is Ronflex.
Your input ->  What is the spanish name of Snorlax?
I do not yet support that language.
Your input ->  Translate Mewtwo to german.
Mewtwo's german name is Mewtu.

koaning commented 4 years ago

I like the extra dataset, but have you checked the license of the project where you got the data from? Either way, it would be good to mention it in the readme as a source.

I guess we still have the Japanese/Korean/Russian names. It'd be cool to have those, but I'd prefer we first transliterate them into Latin characters so folks who read English can read them phonetically. It's fine to have that be a separate PR though.

juandes commented 4 years ago

The project uses an MIT license (https://github.com/sindresorhus/pokemon/blob/master/license) :)

Regarding other languages, I totally agree with you. I'll see if I can find a dataset with the names transliterated into Latin letters. Also, just to clarify, I'm not using the language datasets as they are; I simply added their data to our already existing JSON file. Thanks!