Open avidale opened 1 year ago
Hi, thanks for your suggestion! Currently, you can use the dataframe and check whether a language appears in the task names. But that's not enough: some datasets store the language in a dedicated column that is removed during preprocessing. So it's not great, I agree. Proper language handling is on my roadmap.
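A minimal sketch of that name-based workaround, in plain Python. The task names and the helper `mentions_language` are hypothetical, made up for illustration; they are not part of the package. The same function could be mapped over whichever dataframe column holds the task names.

```python
import re

def mentions_language(task_name, lang_codes):
    """Return True if any language code appears as a separate token in the name."""
    # Split on anything that is not a letter, so "xnli/fr" -> ["xnli", "fr"].
    tokens = re.split(r"[^a-z]+", task_name.lower())
    return any(code.lower() in tokens for code in lang_codes)

# Hypothetical task names, for illustration only.
task_names = ["xnli/fr", "xnli/de", "glue/mnli", "paws-x/fr", "french_reviews"]

french_tasks = [name for name in task_names if mentions_language(name, ["fr"])]
print(french_tasks)
```

Note that `french_reviews` is not matched, which shows the limitation described above: the check only works when the language code is spelled out as its own token in the name.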
Yes, adding the language IDs to the dataframe would be a great first step.
Another potential enhancement is to make the file recast.py localizable, so that users could provide the prompt templates in their chosen language instead of the default (English).
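To make the idea concrete, here is a rough sketch of what language-keyed templates could look like. Everything in it is an assumption for illustration: `TEMPLATES`, `render_prompt`, and the field names are hypothetical and do not reflect how recast.py is actually written.

```python
# Hypothetical templates keyed by language code; not taken from recast.py.
TEMPLATES = {
    "en": "Premise: {premise}\nHypothesis: {hypothesis}",
    "fr": "Prémisse : {premise}\nHypothèse : {hypothesis}",
}

def render_prompt(fields, lang="en", templates=TEMPLATES):
    """Fill the template for `lang`, falling back to English if it is missing."""
    template = templates.get(lang, templates["en"])
    return template.format(**fields)

example = {"premise": "Il pleut.", "hypothesis": "Le sol est mouillé."}
print(render_prompt(example, lang="fr"))
```

Passing a custom `templates` dictionary would let users supply their own translations without touching the defaults.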
Currently, the package doesn't allow choosing the language. I think many people who are developing models for specific languages (or language sets) would like to be able to access task data for a given language, so implementing this functionality would be a great help.