Open KMax opened 9 years ago
Also, why not translate the product descriptions too? :)
Unfortunately, the Yandex.Translate API is limited:
the volume of translated text is capped at 1,000,000 characters per day and no more than 10,000,000 per month. http://legal.yandex.ru/translate_api/
1,000,000 characters per day is not sufficient for translating names and descriptions.
I misunderstood this limit: I read "requests" instead of "characters" (facepalm)
The limit allows us to translate names only for ~20,000 items.
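For reference, the arithmetic behind that estimate can be sketched as below. The average name length of 50 characters is not a measured value; it is simply what 1,000,000 characters spread over ~20,000 items implies, and `items_within_budget` is a hypothetical helper for illustration only.

```python
DAILY_CHAR_LIMIT = 1_000_000     # from the Yandex.Translate terms quoted above
MONTHLY_CHAR_LIMIT = 10_000_000  # from the same terms


def items_within_budget(lengths, budget):
    """Count how many leading items fit into a character budget."""
    used = 0
    count = 0
    for n in lengths:
        if used + n > budget:
            break
        used += n
        count += 1
    return count


# Assumption: every name is 50 characters long (1,000,000 / 20,000).
assumed_lengths = [50] * 300_000
per_day = items_within_budget(assumed_lengths, DAILY_CHAR_LIMIT)
per_month = items_within_budget(assumed_lengths, MONTHLY_CHAR_LIMIT)
print(per_day, per_month)
```

Under the same assumption, the monthly cap is used up after ten full daily runs, so the monthly figure matters as much as the daily one.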
I see a few possible options:
As a workaround, I suggest writing a script that queries the names and descriptions which don't have translations yet and translates them with the API until it hits the limit. The script can be started manually, so we could run it once a day until all the products are translated.
The script could also write the translated triples to a file, so we could reuse the translations later.
In the future, we shouldn't update the whole dataset, only the changed pieces, so we may not exceed the limit.
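A minimal sketch of such a script, under several assumptions: the `translate` argument is a placeholder for whatever API client is used (no real Yandex.Translate call is shown), the saved translations are kept in a plain JSON mapping rather than as triples, and all function names here are hypothetical.

```python
import json
from pathlib import Path

DAILY_CHAR_LIMIT = 1_000_000  # daily cap quoted earlier in this thread


def load_cache(path):
    """Load previously saved translations, or start empty if no file exists."""
    p = Path(path)
    if p.exists():
        return json.loads(p.read_text(encoding="utf-8"))
    return {}


def save_cache(path, cache):
    """Persist translations so later runs (and later dumps) can reuse them."""
    text = json.dumps(cache, ensure_ascii=False, indent=2)
    Path(path).write_text(text, encoding="utf-8")


def translate_pending(texts, translate, cache, budget=DAILY_CHAR_LIMIT):
    """Translate texts missing from the cache until the budget is spent.

    `translate` is any callable taking a source string and returning its
    translation. Returns the number of characters sent for translation.
    """
    used = 0
    for text in texts:
        if text in cache:
            continue  # already translated in an earlier run
        if used + len(text) > budget:
            break  # stop before exceeding the limit; resume on the next run
        cache[text] = translate(text)
        used += len(text)
    return used


if __name__ == "__main__":
    # Stand-in translator for a dry run; replace with a real API client.
    fake_translate = lambda t: "[en] " + t
    cache = {}
    names = ["МОЛОКО 1Л", "ХЛЕБ РЖАНОЙ", "МОЛОКО 1Л"]
    print(translate_pending(names, fake_translate, cache, budget=15))
    print(cache)
```

Because already-translated strings are skipped, rerunning the script once a day continues where the previous run stopped, and only new or changed names consume the budget.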
Do we really need Russia-specific products to be translated? Products sold here can't be found outside Russia without export from Russia and import procedures afterwards, which include label translation (something like a white sticker with a description in the language of the destination country). Moreover, proper nouns and nominals are normally not translated. They're transliterated, transcribed or rendered as a loan translation instead, which the Yandex API won't do. It's sensible to store descriptions of foodstuffs in their native language, so Russian products should have descriptions in Russian (if no other manufacturer description is provided) and imported goods should be described in the language of the manufacturer's country. Otherwise we'll get something like Иван -> John.
Maxim(@m-lapaev), from my pov, the goal is to let non-Russian users of foodpedia understand at least something when they open a product page. We want to publish an article in a non-Russian journal, and it looks awkward that we show everything in Russian.
Maxim(@KMax), in other words, we can organize a separate updatable vocabulary (dataset) with ru-en translations. That may work.
@chistyakov is right, we know that the machine translations won't be as good as we would like, but it's much better than nothing.
So, actually, the point is to present contents, categories and other data in English, but not food names. Just imagine translating a food name into Russian, say, the German beer "Berliner" --> "Берлинец". We risk producing something like [1] or [2].
@m-lapaev if you look at the first message in this thread, you will see what we mean by "name". And I accept the risk of having really bad translations for some products.
A partially translated dump is uploaded to production: http://foodpedia.tk/page/4600209002117
Only the first 20,000 items were translated.
example of bad translation: http://foodpedia.tk/page/8436018292830 http://foodpedia.tk/page/4607105861152
One of our goals is to provide multilingual support. One step toward this goal is to automatically translate the names of the products using an external API, such as Yandex.Translate or any other.
An example:
МОЛОЧНАЯ КАША "МАЛЫШКА". МУЛЬТИЗЛАКОВАЯ СО СМЕСЬЮ ФРУКТОВ 250Г
MILK PORRIDGE "BABY". MULTISECULAR WITH A MIXTURE OF FRUIT 250G
Of course, automatic translation isn't as good as a translation provided by the manufacturer, but at the moment it's better than nothing.