Cay-Zhang / RSSBudRules

Schema documentation for and primary host of RSSBud rules, a superset of RSSHub Radar rules.
MIT License

Complete English translation #1

Closed schaafjs closed 1 year ago

schaafjs commented 2 years ago

Hi there, thank you very much for developing and providing this app. Would you be able to fully translate it into English? I have attached two screenshots: one showing missing translations for the different feed types and one showing missing translations in the "Shortcut Workshop" section of the settings.

[Screenshot: Missing translations for different feed types]

[Screenshot: Missing translations in "Shortcut Workshop"]

If I can be of any assistance during translation or for proofreading, feel free to ask!

Thank you very much.

Cay-Zhang commented 2 years ago

Hey,

English translation for the shortcut workshop should come with v2 which is still under development (very slow though).

For RSSHub feeds, I don't really have any control over translations since RSSBud relies on the rule file provided by the RSSHub project (learn more about it here). I did propose some solutions for translating RSSHub rule titles in their Telegram group but sadly nothing came out of it.

By the way, in v2 the path for an RSSHub feed will be shown in conjunction with its title so hopefully, non-Chinese speakers could understand its meaning a little better.

schaafjs commented 2 years ago

Hi,

thanks for your reply.

Couldn't I also just localize the file you linked? Since it's JSON, one could just automatically translate the title property, yielding proper English localization of the displayed rules. This would mean that you'd have to be able to set a custom URL for the rules, but apart from that it should work, right? Localization should be possible via GitHub Actions plus some translation API. The one offered by DeepL has a quite generous free tier of 500k characters a month, for example.
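A minimal sketch of the "translate only the title property" idea, assuming the rules can be loaded as a nested object whose entries carry a `title` string. The names `mapTitles`, `TranslateFn`, `sampleRules`, and `lookup` are hypothetical, and the sample data is made up. A real translation API call would be asynchronous, so the synchronous `translate` callback here stands in for a lookup into translations fetched beforehand.

```typescript
// Hypothetical callback type: maps a source title to its translation.
type TranslateFn = (text: string) => string;

// Walk any nested value and translate every string stored under a "title" key.
// Everything else (other strings, numbers, functions) is passed through unchanged.
function mapTitles(node: unknown, translate: TranslateFn): unknown {
  if (Array.isArray(node)) {
    return node.map((child) => mapTitles(child, translate));
  }
  if (node !== null && typeof node === "object") {
    const source = node as Record<string, unknown>;
    const out: Record<string, unknown> = {};
    for (const key of Object.keys(source)) {
      const value = source[key];
      out[key] =
        key === "title" && typeof value === "string"
          ? translate(value)
          : mapTitles(value, translate);
    }
    return out;
  }
  return node;
}

// Made-up sample data, loosely shaped like a Radar-style rule entry.
const sampleRules = {
  "example.com": {
    _name: "示例",
    www: [{ title: "最新文章", source: "/posts", target: "/example/posts" }],
  },
};

// Stand-in for translations obtained from an API ahead of time.
const lookup: Record<string, string> = { "最新文章": "Latest posts" };
const translatedRules = mapTitles(sampleRules, (text) => lookup[text] || text);
```

Titles missing from `lookup` fall back to the original text, so a partially translated file stays usable.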

If you'd consider making the rules URL a custom setting, I can have a look at the translation part.

While this doesn't really solve the whole issue at hand and wouldn't directly contribute to the upstream project, I still think it is the best bet for now.

Cay-Zhang commented 2 years ago

Yes, I am planning on supporting custom rule sources, with an expanded schema and support for multiple rules!

Having a different rule file automatically generated with translated titles sounds pretty cool since it doesn't require a schema change from the RSSHub side. It would be much appreciated if you could help create a GitHub action that translates rule titles into an arbitrary language.

schaafjs commented 2 years ago

> It would be much appreciated if you could help create a GitHub action that translates rule titles into an arbitrary language.

I'll do some research into what's the best way to go about this and then look into implementing it.

I think a separate repository to house the initial rules file and the derived translations plus the final GitHub action would probably be the best place, right?

Cay-Zhang commented 2 years ago

Thanks for your help! I've opened up a new repo for RSSBud rules. Feel free to contribute there. I'll be designing the extended schema in the meantime.

schaafjs commented 2 years ago

Sorry for the late reply. I was quite busy in the last week and will take a look at the translations in the coming days.

It would probably make sense to update the ruleset in that repo just like in this one using the update-rules workflow.

Cay-Zhang commented 2 years ago

Thanks for reminding me! Just added the workflow.

schaafjs commented 2 years ago

Hi there,

I can provide a small update to progress on the autotranslation. I have checked out a couple of different translation tools but have not found one that is fully sufficient as of yet. Two somewhat final candidates for now are the following:

For now, it seems we have to either wait or program our own solution.

Considering the latter option, we might be lucky using LibreTranslate, which has mirrors that do not require an API key. Would you mind checking out their translation quality? Translations can be made using their web interface.

Btw, it would probably be more fitting to continue this conversation over in the new repo, you can transfer the issue if you wish.

Cay-Zhang commented 2 years ago

Thanks for your efforts! I think our only option now is to program our own script to translate the rules.

I looked at LibreTranslate and their translation quality is rather suboptimal. DeepL does look good, and we can go with the free tier.

Do you have experience in writing custom GitHub Actions?

schaafjs commented 2 years ago

> I think our only option now is to program our own script to translate the rules.

Agreed.

> Do you have experience in writing custom GitHub Actions?

I do have a little experience, yes.

A couple thoughts on this:

Cay-Zhang commented 2 years ago

> I would probably use Python, since it is easy to use as a scripting language and has many libraries which should interact nicely with text or JSON files.

I'll remind you that we are working with JS objects, not JSONs :) Maybe JS/TS is a better fit here.

> This should be possible by consulting the Git log and taking a look at the latest changes to the source rules file. I am however open to (better) suggestions.

What about diffing the JS objects directly? A simple search yields some results for JavaScript, but I'm not sure if Python can directly deal with or diff native JS objects.

And then for the output, it might be impossible to output JS objects as their code, so we are back at plain text processing again. I think generating a dictionary of updated tokens to translated tokens and doing a full-text replacement would suffice.

schaafjs commented 2 years ago

> I'll remind you that we are working with JS objects, not JSONs :) Maybe JS/TS is a better fit here.

Good call! I had in fact missed that so far. Iterating over JS objects should be easy enough, though. I am, however, not very experienced with JS or TS. DeepL does offer a Node.js library for their API: DeepLcom/deepl-node.

> What about diffing the JS objects directly? A simple search yields some results for JavaScript, but I'm not sure if Python can directly deal with or diff native JS objects. And then for the output, it might be impossible to output JS objects as their code, so we are back at plain text processing again. I think generating a dictionary of updated tokens to translated tokens and doing a full-text replacement would suffice.

Not sure if I follow completely but does this actually work for all cases? Updating should work, but what about inserting and deleting? If a new object is inserted, it does not exist in the translation dictionary which should be detectable, same thing for deleting an object. This does however require a full pass over both radar-rules.js and the translated dictionary.
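The insert/delete detection described here can be sketched as a plain set difference between the titles in the current source file and the keys of the translation dictionary. This is only an illustration: `diffTitles` and `TitleDiff` are hypothetical names, and the dictionary is assumed to be a flat source-title to translated-title map.

```typescript
interface TitleDiff {
  added: string[];   // titles in the source file that have no dictionary entry yet
  removed: string[]; // dictionary keys that no longer appear in the source file
}

// One pass over the source titles and one pass over the dictionary keys.
function diffTitles(sourceTitles: string[], dict: Record<string, string>): TitleDiff {
  const has = (obj: object, key: string): boolean =>
    Object.prototype.hasOwnProperty.call(obj, key);

  // Index the source titles so each membership check is a constant-time lookup.
  const seen: Record<string, true> = {};
  for (const title of sourceTitles) {
    seen[title] = true;
  }

  const added = Object.keys(seen).filter((title) => !has(dict, title));
  const removed = Object.keys(dict).filter((key) => !has(seen, key));
  return { added, removed };
}

// Made-up example: one new title, one obsolete dictionary entry.
const titleDiff = diffTitles(
  ["最新文章", "热门"],
  { "最新文章": "Latest posts", "已删除": "Removed entry" }
);
```

An updated title shows up as one removal plus one addition, so the three cases (update, insert, delete) reduce to the two lists.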

Another thing to consider: how are multiple languages handled? Do we add multiple translations for the same object to the object itself, or do we want to maintain separate dictionaries for the different languages? IMHO the second option makes more sense, since this would result in one rules file per language, meaning the only thing someone would have to configure in their client would be the URL of the rules file.
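To make the second option concrete, here is a small sketch with one dictionary per target language and one generated rules file per language. Everything in it is an assumption for illustration: the `rules/<locale>/` layout, the names `LanguageDict`, `dictsByLocale`, and `rulesFileFor`, and the sample translations.

```typescript
type LanguageDict = Record<string, string>;

// Hypothetical in-memory layout: one separate dictionary per target language.
const dictsByLocale: Record<string, LanguageDict> = {
  "en-US": { "最新文章": "Latest posts" },
  "de-DE": { "最新文章": "Neueste Beiträge" },
};

// Hypothetical output layout: each language gets its own copy of the rules file.
function rulesFileFor(locale: string, fileName: string): string {
  return "rules/" + locale + "/" + fileName;
}

// One output file per dictionary; a client only needs the path for its language.
const outputFiles = Object.keys(dictsByLocale).map((locale) =>
  rulesFileFor(locale, "radar-rules.js")
);
```

With this shape the rule objects themselves keep a single `title` string, so no schema change is needed on the consumer side.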

Cay-Zhang commented 2 years ago

I'm sorry for neglecting this for so long; been busy with work recently... But here's some good news: the extended rule schema and multiple rule file support are both implemented on RSSBud's main branch! If you are interested in joining our TestFlight group, feel free to reach out to me on Telegram.

> IMHO the second option makes more sense since this would result in one rules file per language meaning the only thing someone would have to configure in their client would be the URL of the rules file.

I completely agree!

So here's the plan I have for translating rules, please let me know if you have better ideas :)

For every target language, we store a dictionary and the translated rule file. When the source rule file updates, we extract all the rule titles and compare them with the keys in our dictionary files. New keys are translated by DeepL and obsolete keys are removed. Using the updated dictionary, we do full-text replacements for each item in the dictionary, swapping the key string literal for the value string literal (e.g. replace all "source" with "target"; this way we make sure that we are not accidentally changing the actual code). This approach makes the dictionary editable so we can refine the translations if we want.
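The plan above can be sketched roughly as follows. This is not the actual implementation: the function names are hypothetical, titles are assumed to appear in the source text as `title: '...'` or `title: "..."` without escaped quotes, and `translate` is a synchronous stand-in for a call to a translation service such as DeepL.

```typescript
type Dict = Record<string, string>;

// Step 1: collect the distinct rule titles from the source text.
// Assumption: titles are written as title: '...' or title: "..." with no escaped quotes.
function extractTitles(text: string): string[] {
  const pattern = /title:\s*(['"])(.*?)\1/g;
  const titles: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(text)) !== null) {
    const title = match[2];
    if (typeof title === "string" && titles.indexOf(title) === -1) {
      titles.push(title);
    }
  }
  return titles;
}

// Step 2: keep existing (possibly hand-edited) entries, translate new titles,
// and drop obsolete keys by only copying titles that still exist.
function syncDict(titles: string[], dict: Dict, translate: (text: string) => string): Dict {
  const next: Dict = {};
  for (const title of titles) {
    const cached = dict[title];
    next[title] = typeof cached === "string" ? cached : translate(title);
  }
  return next;
}

// Step 3: swap whole string literals only, so identifiers and other code are not touched.
// JSON.stringify emits a double-quoted literal with any quotes in the value escaped.
function applyDict(text: string, dict: Dict): string {
  let output = text;
  for (const key of Object.keys(dict)) {
    const value = dict[key];
    if (typeof value !== "string") continue;
    for (const quote of ["'", '"']) {
      output = output.split(quote + key + quote).join(JSON.stringify(value));
    }
  }
  return output;
}

// Made-up example run with a fake translator.
const sourceText =
  "({ 'example.com': { www: [{ title: '最新文章', source: '/posts' }, { title: '热门', source: '/hot' }] } })";
const oldDict: Dict = { "最新文章": "Latest posts", "已删除": "Removed entry" };
const fakeTranslate = (text: string): string => "[en] " + text;

const newDict = syncDict(extractTitles(sourceText), oldDict, fakeTranslate);
const translatedText = applyDict(sourceText, newDict);
```

Because existing dictionary entries are reused, only new titles reach the translator, and manual corrections in the dictionary survive the next update.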

I will try to implement this strategy with Node.js in the next few weeks.

schaafjs commented 2 years ago

> I'm sorry for neglecting this for so long

No need to apologize, you're spending your spare time on an issue that I'm having! 😊

> So here's the plan I have for translating rules, please let me know if you have better ideas :)

That sounds reasonable and should work. Using the dictionary as a caching strategy is an excellent idea to avoid going over the free API quota! Adding each translation to the dictionary once also means that translation errors can be fixed manually without being overwritten in the next update. 👍

> I will try to implement this strategy with Node.js in the next few weeks.

No pressure. If I can help with testing, feel free to let me know. I'll contact you on Telegram for the TestFlight group access. Done.

Cay-Zhang commented 2 years ago

The translation mechanism as described in my last comment is done! If you are on the latest TestFlight version, you can now go to Settings -> Rules, change the remote URLs of the rules to the following, and tap Save and update:

https://raw.githubusercontent.com/Cay-Zhang/RSSBudRules/main/rules/en-US/radar-rules.js
https://raw.githubusercontent.com/Cay-Zhang/RSSBudRules/main/rules/en-US/rssbud-rules.js

The dictionaries are in the dicts folder, so you can also submit PRs for translation improvements now!

I'll set up some GitHub Actions next :)

Cay-Zhang commented 2 years ago

GitHub Actions are set up 🎉! Rule fetching and translation should happen every hour or whenever RSSBud rules have changed.

schaafjs commented 2 years ago

Thank you so much for implementing this! I have been using it the last couple of weeks, and it works perfectly. I will look into improving some of the translations when I get the time.