miniflux / v2

Minimalist and opinionated feed reader
https://miniflux.app
Apache License 2.0

Language Translation API #1015

Open apayne opened 3 years ago

apayne commented 3 years ago

There are occasional RSS feeds in languages that are not the native language of the user. Often, the articles in the feeds do not offer translations, leaving the user of miniflux in a situation where text needs to be copied into a translation service, usually one that is online.

Has the author ever considered adding a "hook" to translate feeds into the user's language? This would come in two pieces:

  1. An API hook, which would interact with the translation service, sending text to be translated and receiving back text in the language native to the user account. This could be either a web-based service, or more importantly, a local or self-hosted service (which allows the reader to maintain their privacy).

  2. A secondary display mode, consisting of two columns. The first column would be the original article text, so that the author's original content is preserved for inspection. This is important, as machine translation services are not always perfect and having the original text allows the user to spot mistranslated words, or gain nuance into what is being said. The second column would be the translated text in the language selected by the user. This directly maps to the user's language setting.

Beyond those changes, I don't think much else would change. The user would see the article with both columns (left side being in the original language, right side being translated). The articles would continue to be retained like they always have, and the translation would occur on-demand.

If there are limits to translating every article on demand, a "Translate" button could instead trigger the API call to the translation service manually; this would keep users from running up bills on commercial services that charge by the session, paragraph, and so on.
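As a rough illustration of the API hook described above, here is a minimal Go sketch that sends article text to a translation service and reads back the result. It assumes a LibreTranslate-style endpoint (`POST /translate` with `q`, `source`, `target`, and `format` fields, answering with `translatedText`). The `Translate` function and the `fakeTranslationServer` helper are hypothetical names for this example, not part of Miniflux; the fake server only exists so the sketch runs offline.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// translateRequest mirrors the JSON body of a LibreTranslate-style POST /translate call.
type translateRequest struct {
	Q      string `json:"q"`
	Source string `json:"source"`
	Target string `json:"target"`
	Format string `json:"format"`
}

// translateResponse holds the only response field this sketch reads.
type translateResponse struct {
	TranslatedText string `json:"translatedText"`
}

// Translate (hypothetical helper) posts text to baseURL+"/translate" and
// returns the translated text. It is meant to be called on demand, for
// example when the user presses a "Translate" button.
func Translate(client *http.Client, baseURL, text, source, target string) (string, error) {
	body, err := json.Marshal(translateRequest{Q: text, Source: source, Target: target, Format: "html"})
	if err != nil {
		return "", err
	}
	resp, err := client.Post(baseURL+"/translate", "application/json", bytes.NewReader(body))
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return "", fmt.Errorf("translation service returned %s", resp.Status)
	}
	var out translateResponse
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		return "", err
	}
	return out.TranslatedText, nil
}

// fakeTranslationServer stands in for a real service: it "translates" by
// prefixing the text with the target language code.
func fakeTranslationServer() *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.URL.Path != "/translate" {
			http.NotFound(w, r)
			return
		}
		var req translateRequest
		if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
			http.Error(w, "bad request", http.StatusBadRequest)
			return
		}
		w.Header().Set("Content-Type", "application/json")
		json.NewEncoder(w).Encode(translateResponse{TranslatedText: "[" + req.Target + "] " + req.Q})
	}))
}

func main() {
	srv := fakeTranslationServer()
	defer srv.Close()

	out, err := Translate(srv.Client(), srv.URL, "<p>Hola</p>", "es", "en")
	if err != nil {
		fmt.Println("translation failed:", err)
		return
	}
	fmt.Println(out)
}
```

Pointing `baseURL` at a self-hosted instance instead of a public one is the only change needed to keep article text on the user's own machine.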

mredaelli commented 3 years ago

I would also be interested in something along these lines, as I have to follow feeds in languages I don't know.

I personally don't think it necessary to have a dual-column display (would actually dislike it: on my Android it would probably be borderline-unusable, plus what about Fever?), and would be happy with storing the translated version (if I want the original I can just click on the link).

This can be accomplished with the existing infrastructure by a (series of) translation rewrite rules and some settings for the translation services.

Go is not my language of choice, but would be happy to try my hands if this approach makes sense.

apayne commented 3 years ago

Hi @mredaelli, thanks for the follow-up comment. I wanted to clarify the "two column" part of the proposal and add some detail.

The only reason it exists in the proposal is to ensure that the translation quality is adequate. Admittedly the "highlight a sentence on one side and see the translated sentence on the other side" feature of translate.google.com drives some of this discussion when it comes to the "wide" dual-column layout. I do agree that on a "tall" format display like a mobile device this would be totally inadequate, but for a "wide" display like a regular computer, having a side-by-side comparison helps ensure you're really reading what is being written in a language you don't speak. I've compared translations between Google Translate and other services, and I occasionally encounter significant differences in the results. A common source of translation fun is to put a sentence into Google Translate, translate it into Latin, then translate back into English, then back into Latin...lather, rinse, repeat. The resulting hilarity from the cascading errors in translation showcases how you sometimes need to have the original alongside the translation to ensure that not only the words, but the intent, of the text is kept more or less intact.

The method for verification can be something else entirely; perhaps we can replace the dual-display with this amended request instead:

  1. an additional pull-down at the top of the article that allows the following selections:
    1. Display original text as stored in the database (example: it would say "Original (pt_PT)" if the article is in Portuguese, or some other similar indicator of the language used by the original article)
    2. Select one web-based translation service to translate the article to the user's default language in their miniflux profile (example: "Translate via Google")
    3. Select one self-hosted translation service to translate the article to the user's default language in their miniflux profile (example: "Translate via LibreTranslate")
    4. In the event that there is only one translation API set up in the profile, only that service would be displayed
    5. In the event there is no translation API available, only the "Original" option would remain.
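The selection logic in the list above can be sketched in a few lines of Go. `MenuOptions` is a hypothetical helper, and the service names are just the examples from the list; the point is that the "only one service" and "no service" cases fall out of the same loop.

```go
package main

import "fmt"

// MenuOptions (hypothetical helper) builds the pull-down labels for an article.
// The first entry is always the original text; one entry is added for each
// translation service configured in the user's profile.
func MenuOptions(originalLang string, services []string) []string {
	options := []string{fmt.Sprintf("Original (%s)", originalLang)}
	for _, name := range services {
		options = append(options, "Translate via "+name)
	}
	return options
}

func main() {
	// Both a web-based and a self-hosted service configured.
	fmt.Println(MenuOptions("fr_FR", []string{"Google", "LibreTranslate"}))
	// Only one service configured.
	fmt.Println(MenuOptions("fr_FR", []string{"LibreTranslate"}))
	// No service configured: only the original remains.
	fmt.Println(MenuOptions("fr_FR", nil))
}
```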

This would eliminate the need for the dual-display in the original proposal, allow the user a choice of either web or self-hosted service, and the miniflux author only needs to add two API hooks, one for web translation, one for local translation.

The translation service, however, needs to be a separate entity. It is important from two aspects: the first being minimal impact on the codebase of miniflux itself, keeping with the author's intent of focused development. The second aspect is to give the end-user freedom to use the service they want, which could be web-based, self-hosted, or some other arrangement.
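One way to keep the translation service a separate entity is to have the reader depend only on a small interface. The Go sketch below is an assumption about how that boundary might look; `Translator`, `stubTranslator`, and `TranslateEntry` are hypothetical names, and the stub stands in for whatever web-based or self-hosted backend the user configures.

```go
package main

import "fmt"

// Translator is a hypothetical interface: the only thing the reader would
// need to know about any translation backend.
type Translator interface {
	Name() string
	Translate(text, targetLang string) (string, error)
}

// stubTranslator is a placeholder backend used to keep the sketch offline.
type stubTranslator struct{ name string }

func (s stubTranslator) Name() string { return s.name }

func (s stubTranslator) Translate(text, targetLang string) (string, error) {
	return "[" + targetLang + " via " + s.name + "] " + text, nil
}

// TranslateEntry is the reader-side code. It falls back to the original
// content when no backend is configured or when the backend fails.
func TranslateEntry(t Translator, content, userLang string) string {
	if t == nil {
		return content
	}
	out, err := t.Translate(content, userLang)
	if err != nil {
		return content
	}
	return out
}

func main() {
	fmt.Println(TranslateEntry(nil, "Bonjour", "en"))
	fmt.Println(TranslateEntry(stubTranslator{name: "LibreTranslate"}, "Bonjour", "en"))
}
```

Swapping between a web service and a self-hosted one then means providing a different `Translator` value, with no change to the reader-side function.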

versun commented 6 months ago

You could try RSS-Translator (https://github.com/rss-translator/RSS-Translator). It takes a feed and generates a translated version of it.