GeopJr / Tuba

Browse the Fediverse
https://tuba.geopjr.dev/
GNU General Public License v3.0

[Request]: Translate posts #813

TriVoxel closed 4 months ago

TriVoxel commented 7 months ago

Description

Hey, I know this would likely be complex, but I'd love for translation support! Mastodon on the web has this, and it is a really nice thing to have when you have global users speaking foreign languages, whose posts you wish to enjoy.

Perhaps you could use the same library as Dialect, which has the ability to automatically detect languages, and manually override them, as well as source multiple translators?

Proposed design

This is my best implementation idea.

The existing UI in the upper right corner of posts (three dots) can have a new "Translate" option which, when clicked, opens a small menu (perhaps a small modal with an AdwStatusPage with a clean "translator" logo and brief title and description, similar in dimension to the About page) which contains three subpage buttons including "Translation Service", "Input language", and "Output Language", similar in design to the likes of the GTK About page in how they slide open. The last two options mentioned above contain a searchable list, where the first and default option is "Auto-detect", followed by all the languages supported by the translation engine in alphabetical order. Lastly, there is a pill-shaped "Translate" button at the bottom which is automatically selected when the menu is opened so users can press the icon and then hit enter to auto-translate it. To close this menu, simply press Esc, or the X button on the window title, just like About or Preferences.

Specifics

  1. When the menu is first opened, a blank page with a simple spinner and label such as "detecting language" can be displayed until the engine detects the language, after which, the menu appears with a quick fade animation.
  2. The language lists need a search bar at the top to filter them quickly; it should be focused by default so users can start typing right away.
  3. After the "Translate" button is pressed, the original text is replaced in-line with the translated text. Of course, the original text still needs to be kept loaded, so that it can be re-translated if needed.
  4. Additionally, the "Auto-detect" option should show the resolved language in parentheses. For the source language, it should show the language that was detected, and for the output language, it should show the system language.
  5. I think the input language selection should not be persistent and should always default to auto. This is because the language may vary from post to post, and the auto-detected language will likely be the best fit. The "Translation Service" and "Output language" settings should be persistent, so that users can pick their favorite service and language once and have them stored for future interactions. This would cover cases where the user is, e.g., learning a foreign language that differs from their system language, prefers a particular translation engine, or finds that automatically using the system language is wrong. The user can easily set it back to auto anyway, as it would be the top option in the list.
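The persistence rules in points 4 and 5 can be sketched in a few lines. This is only an illustration in Python, not Tuba code (Tuba is written in Vala); every name here (`TranslatePrefs`, `DialogState`, `open_dialog`, the `"auto"` sentinel, the default service string) is hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

AUTO = "auto"  # hypothetical sentinel meaning "Auto-detect"


@dataclass
class TranslatePrefs:
    """Choices that are stored between posts (point 5)."""
    service: str = "libretranslate"  # placeholder default, an assumption
    output_language: str = AUTO      # AUTO resolves to the system language


@dataclass
class DialogState:
    """State of one translate dialog; created fresh for every post."""
    prefs: TranslatePrefs
    system_language: str
    detected_language: Optional[str] = None
    input_language: str = AUTO  # never persisted, always starts at AUTO

    def effective_input(self) -> Optional[str]:
        # A manual override wins; otherwise use whatever was detected.
        if self.input_language != AUTO:
            return self.input_language
        return self.detected_language

    def effective_output(self) -> str:
        # A stored preference wins; otherwise fall back to the system language.
        if self.prefs.output_language != AUTO:
            return self.prefs.output_language
        return self.system_language

    def auto_labels(self) -> tuple:
        # Point 4: show the resolved language in parentheses.
        source = self.detected_language or "?"
        return (f"Auto-detect ({source})", f"Auto-detect ({self.system_language})")


def open_dialog(prefs: TranslatePrefs, system_language: str,
                detected: Optional[str]) -> DialogState:
    """Open the dialog for a post: prefs are reused, the input choice is not."""
    return DialogState(prefs=prefs, system_language=system_language,
                       detected_language=detected)
```

Because `open_dialog` builds a new `DialogState` each time, a manual input-language override on one post cannot leak into the next, while `prefs` is shared and keeps the service and output language.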

Design theory

I think an AdwStatusPage would work well: it is pretty, it would make the functionality intuitive and robust enough to fit most needs, it fits in with the GNOME ecosystem, and the logo and description would give it enough padding for the subpages to display plenty of list items without excessive scrolling or resizing. It would also somewhat simplify the development, I think, because it would be easier than a popover menu with changing dimensions, which might interfere with other functionality. I also think it would work well for both mobile and desktop users this way.

This would keep things simple for the user and the UI: it allows translation with two button presses, looks clean, and offers a high degree of user control, while staying tucked away behind an unassuming icon.

Perhaps, if Dialect's libraries are used, the icon and description could be a monochrome version of their logo and a label like "Powered by Dialect", with "Dialect" being a link to get the app from Flathub, which might bring them more interest and help further improve translation on Linux. Just a thought, though.

TLDR

I think with Dialect already being a native LibAdwaita translator app with good results, the UI could be re-written in a new form factor to be used as a quick, in-app translation utility, using the same libraries the Dialect team has already expertly implemented in another GTK4 app, to prevent duplicated effort and speed up the process.

Additional considerations

As Dialect uses some proprietary translation services such as Google Translate, I think it would be fair to mention this. Perhaps in Preferences, there could be a toggle for "Enable proprietary translation services", which by default is disabled. In the description, there could be a brief mention of the names of the services, with links to the respective privacy policies. Alternatively, we could simply say "Learn more in the 'Legal' page under 'About'", and place that information there in the legal page on the About page of the program, which may even make it more accessible.

By default, only FOSS translators would be present in the UI, unless the user specifically enables the proprietary ones, which may have better accuracy at the cost of privacy, a choice we can leave up to the user. Perhaps we could also place a little symbolic yellow warning badge next to the non-FOSS options that, when hovered, states "This provider may collect your data. See About/Legal".

Additionally, perhaps a FOSS translation engine should always be used for pre-computing the input language, so the data is only sent to the proprietary providers once the user fully commits to translating the message.
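A rough sketch of that opt-in logic, in Python for illustration only. The `Provider` type, the `foss` flag, and the function names are hypothetical, and the provider list is just an example built from services named in this thread, not a claim about what Dialect actually ships.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Provider:
    name: str
    foss: bool  # hypothetical flag: True for free/open-source services


# Illustrative list only.
PROVIDERS = [
    Provider("LibreTranslate", foss=True),
    Provider("Google Translate", foss=False),
    Provider("DeepL", foss=False),
]


def visible_providers(providers: List[Provider],
                      allow_proprietary: bool = False) -> List[Provider]:
    """Providers shown in the UI; proprietary ones are hidden unless enabled."""
    return [p for p in providers if p.foss or allow_proprietary]


def needs_warning_badge(provider: Provider) -> bool:
    """Non-FOSS options get the yellow 'may collect your data' badge."""
    return not provider.foss


def detection_provider(providers: List[Provider]) -> Optional[Provider]:
    """Always pick a FOSS provider for pre-computing the input language."""
    for p in providers:
        if p.foss:
            return p
    return None  # no FOSS provider available: skip auto-detection
```

With the toggle off, `visible_providers(PROVIDERS)` contains only the FOSS entry, and `detection_provider` never returns a proprietary service regardless of the toggle, so post text only reaches a proprietary provider after the user presses "Translate".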


TriVoxel commented 7 months ago

Side note, I really wish I could program ):

GeopJr commented 7 months ago

Oof that's a lot of implementation designing that's probably going to be left unused unfortunately :sob:

Apart from the proprietary services, Dialect provides a self-hosted LibreTranslate instance. But making all Tuba users use that could mean more bandwidth than the Dialect team expects, so I'd avoid it. I think they are working on integrating the new Firefox translation system; if that goes well, I wouldn't be opposed to adding it, since it happens locally!

Another thing to consider is privacy. You are taking someone's post (that might be private) and sending it to Dialect's LibreTranslate, Google Translate, DeepL, etc., which the author might not trust.

The solution is to use Mastodon's API! Translations are provided by the instance and are available for all apps to use, taking care of privacy concerns by only allowing translation on public posts.

My main blocker, which I haven't looked into much yet, is that to check whether the instance has enabled translations, you have to use the v2 API, which is not implemented by most of the other backend software.
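As a sketch of what that check could look like (Python for illustration, not Tuba's Vala code): to my knowledge, Mastodon's `GET /api/v2/instance` response reports translation support under `configuration.translation.enabled`, and a status is translated with `POST /api/v1/statuses/:id/translate`. Treat those field and path names as assumptions to verify against the Mastodon API documentation; the helper names below are hypothetical.

```python
from typing import Optional


def translation_enabled(instance_v2: Optional[dict]) -> bool:
    """Return True only if a v2 instance document says translation is on.

    Pass None (or {}) when the backend does not implement the v2 instance
    endpoint; that case is treated as "translation unavailable".
    """
    config = (instance_v2 or {}).get("configuration") or {}
    translation = config.get("translation") or {}
    return bool(translation.get("enabled", False))


def translate_url(base_url: str, status_id: str) -> str:
    """Build the URL to POST to in order to translate a status."""
    return f"{base_url.rstrip('/')}/api/v1/statuses/{status_id}/translate"
```

Treating a missing v2 endpoint as "disabled" means the "Translate" menu entry would simply not appear on backends that lack it, which is one conservative way around the blocker.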

Design-wise, I'd probably do as you said with the three-dot menu, but replace the post content in-place, since that's all the API provides.

> Side note, I really wish I could program ):

While I cannot give you advice, if it makes you feel somewhat motivated, forking Tootle into Tuba was the first time I touched Vala, and while the original plan was to just keep the lights on... here we are now!

TriVoxel commented 7 months ago

@GeopJr Well, I was afraid that might be the case. I think it would be good to have a "quick translate" feature using the Mastodon API as you described. It would probably be best to stick to official solutions, and would be simpler. Thanks anyways for taking the time to explain your rationale!

However, I still think having a separate, more powerful translator such as what I described would be beneficial for some users, depending on the language of the posts they are translating. I think the Firefox translator idea is solid, and would be good. Perhaps we could have two translation features, a "quick" one using Mastodon's API, and an "advanced" one similar to what I described, hopefully running locally on the system if possible, perhaps optionally with proprietary options as well, accompanied by an appropriate privacy warning.

We certainly don't want to DDoS the Dialect guys, nor do we want to export sensitive data to a 3rd party, so I 100% agree with your decision.

> While I cannot give you advice, if it makes you feel somewhat motivated, forking Tootle into Tuba was the first time I touched Vala, and while the original plan was to just keep the lights on... here we are now!

I really appreciate this! It gives me some optimism. I'm really only good at making static webpages and shell scripts, but I really like LibAdwaita and want to make some cool apps, but am intimidated by Rust, C, and Vala... I suppose I just need to start small! In your opinion, should I use Blueprints, or XML?

GeopJr commented 7 months ago

Maybe more advanced translation should be left for Dialect to handle? I can file a feature request with Dialect, maybe for a URI handler, so we can call something like dialect:// and the Dialect app will open.

Maybe a workflow like:

flowchart TD
    A[Translate] --> B{Mastodon API}
    B -->|Success| C[Replace Post]
    B -->|Failed| D{dialect://}
    D -->|Success| E[Open Dialect]
    D -->|Failed| F{Dialect's Libretranslate}
    F -->|Success| G[Open In Browser]
    F -->|Failed| H[Uh oh, failed!!]

That way we avoid spamming Dialect's instance if the others are available. Tuba will take care of privacy by only allowing translation of public posts. I'll have to ask the Dialect team first, though.

If the above is possible, I'd much prefer it, as it promotes the indie app ecosystem instead of adding everything to Tuba!
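The workflow above is a plain fallback chain, which could be sketched like this (Python for illustration only; `run_fallbacks`, `translate_post`, and the step callables are hypothetical, and the dialect:// handler is still just a proposal at this point):

```python
from typing import Callable, List, Optional, Tuple

# Each step pairs the outcome label from the flowchart with a callable
# that returns True on success. The callables are stand-ins.
Step = Tuple[str, Callable[[], bool]]


def run_fallbacks(steps: List[Step]) -> Optional[str]:
    """Try each step in order; return the first outcome that succeeds."""
    for outcome, attempt in steps:
        try:
            if attempt():
                return outcome
        except Exception:
            # A crashing step counts as "Failed" and falls through.
            continue
    return None  # corresponds to "Uh oh, failed!!"


def translate_post(is_public: bool, steps: List[Step]) -> Optional[str]:
    """Privacy gate first: only public posts are ever sent anywhere."""
    if not is_public:
        return None
    return run_fallbacks(steps)
```

For example, with steps `[("Replace Post", mastodon_api), ("Open Dialect", dialect_uri), ("Open In Browser", libretranslate)]`, Dialect's LibreTranslate instance is only contacted when both earlier options have failed.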

GeopJr commented 7 months ago

> I'm really only good at making static webpages and shell scripts, but I really like LibAdwaita and want to make some cool apps, but am intimidated by Rust, C, and Vala... I suppose I just need to start small!

If you are more comfortable with JavaScript, the GJS team has great guides at https://gjs.guide/. Python is also popular for GTK apps. In general, use whatever you are most comfortable with! The Matrix channels for most languages will also provide support if needed.

> I suppose I just need to start small! In your opinion, should I use Blueprints, or XML?

Blueprint makes UI layout clearer, and if it makes things easier for you, nothing else matters. On Tuba I've been holding off due to the lack of guides. It's relatively new, and I wouldn't want contributors who might not be fully familiar with it to struggle, when XML Builder files are widely used in many apps and they can look up how other apps deal with complex layouts.