mastodon / mastodon

Your self-hosted, globally interconnected microblogging community
https://joinmastodon.org
GNU Affero General Public License v3.0

I always forget to set post language #19893

Open davidak opened 1 year ago

davidak commented 1 year ago

Steps to reproduce the problem

  1. have set a default language
  2. post something that is not in the default language
  3. notice it after it's posted and it already has favs and replies, so I can't just redraft

Expected behaviour

a design that prevents this issue

Actual behaviour

I just post in my set default language

Detailed description

Post editing helps when you notice it yourself after posting. When creating a post, it should be checked whether the set language matches the detected one. If not, show it, e.g. by making the button red. For people who often switch languages, the default language setting could offer a "detect language" option. If detection is 99% reliable, we could make that the default.
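A minimal sketch of the check described above, assuming a pluggable detector. The stopword-based `toy_detect` and the name `should_warn` are hypothetical and only stand in for whatever detection Mastodon would actually use; the point is the comparison between the selected and the detected language.

```python
from typing import Callable, Optional

# Hypothetical toy detector: counts common function words per language.
# A real deployment would swap in a proper language-detection library.
STOPWORDS = {
    "en": {"the", "and", "is", "to", "of", "this", "that", "it"},
    "de": {"der", "die", "das", "und", "ist", "nicht", "ich", "ein"},
}


def toy_detect(text: str) -> Optional[str]:
    """Return the language whose stopwords occur most often, or None if none occur."""
    words = [w.strip(".,!?;:\"'()").lower() for w in text.split()]
    counts = {lang: sum(w in vocab for w in words) for lang, vocab in STOPWORDS.items()}
    best = max(counts, key=counts.get)
    return best if counts[best] > 0 else None


def should_warn(text: str, selected: str,
                detect: Callable[[str], Optional[str]] = toy_detect) -> bool:
    """True when a language was detected and it differs from the selected one."""
    detected = detect(text)
    return detected is not None and detected != selected
```

A compose form could call `should_warn(draft, default_language)` on each change and, for example, turn the language button red when it returns `True`. Returning `False` when nothing is detected keeps very short posts from triggering the warning.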

Specifications

Mastodon v3.5.3 WebUI

ashemedai commented 1 year ago

A possible solution could be to leverage language detection from something like chardet.

davidak commented 1 year ago

They seem to have language auto-detection implemented for translating posts without a specified language, so it should be easy to use it in this case as well. https://github.com/mastodon/mastodon/pull/19244

GPhMorin commented 1 year ago

There could be a way to "flag" the language on the message somewhere (so you can see the posting language in the timeline or in the message details), and to manually change this language by using the new Edit feature.

GPhMorin commented 1 year ago

I have just noticed that I can change the language using the Edit button. But I still can't visually tell the difference.

promovicz commented 1 year ago

As a multilingual person, I would like this solved using language detection, because selecting the language manually is a real nuisance.

It also doesn't help that default Mastodon doesn't display the post language anywhere.

davidak commented 1 year ago

Post editing helps when you notice it yourself after posting. When creating a post, it should be checked whether the set language matches the detected one. If not, show it, e.g. by making the button red. For people who often switch languages, the default language setting could offer a "detect language" option.

venthur commented 1 year ago

I agree. I believe a lot of people actually forget to set the correct language when posting things. On a big instance such as mastodon.social, I limited the languages I see in the timelines to English and German and still see a firehose of toots in other languages.

I believe some language detection in the frontend that nudges the user if the detected language does not match the selected (or ignored) language would go a long way toward making the timelines more enjoyable for everyone.

blackjack75 commented 1 year ago

Not a solution for the web clients, but given how many people use apps, this could simply be solved for a majority (?) if native clients just derived the language from the on-screen keyboard used to write the post.

venthur commented 1 year ago

I see that Mastodon is using LibreTranslate, which has a feature to detect the language of a given text. This could be a good starting point for a fix.
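A rough sketch of what calling that detection feature could look like. It assumes a LibreTranslate server that exposes a `POST /detect` endpoint taking a `q` field and returning a JSON list of objects with `language` and `confidence` keys; the base URL and the function names are hypothetical, and a server that requires an API key would need an extra `api_key` field.

```python
import json
import urllib.parse
import urllib.request
from typing import Optional


def build_detect_request(base_url: str, text: str) -> urllib.request.Request:
    """Build a POST request for the /detect endpoint (form-encoded `q` field)."""
    data = urllib.parse.urlencode({"q": text}).encode("utf-8")
    return urllib.request.Request(base_url.rstrip("/") + "/detect", data=data, method="POST")


def best_language(response_body: str, min_confidence: float = 0.0) -> Optional[str]:
    """Pick the highest-confidence language from a /detect JSON response."""
    candidates = json.loads(response_body)
    if not candidates:
        return None
    top = max(candidates, key=lambda c: c["confidence"])
    return top["language"] if top["confidence"] >= min_confidence else None


def detect_language(base_url: str, text: str) -> Optional[str]:
    """Call the server and return its best guess (performs a network request)."""
    with urllib.request.urlopen(build_detect_request(base_url, text), timeout=10) as resp:
        return best_language(resp.read().decode("utf-8"))
```

Keeping the response parsing in `best_language` separate from the network call makes it possible to test the selection logic without a running server.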

skerit commented 1 year ago

I'm seeing so many posts marked with the wrong language, the language filter feature is almost useless right now. Most of the time English is involved: either the post is in English marked as something else, or the other way around.

I point it out to people sometimes, and the answer is always that they had no idea it was a feature or that they don't know how to change it. Doesn't help that some apps (like Tusky) don't even provide an option to change it per post.

Some people told me Mastodon used to have language detection, but that it didn't work properly so the current system was put in place. Maybe we need a hybrid system. Language detection works a lot better when you limit the number of possible languages.

nemobis commented 1 year ago

Language detection was removed in https://github.com/mastodon/mastodon/pull/17478

das-g commented 1 year ago

In #21631 I propose a different approach to remedy the problem of forgetting to set the post's language correctly.

ronilaukkarinen commented 1 year ago

I've formed a habit out of this, but I agree with everything above: the filtering currently doesn't work because so many forget to set the language.

promovicz commented 1 year ago

Some clients don't even allow setting the language, so some people give up on tagging their posts.

I have experimented with a Python (sic) library called Lingua for language detection on the Federated feeds. The results looked quite encouraging - especially if the detection is seeded with the user's language spectrum on long posts, the language spectrum that they tag with, and possibly their client-side language selection.

I do not understand why earlier experiments at language detection have yielded bad results. It must have been a simplistic implementation that does not seed the detector and does not consider language clustering (people don't speak all languages at the same time, some languages like English tend to co-occur, some languages are very easy to detect, replies should consider the original post's language).
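One way the seeding idea could be sketched: take the raw per-language scores from any detector and weight them by how often the user has tagged posts with each language, with an extra boost for the language of the post being replied to. The function name `rerank`, the `floor` weight and the `reply_boost` factor are all hypothetical choices for illustration, not something taken from Lingua or Mastodon.

```python
from typing import Dict, Optional


def rerank(scores: Dict[str, float], prior: Dict[str, float],
           reply_to: Optional[str] = None, reply_boost: float = 2.0,
           floor: float = 0.01) -> str:
    """Combine raw detector scores with a per-user prior and return the best language.

    scores:   detector confidence per language for this post (must be non-empty)
    prior:    share of the user's earlier posts tagged with each language
    reply_to: language of the post being replied to, if any
    floor:    small weight for languages the user has never used
    """
    weighted = {}
    for lang, score in scores.items():
        weight = prior.get(lang, floor)
        if lang == reply_to:
            weight *= reply_boost  # replies tend to be in the original post's language
        weighted[lang] = score * weight
    return max(weighted, key=weighted.get)
```

With scores of `{"nl": 0.5, "de": 0.4, "en": 0.1}` and a user who has only ever tagged German and English, Dutch is pushed down by the `floor` weight and German wins, which is the "people don't speak all languages at the same time" effect described above.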

In fact, it seems more likely to me that detection will work better than translation, and it's a dependent problem.