superseriousbusiness / gotosocial

Fast, fun, small ActivityPub server.
https://docs.gotosocial.org
GNU Affero General Public License v3.0
3.57k stars, 300 forks

[feature] Better handling for post languages #2061

Open tsmethurst opened 11 months ago

tsmethurst commented 11 months ago

There are a couple of things we could be doing for posts to better surface their language property, which we don't currently do:

We should try and do some or all of these, to make GtS easier to use for folks who post/read posts in multiple different languages!

VyrCossont commented 11 months ago

Today's #2066 is also related to post languages.

tevino commented 10 months ago

Somewhat related: #1277. This one is from the previous year.

zordsdavini commented 7 months ago

Tried to change the language directly through the DB; that didn't help.

cass-dlcm commented 5 months ago

Hi, I'd like to note that #2066 isn't fully resolved. While BCP 47 states the tag structure, it delegates listing the tags to the Language Subtag Registry, maintained by IANA.

There are language tags in that registry that aren't options in the GoToSocial backend.

I'd like to help work on this, but I'm unfamiliar with the project's code and would like some guidance on how to proceed.
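
To illustrate the split described above (BCP 47 defines the tag structure, the IANA registry lists the valid subtags), here is a minimal Go sketch that reads records in the registry's `%%`-separated `Field: value` format and collects the primary language subtags. The `sample` excerpt is hand-written for illustration rather than copied from the IANA file, and `languageSubtags` is a hypothetical helper, not a function in the GoToSocial codebase.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// sample is a hand-written excerpt in the registry's record format
// (records separated by "%%", one "Field: value" pair per line).
// It is illustrative only, not a verbatim copy of the IANA file.
const sample = `%%
Type: language
Subtag: en
Description: English
%%
Type: language
Subtag: tok
Description: Toki Pona
%%
Type: region
Subtag: GB
Description: United Kingdom
`

// languageSubtags (hypothetical helper) maps each subtag whose record
// has Type "language" to its first Description.
func languageSubtags(registry string) map[string]string {
	out := make(map[string]string)
	rec := map[string]string{}
	// flush stores the current record if it describes a language.
	flush := func() {
		if rec["Type"] == "language" && rec["Subtag"] != "" {
			out[rec["Subtag"]] = rec["Description"]
		}
		rec = map[string]string{}
	}
	sc := bufio.NewScanner(strings.NewReader(registry))
	for sc.Scan() {
		line := sc.Text()
		if line == "%%" {
			flush()
			continue
		}
		// Folded continuation lines (leading space) are skipped in this sketch.
		if strings.HasPrefix(line, " ") {
			continue
		}
		if key, val, ok := strings.Cut(line, ": "); ok {
			if _, seen := rec[key]; !seen { // keep only the first value per field
				rec[key] = val
			}
		}
	}
	flush() // the last record has no trailing "%%"
	return out
}

func main() {
	langs := languageSubtags(sample)
	desc, ok := langs["tok"]
	fmt.Println(desc, ok)
}
```

A list built this way could be compared against whatever set of languages a server offers, to find registry entries that are missing.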

tsmethurst commented 5 months ago

Ah really, darn. Can you reproduce an example where you want to set a language for a post via the API but the (valid) tag isn't parsed correctly?

I would expect that we can cover all "primary language" subtags, but "extended language" and "region" subtags might be spotty.
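
For reference, the subtag kinds mentioned here can be told apart by position and shape alone. The Go sketch below is a simplified reading of the BCP 47 syntax: a 2 to 3 letter primary language, up to three 3-letter extended language subtags, an optional 4-letter script, and an optional region of 2 letters or 3 digits. The `Subtags` type and `splitTag` function are hypothetical names for illustration, not GoToSocial's parser. The sketch does not validate subtags against the registry, and it rejects grandfathered and private-use-only tags.

```go
package main

import (
	"fmt"
	"strings"
)

// Subtags holds the leading parts of a BCP 47 language tag.
// Hypothetical type for illustration only.
type Subtags struct {
	Primary string   // primary language, e.g. "zh" or "tok"
	ExtLang []string // extended language subtags, e.g. ["yue"]
	Script  string   // e.g. "Hant", kept as written
	Region  string   // e.g. "HK" or "419"
	Rest    []string // variants, extensions, private use (not classified)
}

// isAlpha reports whether s is non-empty and consists of ASCII letters.
func isAlpha(s string) bool {
	for _, r := range s {
		if (r < 'a' || r > 'z') && (r < 'A' || r > 'Z') {
			return false
		}
	}
	return s != ""
}

// isDigit reports whether s is non-empty and consists of ASCII digits.
func isDigit(s string) bool {
	for _, r := range s {
		if r < '0' || r > '9' {
			return false
		}
	}
	return s != ""
}

// splitTag classifies subtags by position and shape only.
// It returns false if the first subtag is not a 2-3 letter language.
func splitTag(tag string) (Subtags, bool) {
	parts := strings.Split(tag, "-")
	var st Subtags
	if n := len(parts[0]); n < 2 || n > 3 || !isAlpha(parts[0]) {
		return st, false
	}
	st.Primary = strings.ToLower(parts[0])
	i := 1
	// Extended language subtags: exactly three letters each, at most three.
	for i < len(parts) && len(st.ExtLang) < 3 && len(parts[i]) == 3 && isAlpha(parts[i]) {
		st.ExtLang = append(st.ExtLang, strings.ToLower(parts[i]))
		i++
	}
	// Script: exactly four letters.
	if i < len(parts) && len(parts[i]) == 4 && isAlpha(parts[i]) {
		st.Script = parts[i]
		i++
	}
	// Region: two letters or three digits.
	if i < len(parts) {
		p := parts[i]
		if (len(p) == 2 && isAlpha(p)) || (len(p) == 3 && isDigit(p)) {
			st.Region = strings.ToUpper(p)
			i++
		}
	}
	st.Rest = parts[i:]
	return st, true
}

func main() {
	for _, tag := range []string{"tok", "zh-yue-Hant-HK", "es-419"} {
		st, ok := splitTag(tag)
		fmt.Printf("%s: ok=%v %+v\n", tag, ok, st)
	}
}
```

A check that only looks at `Primary` would accept all three example tags, which matches the expectation that primary language coverage is the easy part.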

cass-dlcm commented 5 months ago

I've found that some languages are missing from the "default posting language" setting. I have also found that I can post things in a language that's missing from that setting's options, but I don't know if the language tag in the post is preserved.

I sent this status using fedilab, should be tagged as being in Toki Pona. Can someone check that for me: https://fedi.cass-dlcm.dev/@cassdlcm/statuses/01HMRS1PWD3FDWM85JHR4WYMAA

tsmethurst commented 5 months ago

> I've found that some languages are missing from the "default posting language" setting.

Ah right, that's known already. That "default posting language" dropdown is not exhaustive. See https://github.com/superseriousbusiness/gotosocial/issues/2306

> I sent this status using fedilab, should be tagged as being in Toki Pona. Can someone check that for me: https://fedi.cass-dlcm.dev/@cassdlcm/statuses/01HMRS1PWD3FDWM85JHR4WYMAA

The language should be federated correctly. This is from the AP JSON of that status:

  "contentMap": {
    "tok": "<p>toki!</p>"
  },

I see it doesn't show up on the web view of the status though.
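
For anyone who wants to repeat this check on another status, the `contentMap` fragment quoted above can be read with a few lines of Go. This is a minimal sketch using only the standard library. The `note` struct and `languagesOf` function are hypothetical names, and the input is assumed to be the status's ActivityPub JSON, already fetched by some other means.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// note is a hypothetical, minimal view of an ActivityPub object;
// only the contentMap property is decoded.
type note struct {
	ContentMap map[string]string `json:"contentMap"`
}

// languagesOf returns the language tags used as contentMap keys, sorted.
func languagesOf(apJSON []byte) ([]string, error) {
	var n note
	if err := json.Unmarshal(apJSON, &n); err != nil {
		return nil, err
	}
	langs := make([]string, 0, len(n.ContentMap))
	for tag := range n.ContentMap {
		langs = append(langs, tag)
	}
	sort.Strings(langs)
	return langs, nil
}

func main() {
	// The same fragment as quoted in the comment, wrapped in an object.
	doc := []byte(`{"contentMap": {"tok": "<p>toki!</p>"}}`)
	langs, err := languagesOf(doc)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	fmt.Println(langs) // prints [tok]
}
```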