trigram-mrp / fractureiser

Information about the fractureiser malware (June 2023)
Creative Commons Attribution Share Alike 4.0 International

Translations #79

Closed fantomitechno closed 1 year ago

fantomitechno commented 1 year ago

Message from @williewillus on how to format and work properly on translations:

Replicate the main tree under lang/zh_cn (e.g.) and note in some way which commit or date it was last updated from English. No external services. No branches. Just send PRs under a subfolder and get someone else who knows the language (and can show it) to review. Thanks

Original issue: As you have probably seen, there are quite a few forks dedicated to translation.

Doing so is a very good idea since it helps the information spread, but centralising all the traffic in this repository seems better.

But for that we need to agree on a convention for how to name our files and what the file tree will look like.

fantomitechno commented 1 year ago

Current translation forks I was able to find (still haven't checked every fork):

@LukeTech2006 https://github.com/LukeTech2006/fractureiser @Teapot4195 https://github.com/Teapot4195/fractureiser-localized @DominoKorean https://github.com/DominoKorean/fractureiser_KoreanTranslate @3TUSK https://github.com/3TUSK/fractureiser

EnderKill98 commented 1 year ago

I would propose to have one readme per language with the appropriate language code appended. The English one can simply stay README.md in my opinion. Others could be named like this: README-CN.md (Chinese), README-DE.md (German), ... (or using the 3-letter versions, up for debate).

We should add hints for each language to the main document as well, for the sake of overview and in case people do not look at the files. Whether it's a simple list or a line (maybe in the target language?) referencing the appropriate readme file is up for debate.

For the docs, I assume it would get quite messy if all the documents had alternative versions. For that reason I suggest creating a folder of translated docs per language, as with the readme (e.g. docs-cn, docs-de, ...).

For the media subfolder I think it would be optimal to only include it if the graphics got translated as well. If not, they should simply reference the English ones.

(edit: fixed grammar mistakes)

ItzSwirlz commented 1 year ago

I think we should do es.po, or have the translated files in a folder 'es'

I can do Spanish. I recommend we use Weblate or Crowdin

EnderKill98 commented 1 year ago

I have not yet worked with application level translations (po files afaik). Would they render properly inside this repo? Otherwise this sounds like a very interesting approach as well. I would assume that these translation sites would also lower barriers to entry for adding translations.

fantomitechno commented 1 year ago

I'm not entirely sure those platforms would work for entire md files :/

I know crowdin for translating json/yaml with a key/value system

EnderKill98 commented 1 year ago

Another topic we should probably address is also how these translations are gonna be managed or updated. With my initial idea, I'd assume it's quite cumbersome to have to make a PR for each update.

@ItzSwirlz's suggestion might be the solution here. Otherwise how about outright making separate repos per language (e.g. fractureiser-investigation/fractureiser-cn, fractureiser-investigation/fractureiser-de, ...) so a few people that are willing to manage updates can have commit permissions in those.

Otherwise, if the number of trusted people is not that high anyway, giving them access to this repo for updating their respective languages could also work I assume.

EDIT: However this might increase the barrier to introduce new languages, as adding a new repo each would seem like a big change and therefore discourage contributors from doing the first step in getting a new language started. Also these repos would need to be copied or transferred from some initial draft which is not as easy as just forking this repo.

ItzSwirlz commented 1 year ago

Something like the Wii-Guide would be good maybe: In the _pages/ is a folder of the locale, and then that contains all the pages

https://github.com/RiiConnect24/Wii-Guide/tree/master/_pages

EnderKill98 commented 1 year ago

> Something like the Wii-Guide would be good maybe: In the _pages/ is a folder of the locale, and then that contains all the pages

If the target were a website (e.g. a statically generated one hosted on github.io), that would probably be fine. It looks rather unusual for the approach of having it presented on the GH site itself.

fantomitechno commented 1 year ago

> Something like the Wii-Guide would be good maybe: In the _pages/ is a folder of the locale, and then that contains all the pages
>
> https://github.com/RiiConnect24/Wii-Guide/tree/master/_pages

Yes, having English as the default and having a lang/<code> folder that copies the file tree from the root could be a good idea

3TUSK commented 1 year ago

I am summoned.

My current approach is to put everything under ${project-root}/lang/zh-CN/. This is a common convention for some static site generators, and it doesn't clutter the root directory.

ItzSwirlz commented 1 year ago

cool!

Crowdin is free for FOSS projects but there's an approval process, and I don't feel like going through it for my project, stox, and I don't think it's worth it for this. Can we use hosted Weblate?

fantomitechno commented 1 year ago

I looked on the Weblate website and there's no mention of translating Markdown files

[Attached screenshot: Screenshot_2023-06-09-04-38-55-93.jpg]

(yes and their website isn't properly translated 👀)

ItzSwirlz commented 1 year ago

well when i tried for stox i kinda used my own free trial so can someone abuse a free trial for now to start the project in a fork or something lol

EnderKill98 commented 1 year ago

If it should be shown on a site, using GitHub Pages would probably be the most sensible approach. That way a static site generator would convert the markdown files to a website using GitHub Actions, and it would automatically receive its own github.io site.

I have not worked with them extensively (especially not in the GH context), but have seen a lot of repos that do this. Also most static site generators are quite good at rendering html from markdown and have good language support.

3TUSK commented 1 year ago

I feel like we are unnecessarily complicating things.

Right now I have most of the files translated; the only things left are the "Follow-up" section in technical details, as well as the meeting minutes. We haven't reached the scale where Weblate/Crowdin is necessary yet.

GitHub PRs should be fine. However, a more interesting question is: how do you verify the translation accuracy?

fantomitechno commented 1 year ago

we need reviewers from the same language as the translation that can understand English too

ItzSwirlz commented 1 year ago

> I feel like we are unnecessarily complicating things.
>
> Right now I have most of the files translated; the only things left are the "Follow-up" section in technical details, as well as the meeting minutes. We haven't reached the scale where Weblate/Crowdin is necessary yet.
>
> GitHub PRs should be fine. However, a more interesting question is: how do you verify the translation accuracy?

fair enough, though the PRs might be a bit stacked and heavy.

For me, since I'm learning Spanish, I use wordreference and spanishdict to help me out, along with previous examples in other guide/project translations. Additionally, I can check it with a Spanish teacher or someone who speaks it natively. My teacher is from Cuba

EnderKill98 commented 1 year ago

> I feel like we are unnecessarily complicating things.

I agree. Starting simple would probably be best. Both using translation sites and generating a static site would require a lot of upfront work (unless someone experienced in it can make it work quickly).

> GitHub PRs should be fine. However, a more interesting question is: how do you verify the translation accuracy?

I think for the time being we should have one or two people per translation that are trusted to translate appropriately. Having them review PRs related to a language and giving their okay for merging would probably be a good way to spread the maintenance burden.

Especially when starting with a handful of languages, I don't see the concern of a cluttered file tree. If it grows too much we can always consider changing the approach. Markdown files are quite flexible and I don't think a migration would be a big deal.

fantomitechno commented 1 year ago

For me, the PR "workflow" would be:

EnderKill98 commented 1 year ago

If we have an approach, we can use Pull Request Templates to guide pull requests into categories (language updates, other things, etc.) and use them to guide people to name them properly and provide all necessary information in an easy-to-view manner.

3TUSK commented 1 year ago

> we should have one or two people per translation that are trusted to translate appropriately.

The key issue is akin to what this incident is about: we need to establish trust somehow. How would you determine that a translator is trustworthy?

EnderKill98 commented 1 year ago

> The key issue is akin to what this incident is about: we need to establish trust somehow. How would you determine that a translator is trustworthy?

To a certain degree we just need to trust them. The worst-case harm would be a misleading translation. I don't see why people would be motivated to do this tbh.

For new languages, we could initially have people from the IRC verify translations (or have them outright be responsible if they like).

Also, we would need to ask if someone providing a new translation is even willing to check off future ones in the first place.

EnderKill98 commented 1 year ago

I also tested GitHub's templating a bit on my fork. Sadly, I don't see a way to give new PRs a fancy selector like the one I added for issues there. The only option is providing a pre-filled description with markdown that explains stuff (e.g. as HTML comments so only the creator sees them) and has check boxes confirming that they read stuff, or prompts them to fill in information.

EnderKill98 commented 1 year ago

Ah, I see there is already a PR for a new language (#80). I'm not opposed to using that structure (lang/Country-Language/\<Repo-Files>). We can then just list all the translations in the main English readme and point to the respective readmes.

To reduce the risk of completely over-engineering this matter, should we just go ahead and use that structure?
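A minimal sketch of how the list of translations for the main readme could be generated, assuming each translation lives in lang/\<name>/ with its own README.md. The function name `translations_section` and the "## Translations" heading are hypothetical and not something agreed on in this thread.

```python
from pathlib import Path


def translations_section(root: Path) -> str:
    """Build a Markdown list linking to every lang/<name>/README.md under root."""
    # Hypothetical heading; the thread does not fix the wording.
    lines = ["## Translations", ""]
    lang_dir = root / "lang"
    if lang_dir.is_dir():
        # Sort for a stable order regardless of the file system.
        for sub in sorted(p for p in lang_dir.iterdir() if p.is_dir()):
            if (sub / "README.md").is_file():
                lines.append(f"- [{sub.name}](lang/{sub.name}/README.md)")
    return "\n".join(lines) + "\n"
```

Calling `translations_section(Path("."))` from the repository root returns text that could be pasted into the English readme. Folders without a README.md are skipped, so a half-finished translation would not be advertised.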

3TUSK commented 1 year ago

> To reduce the risk of completely over-engineering this matter, should we just go ahead and use that structure?

I would go ahead and open a PR. I have already used this structure; with more PRs using the structure, we can carve this into stone via "convention over configuration".

Teapot4195 commented 1 year ago

Lang/\<lang> or docs/\<lang>

Since we're already using the docs folder I feel like we should just add subfolders for translations.

ItzSwirlz commented 1 year ago

> Lang/\<lang> or docs/\<lang>
>
> Since we're already using the docs folder I feel like we should just add subfolders for translations.

imo, lang files in docs/ is not as easy to see when lang/ is in the root directory ;)

EnderKill98 commented 1 year ago

> Since we're already using the docs folder I feel like we should just add subfolders for translations.

This could be confusing. Since GitHub shows you the readme of a directory much like it does for the main repo, it would probably be best to just clone the structure into it. It would basically look like a translated mirror of the repo, and people who only care about their language don't need to poke around other areas.

3TUSK commented 1 year ago

A more nitpicky detail that may not apply to most other locales. One of the following must be chosen:

This might be applicable to Spanish or even English itself (en-UK, en-CA, anyone?).

Teapot4195 commented 1 year ago

Ok let's go with lang/\<lang> then, we'll just have to make sure to enforce it for all new translations.

Teapot4195 commented 1 year ago

en-UK and en-CA are too similar to en-US to warrant a translation imo

EnderKill98 commented 1 year ago

> A more nitpicky detail that may not apply to most other locales. One of the following must be chosen: [...]

I think the first looks good. It is also used on this site which can also serve as a good reference.

Regarding the issue of multiple countries sharing the same translation: how about we just omit the country for those (e.g. en for English)? Not sure if that would look good. Maybe just scrap the Lang-Country formatting and go for chinese, german, etc. instead?

3TUSK commented 1 year ago

> en-UK and en-CA are too similar to en-US to warrant a translation imo

I know there won't be. The key is the nitpicky "upper vs lower case" and "underscore vs dash".

> how about for those we just omit the country?

Possible. Then you'd still need to choose one from zh-HanS and zh_HanS.

Forgetting about the ISO 639 and ISO 3166 is also possible. I initially rolled with zh-CN because of my legacy as a mod translator years ago, where I dealt with tons of zh_CN.langs.
simplified-chinese may be more readable to those who are not familiar with ISO 639 codes.
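Whichever spelling is chosen, the "upper vs lower case" and "underscore vs dash" variants can be folded into one form mechanically. The sketch below assumes the lowercase-with-underscore form shown in the message at the top of this issue (lang/zh_cn). The helper name `normalize_locale` is hypothetical, and it only rewrites separators and case; it does not check that a tag is a valid ISO 639 / ISO 3166 code.

```python
import re


def normalize_locale(tag: str) -> str:
    """Fold a locale tag like 'zh-CN' or 'zh_HanS' to lowercase with underscores."""
    # Split on either separator and drop empty pieces from stray separators.
    parts = [p for p in re.split(r"[-_]", tag.strip()) if p]
    if not parts:
        raise ValueError("empty locale tag")
    return "_".join(p.lower() for p in parts)
```

With this, `normalize_locale("zh-CN")` and `normalize_locale("zh_cn")` both give `zh_cn`, so a reviewer could compare a proposed folder name against the normalized form.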

EnderKill98 commented 1 year ago

So do we wanna go with the english language name in lower case then?

Example structure:

In this example the English README would stay the main one as it is right now. If you point someone to a folder like lang/spanish, GitHub should show them a view pretty similar to the main repo's, which makes it pretty convenient imo.

Translations can also include their own ./docs/media folder if they translate the info-graphics. Otherwise they can just refer to the English ones until someone translates them.

I know, having the English version not conform to the convention of the other languages seems a bit odd and like playing favourites. However, I think it should stay that way because of these points:

3TUSK commented 1 year ago

> Example structure: [...]

I would agree with this structure.

A static site still sounds a bit far-fetched given the fluctuating situation. I'd wait until the team decides whether they'd like a static site or not.

EnderKill98 commented 1 year ago

On another note, if we're gonna use human-readable English language names, we can probably also rename "lang" to something easier to understand and more inviting as well. How about "translation" or "translations" instead?

ItzSwirlz commented 1 year ago

I agree with the structure suggested. I was thinking of using the language name in the native language (so a folder named español), but I think people would get it either way, and the locale imo does it. Either is good

EnderKill98 commented 1 year ago

I'm fine with either as well. :+1:

EnderKill98 commented 1 year ago

> I agree with the structure suggested. I was thinking of using the language name in the native language (so a folder named español), but I think people would get it either way, and the locale imo does it. Either is good

I relayed the question to the IRC. I think we should just go with whatever received the more positive response there (pseudo vote).

xyzeva commented 1 year ago

I would think branches would be better for this instead of a format like this.

3TUSK commented 1 year ago

> I would think branches would be better for this instead of a format like this.

One major issue with branches is that normal users won't easily notice that "there are translations available" unless they have at least a rudimentary understanding of git.

Other than that, I am fine with branches.

xyzeva commented 1 year ago

> > I would think branches would be better for this instead of a format like this.
>
> One major issue with branches is that normal users won't easily notice that "there are translations available" unless they have at least a rudimentary understanding of git.
>
> Other than that, I am fine with branches.

We could make a translations section in the readme linking to the branches

3TUSK commented 1 year ago

> > > I would think branches would be better for this instead of a format like this.
>
> > One major issue with branches is that normal users won't easily notice that "there are translations available" unless they have at least a rudimentary understanding of git. Other than that, I am fine with branches.
>
> We could make a translations section in the readme linking to the branches

Hum, that also works. Waiting for other people's input.

EnderKill98 commented 1 year ago

We gain a fancy drop-down that people need to click to see anyway, and make maintaining with git worse imo. To me it seems like a misuse of git branches, as they are not meant to be essentially different repos.

Edit: (I have dealt with repos that use branches to denote different categories of files in the past and I never found them pleasant to browse or maintain. Are there any good established examples of people using branches for translations by chance?)

fantomitechno commented 1 year ago

> > > > I would think branches would be better for this instead of a format like this.
>
> > > One major issue with branches is that normal users won't easily notice that "there are translations available" unless they have at least a rudimentary understanding of git. Other than that, I am fine with branches.
>
> > We could make a translations section in the readme linking to the branches
>
> Hum, that also works. Waiting for other people's input.

Branches do look like a good idea, but they will require that translators have write access or that an issue is opened every time we want to add a new language

3TUSK commented 1 year ago

> > > > > I would think branches would be better for this instead of a format like this.
>
> > > > One major issue with branches is that normal users won't easily notice that "there are translations available" unless they have at least a rudimentary understanding of git. Other than that, I am fine with branches.
>
> > > We could make a translations section in the readme linking to the branches
>
> > Hum, that also works. Waiting for other people's input.
>
> Branches do look like a good idea, but they will require that translators have write access or that an issue is opened every time we want to add a new language

Not necessarily. The translator can target main first, then a team member creates the branch, then the translator re-targets the PR to the new branch.

fantomitechno commented 1 year ago

Oh I didn't know that

3TUSK commented 1 year ago

> Oh I didn't know that

[Attached screenshot: Screen Shot 2023-06-09 at 12 27 36 AM]

Re-targeting, in action.

3TUSK commented 1 year ago

Some more suggestions after quick discussion on IRC.

I have timestamped all my translated pages with "last updated at [date]" (in English and the target language) at the very beginning. This should give a clear idea of whether the translation is outdated or not.
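A rough sketch of how such a timestamp could be checked automatically, assuming the marker is written in English as "last updated at YYYY-MM-DD". The exact date format is an assumption (the thread only says "[date]"), and the names `translation_date` and `is_outdated` are hypothetical.

```python
import re
from datetime import date
from typing import Optional

# Assumed marker format: "last updated at 2023-06-09" (ISO date, any letter case).
MARKER = re.compile(r"last updated at (\d{4}-\d{2}-\d{2})", re.IGNORECASE)


def translation_date(text: str) -> Optional[date]:
    """Return the date from the first marker in a translated page, if any."""
    match = MARKER.search(text)
    return date.fromisoformat(match.group(1)) if match else None


def is_outdated(translated_text: str, english_changed: date) -> bool:
    """A page counts as outdated if it has no marker or predates the English change."""
    stamped = translation_date(translated_text)
    return stamped is None or stamped < english_changed
```

Treating a missing marker as outdated is a deliberate choice in this sketch: a page that cannot show when it was last synced should get a second look from a reviewer.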

williewillus commented 1 year ago

There's too much bikeshedding on here so I'm going to make an executive decision.

Replicate the main tree under lang/zh_cn (e.g.) and note in some way which commit or date it was last updated from English. No external services. No branches. Just send PRs under a subfolder and get someone else who knows the language (and can show it) to review. Thanks
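One way a translator might start from this decision is sketched below: copy the English Markdown files into lang/\<code>/ with the same layout, and record which commit they were copied from. The function name `scaffold_translation` and the marker file SOURCE_VERSION.txt are hypothetical; the decision above only asks that the commit or date be noted "in some way".

```python
import shutil
from datetime import date
from pathlib import Path


def scaffold_translation(root: Path, lang_code: str, source_commit: str) -> Path:
    """Mirror the Markdown files under root into root/lang/<lang_code>/."""
    target = root / "lang" / lang_code
    # sorted() collects the file list up front, before any copies are created.
    for src in sorted(root.rglob("*.md")):
        rel = src.relative_to(root)
        if rel.parts[0] in ("lang", ".git"):
            continue  # never copy existing translations or git internals
        dest = target / rel
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(src, dest)
    target.mkdir(parents=True, exist_ok=True)
    # Hypothetical marker file noting where the English text came from.
    (target / "SOURCE_VERSION.txt").write_text(
        f"Last updated from English at commit {source_commit} "
        f"on {date.today().isoformat()}\n",
        encoding="utf-8",
    )
    return target
```

For example, `scaffold_translation(Path("."), "zh_cn", "<commit hash>")` run from the repository root would create lang/zh_cn/ with untranslated copies ready to edit. The PR with the actual translation would still need a reviewer who knows the language, as stated above.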