HTTPArchive / almanac.httparchive.org

HTTP Archive's annual "State of the Web" report made by the web community
https://almanac.httparchive.org
Apache License 2.0

Translate content to Dutch #1750

Open tunetheweb opened 3 years ago

tunetheweb commented 3 years ago

These are the core templates - without which we cannot release any translated chapters. They are in the language specific templates directory:

2022

2021

2020

2019

These are the chapters to be translated, in rough order of popularity. They exist in the content directory:

2022

2021

2020

2019

Additionally, the following pages need to be translated too, in the language specific templates directory:

There is no need to translate the chapters HTML pages as they are generated off the markdown combined with the above templates.

Please include "Makes progress on #1750" in all pull requests so a link is created from the PR to this issue.


Common notes for writing consistency are here: https://github.com/HTTPArchive/almanac.httparchive.org/wiki/Translators'-Guide. Feel free to edit that and/or add Dutch-specific extras by editing this comment.

Dutch specific extra advice:

strangernr7 commented 3 years ago

So the html files don't need to be translated right? Just the files under 2020, 2019 and additional files?

tunetheweb commented 3 years ago

The core html files DO need to be translated.

The chapter HTMLs not shown above (e.g. performance.html) do not need to be translated as they are generated from the markdown (e.g. performance.md). We used to store these generated chapter HTML files in git but no longer do, so I can probably remove the statement about them as it's probably more confusing than useful. Then again, if you run the site locally (see instructions in src/README.md) it will generate these files, but they can be ignored from a translation point of view.

That make sense, or have I just confused you more than helped? 😀

strangernr7 commented 3 years ago

alright, nah that helped thanks

rviscomi commented 3 years ago

@noah-vdv thanks for your help translating this content! I see the base templates have been checked off. Have those translations been submitted yet?

strangernr7 commented 3 years ago

@noah-vdv thanks for your help translating this content! I see the base templates have been checked off. Have those translations been submitted yet?

No problem. I have checked them off indeed but realized that I might have translated bits that didn't need translating and totally forgot to check if some pieces needed the tags. So I'll double-check those html files and get a PR out on Thursday hopefully.

tunetheweb commented 3 years ago

Feel free to open a draft PR for now and I can give feedback.

To @rviscomi’s point, we normally put your name beside something you’re working on (so others don’t work on it too) but only tick it off after the PR is accepted to show it’s “done”. I’ve updated the first comment to that now.

tunetheweb commented 3 years ago

BTW since you've done the base templates we're ready to send this language live so (at the risk of making this PR even bigger!) can you add Dutch to server/language.py (lines 45-54) and also add nl to supported_languages in config/2019.json and config/2020.json?
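For the config half of that change, a minimal sketch of the edit is below. It assumes each per-year config file is a JSON object with a top-level `supported_languages` list; the helper name `add_supported_language` and the sample input are made up for illustration and are not part of the repo.

```python
import json


def add_supported_language(config_text: str, code: str) -> str:
    """Return the config JSON with `code` in supported_languages exactly once."""
    config = json.loads(config_text)
    # Assumption: supported_languages is a top-level list of language codes.
    languages = config.setdefault("supported_languages", [])
    if code not in languages:
        languages.append(code)
    # ensure_ascii=False keeps any non-ASCII text in the config readable.
    return json.dumps(config, indent=2, ensure_ascii=False)


if __name__ == "__main__":
    # Hypothetical sample content, not the real config file.
    sample = '{"supported_languages": ["en", "ja"]}'
    print(add_supported_language(sample, "nl"))
```

Running the helper twice on the same text leaves the list unchanged the second time, so it is safe to apply to a file that already lists the language.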

strangernr7 commented 3 years ago

Should I translate content such as First Input Delay or keep it as is?

tunetheweb commented 3 years ago

Should I translate content such as First Input Delay or keep it as is?

I would say not, as it's a technical term. Not quite code, but close enough. Maybe add the Dutch translation after the first time it's used.

strangernr7 commented 3 years ago

What has changed in the 2020 base.html file for it to be translated?

tunetheweb commented 3 years ago

Oh. Looks like you already got the foreword. Most languages didn't. Will remove that one!

Did you see there were some changes to Markup chapter?

strangernr7 commented 3 years ago

Oh. Looks like you already got the foreword. Most languages didn't. Will remove that one!

Did you see there were some changes to Markup chapter?

Alright, yeah I'll get to those before I do the seo chapter

strangernr7 commented 3 years ago

I was going through the 2020 seo chapter and realized there's still 3 TODO's for authors at around line 500

strangernr7 commented 3 years ago

And in some other chapters as well so I'll just do those later

tunetheweb commented 3 years ago

I was going through the 2020 seo chapter and realized there's still 3 TODO's for authors at around line 500

Good spot. I've reviewed them and happy to just remove them. I count 4 though not three (though one is not marked as a TODO but a Note):

https://github.com/HTTPArchive/almanac.httparchive.org/blob/997997be970f7b1d546702366891cff941baa7b1/src/content/en/2020/seo.md#L504

Seems pretty obvious to me so don't think it needs further comment. Maybe it was added since that comment was added.

https://github.com/HTTPArchive/almanac.httparchive.org/blob/997997be970f7b1d546702366891cff941baa7b1/src/content/en/2020/seo.md#L517

Yeah would have been nice to have some interpretation rather than just the stats, but not only chapter to do this, so let's leave for now and remove the TODO.

https://github.com/HTTPArchive/almanac.httparchive.org/blob/997997be970f7b1d546702366891cff941baa7b1/src/content/en/2020/seo.md#L531

Rick edited this and I trust his understanding of Lighthouse so to me this is just a comment for Authors during review so TODO can be removed.

https://github.com/HTTPArchive/almanac.httparchive.org/blob/997997be970f7b1d546702366891cff941baa7b1/src/content/en/2020/seo.md#L779

Same as above so TODO can be removed.

Would you mind removing them from English version too as part of this PR?

And in some other chapters as well so I'll just do those later

I'll try to review those similarly and submit a PR for them. I see Compression, JavaScript, Privacy (currently under re-review anyway), Resource-Hints (this TODO can be removed), Security (not edited yet anyway), and SEO (discussed above). Hold off translating them for now but SEO and Resource Hints can be done.

strangernr7 commented 3 years ago

Alright, so I'll just remove the 4 in seo and the one in res hints right?

tunetheweb commented 3 years ago

Yeah if translating any of those chapters. Probably a bit confusing if you remove them but not translating that chapter so if you've moved on to another chapter instead then I'll take care of them in next day or two. Let me know what chapter you plan to do next so I know which to remove.

strangernr7 commented 3 years ago

I've removed them from the english version in another branch so I can just make a PR with that

strangernr7 commented 3 years ago

Should we also translate the README (src/README), CONTRIBUTING and CODE_OF_CONDUCT md files? So if people check out this repo they can read it in other languages as well.

tunetheweb commented 3 years ago

I say no. The project is run in English so you need to understand English (or have a translator to help you) if you want to be involved in the project on GitHub.

However, the output of the project (the website) is translated to make it as widely available as possible.

strangernr7 commented 3 years ago

Should github.com links be added to the list of automatically added hreflang="en" ? I don't think I've encountered repos with translated README.MD (or any other) files yet which would make them only available in English.

tunetheweb commented 3 years ago

Could do. I quite like the fact we're not explicitly listing sites (e.g. Mozilla or Wikipedia) but are doing it based on URLs containing /en/ or /en-US/ or starting with https://en. so it's more generic (see below). Adding an explicit site like github.com does make more of an assumption.


/*
 * Automatically adds language after an anchor if not same language
 *
 * Add for links that are obviously in English
*/
html:not([lang="en"]) main a[href*="/en-US/"]::after,
html:not([lang="en"]) main a[href*="/en/"]::after,
html:not([lang="en"]) main a[href^="https://en."]::after {
  content: '(en)';
  vertical-align: super;
  font-size: 0.6em;
}

/*
 * Add links if an explicit `hreflang` attribute exists
 */
main a[hreflang]::after {
  content: '(' attr(hreflang) ')';
  vertical-align: super;
  font-size: 0.6em;
}

/*
 * Remove it for English in English pages
 * (allows us to add this to base content to make it easier for translators)
 */
html[lang="en"] main a[hreflang="en"]::after {
  content: '';
}

It's the same for https://web.dev resources which are English (though they have launched Polish versions).
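To see which links the URL-based rule picks up, its three conditions can be mirrored in a few lines of Python. This is only a sketch for checking URLs by hand; the function name `looks_english` and the sample URLs are illustrative and not part of the site's code.

```python
def looks_english(href: str) -> bool:
    """Mirror the three URL checks used by the first CSS rule above."""
    return (
        "/en-US/" in href                  # a[href*="/en-US/"]
        or "/en/" in href                  # a[href*="/en/"]
        or href.startswith("https://en.")  # a[href^="https://en."]
    )


if __name__ == "__main__":
    for url in (
        "https://developer.mozilla.org/en-US/docs/Web",
        "https://en.wikipedia.org/wiki/HTTP",
        "https://github.com/HTTPArchive/almanac.httparchive.org",
        "https://web.dev/vitals/",
    ):
        print(url, looks_english(url))
```

The github.com and web.dev URLs are not caught by any of the three checks, which is why they would need either an explicit `hreflang` attribute or a site-specific rule.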

I think the better way is to make a bulk change to these at source in the English markdown files. This looks quite easy with Visual Studio Code regex replace:

Search: \[([^\]]*)\]\((https:\/\/web.dev[^\)]*)\)
Replace: <a hreflang="en" href="$2">$1</a>
Files to include: content

And:

Search: \[([^\]]*)\]\((https:\/\/github.com[^\)]*)\)
Replace: <a hreflang="en" href="$2">$1</a>
Files to include: content
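The same replacement can be sketched in Python for anyone who wants to try it on a sample string before running it across the content directory. The function name `add_hreflang` is hypothetical, and unlike the editor search above this version escapes the dots in the host name; it is an illustration, not the script used for the actual change.

```python
import re


def add_hreflang(markdown: str, host: str, lang: str = "en") -> str:
    """Turn markdown links to `host` into anchors carrying an hreflang attribute."""
    # Group 1 is the link text, group 2 is the URL starting with https://<host>.
    pattern = re.compile(
        r"\[([^\]]*)\]\((https://" + re.escape(host) + r"[^)]*)\)"
    )
    replacement = rf'<a hreflang="{lang}" href="\g<2>">\g<1></a>'
    return pattern.sub(replacement, markdown)


if __name__ == "__main__":
    text = "See [web.dev](https://web.dev/vitals/) and [local](/en/2020/seo)."
    print(add_hreflang(text, "web.dev"))
```

Links to other hosts, and relative links, are left as plain markdown, so the function can be applied once per host.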

What do you think?

strangernr7 commented 3 years ago

What do you think?

I have no idea. I'm probably not the right person to comment on that seeing as I don't understand it completely 😁.

tunetheweb commented 3 years ago

See #2047 - have a look at the Dutch ones and see if you agree.

strangernr7 commented 3 years ago

https://github.com/HTTPArchive/almanac.httparchive.org/blob/f3975a2f87d1b1ad32a4f2d4cb47f557a9a9ceaf/src/content/en/2020/mobile-web.md#L156 Is mSpeed a typo here or is it supposed to be like that?

tunetheweb commented 3 years ago

Good question!

@spanicker I'm guessing this means "mobile speed" and do see the referenced report uses "mCommerce" in a similar way, so not sure if this is now a thing that passed me by, but maybe better to just spell it out as "mobile speed" to avoid any confusion since this is the only instance in the chapter and that term isn't used in the referenced report?

strangernr7 commented 3 years ago

Just a heads up: the 2020 ecommerce chapter has #jrharalson_bio: TODO above the Intro. And I'm pretty sure featured_stat_label_1: Mobile sizes identified as ecommerce sites should say Mobile sites.

tunetheweb commented 3 years ago

Yeah we never got a bio from @jrharalson - Jason if you see this and wanna provide one then please do. It appears at the bottom of the chapter.

Agree on the featured stat correction.

strangernr7 commented 3 years ago

https://github.com/HTTPArchive/almanac.httparchive.org/blob/26509af2b51584e6fd597c2bd110670fa52b01fe/src/content/en/2020/caching.md#L121 Should Caching entity stay as is or be translated? As well as anywhere else in the chapter

tunetheweb commented 3 years ago

I would translate it. It's not a technical term as such. Same for Eviction and Revalidation further down that list. On the other hand, Time to Live (TTL) is a well-known technical term so I wouldn't translate that.

strangernr7 commented 3 years ago

https://github.com/HTTPArchive/almanac.httparchive.org/blob/26509af2b51584e6fd597c2bd110670fa52b01fe/src/content/en/2020/caching.md#L200 Also, I noticed Expiries instead of Expires, that's a typo right?

tunetheweb commented 3 years ago

Yup! Good spot. If you could fix as part of your translation pull request that would be much appreciated!

rviscomi commented 3 years ago

Wow only 3 chapters to go! Thank you @noah-vdv for your amazing effort!!

tunetheweb commented 3 years ago

Hey @noah-vdv any chance we could persuade you back to finish off the final two chapters for 2020? Once we have those we can publish the ebook in Dutch! And we’ll even print a copy as a small token of our appreciation for your efforts!

strangernr7 commented 3 years ago

Hey @noah-vdv any chance we could persuade you back to finish off the final two chapters for 2020?

Hey, wow time has flown! Didn't realise how long ago I had last worked on this. Anyways, yeah of course! However, I don't have as much spare time as before but I'll try my best to get it done, hopefully, by November.

Once we have those we can publish the ebook in Dutch! And we’ll even print a copy as a small token of our appreciation for your efforts!

Awesome!! Looking forward to finishing it off! 👍

tunetheweb commented 2 years ago

We added a small note to our Accessibility Statement about colour contrast issues:

https://github.com/HTTPArchive/almanac.httparchive.org/blob/14f85122c88d14ee1f8dd0474479424338766625/src/templates/nl/accessibility_statement.html#L69-L71

Would be great if someone could translate this (@noah-vdv ?). Very small and easy!

strangernr7 commented 2 years ago

"In sommige hoofdstukken nemen we ook andere inhoud op de site op, die niet door onszelf is gemaakt, waaronder YouTube-video's, pdf-links of links naar andere artikelen, die niet voldoen aan de strikte criteria die vereist zijn voor WCAG 2.1 level AAA. Zie onderstaande opmerking over externe inhoud."

Hey, wow time has flown! Didn't realise how long ago I had last worked on this. Anyways, yeah of course! However, I don't have as much spare time as before but I'll try my best to get it done, hopefully, by November.

Also, I hate to disappoint, but this is taking longer because I need to spend more time on my studies than expected. Unfortunately, I can't give you an ETA yet.

tunetheweb commented 2 years ago

No probs. Thanks for that translation.

tunetheweb commented 2 years ago

Doh! Looks like I gave you the wrong bit of text to translate above :-( That was the next paragraph that was already translated. This was the paragraph I actually wanted translated:

We recognize that some of the color choices for our visualizations do not meet WCAG color contrast requirements. We make a conscious effort to use the more contrasting colors and labels to reduce the impact of this. We hope the detailed descriptions, as well as access to the underlying data itself can help with this issue. We aim to improve the accessibility of our visualization color schemes in future years.

VictorLeP commented 2 years ago

We erkennen dat sommige keuzes voor kleuren in onze visualisaties niet voldoen aan de kleurcontrastvereisten van de WCAG. We doen een bewuste inspanning om de meer contrasterende kleuren en opschriften te gebruiken om de impact hiervan te verminderen. We hopen dat de gedetailleerde beschrijvingen alsook toegang tot de onderliggende gegevens zelf kunnen helpen met dit probleem. We streven ernaar de toegankelijkheid van de kleurschema's in onze visualisaties de komende jaren te verbeteren.

tunetheweb commented 2 years ago

Thanks @VictorLeP! Much appreciated.

VictorLeP commented 2 years ago

Will also take a stab at translating this year's Privacy chapter (since @noah-vdv is busy).