facebook / docusaurus

Easy to maintain open source documentation websites.
https://docusaurus.io
MIT License

RFC: Docusaurus v2 i18n #3317

Closed slorber closed 3 years ago

slorber commented 4 years ago

Docusaurus v2 i18n


Here is a brain dump of many things to consider for i18n support in v2.

I'll keep this issue updated over time, but feel free to comment if you have anything to say, particularly if you used v1 i18n support and can provide valuable feedback.

Supersedes this older issue (which still has interesting content): https://github.com/facebook/docusaurus/issues/2651


Existing translation systems

Links to get inspiration from.

Git fork based translations

Have an upstream repo (often in English), and one fork per language

A translation strategy first seen on Vue translation: each language creates a git fork.

We can build tooling on top of that, so that a translation change made in the upstream repo can trigger new PRs on forked repos, to automate the process and ensure translations stay in sync.

Pros:

Cons:

ReactJS case

Links related to the work of Nat Alison.

Contains some interesting notes on why a SaaS like Crowdin was not a good fit, despite an attempt to use it.

https://reactjs.org/blog/2019/02/23/is-react-translated-yet.html https://github.com/reactjs/reactjs.org/issues/1605 https://github.com/reactjs/reactjs.org-translation https://github.com/reactjs/reactjs.org-translation/blob/master/PROGRESS.template.md https://github.com/facebook/react/issues/8063 https://github.com/reactjs/reactjs.org/issues/82 https://github.com/reactjs/reactjs.org/pull/873

GatsbyJS case

Another translation RFC from Nat Alison, quite close to her work on ReactJS: https://github.com/gatsbyjs/rfcs/blob/master/text/0010-gatsby-docs-localization.md

I don't think this work is in production.

Also some interesting bits on this thread where she explains her unfortunate situation working at Gatsby.

Git, single repo

You have a repo and you just have a folder per language.

Pros:

Cons:

Nuxt case

The Nuxt docs live in a simple repo with language folders. It works fine, but the author told me it was hard to keep all languages in sync. It looks like a manual process.

TypeScript case

Quite similar: the TS website has one languages folder per package, <packageName>/copy/<lang>, and the translations are handled in the same GitHub monorepo, but split by package.

https://github.com/microsoft/TypeScript-Website/issues/100 https://github.com/microsoft/TypeScript-Website/pull/181

Note: Orta found a way to solve the per-language permission problem, as he created a bot so that code owners can self merge through a github PR comment despite not having git permissions:

https://github.com/microsoft/TypeScript-Website/issues/130#issuecomment-675557535 https://github.com/orta/code-owner-self-merge

Additional notes

I think it's possible to handle the "sync with upstream" problem inside a mono repo by using git patch.

https://stackoverflow.com/questions/9939952/create-a-patch-including-specific-files-in-git

It's a way to emulate the upstream repo -> language forks pattern

SaaS

Using a SaaS like Crowdin / Transifex or others has benefits, like the ability to have advanced translation features (UI, editors supporting various formats (PO, Markdown, ICU key/values), translation memory, automatically pay for platform translators, track translation progress, sync with upstream language, version management...)

Pros:

Cons:

Crowdin

Solution suggested by Docusaurus 1, free plan for open-source, used by Docusaurus site v1, Jest, Yarn, Electron...

We should rather try to make it easy to migrate from v1.

Not everybody likes this solution, however.

Some drawbacks mentioned here: https://github.com/gatsbyjs/rfcs/blob/master/text/0010-gatsby-docs-localization.md#saas-platform-crowdin

Note: some questions I asked Crowdin are here: https://gist.github.com/slorber/30643299196c7efa77084eec10c1c609

Other SaaS

???


Docusaurus 2 translation system

Unlike the use-cases presented above, we are a framework, not a site, and we don't serve a single community.

I think we want to be able to support both the developers and non-developers.

We can't expect all Docusaurus translators to be developers, nor git users, yet we know that developers don't necessarily always like the lock-in to a SaaS like Crowdin.

Translation management

I think the translation system should be file-system based, as that is probably the common abstraction between git-based workflows and SaaS-based workflows.

Basically, if you build your site for the fr language, and if you have i18n/fr/docs/myDoc.md, then it should be used for the french page instead of the file at docs/myDoc.md.
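As a rough illustration of that lookup rule, here is a minimal TypeScript sketch. The function name `resolveContentPath`, its parameters, and the use of a `Set` in place of the real file system are all assumptions made for the example, not an actual Docusaurus API.

```typescript
// Hypothetical helper (not a Docusaurus API): pick the file to render for a locale.
// `existingFiles` stands in for the file system so the sketch stays self-contained.
function resolveContentPath(
  relativePath: string, // e.g. "docs/myDoc.md"
  locale: string, // locale being built, e.g. "fr"
  defaultLocale: string, // locale of the untranslated sources
  existingFiles: Set<string>,
  i18nDir: string = "i18n",
): string {
  if (locale !== defaultLocale) {
    const localized = `${i18nDir}/${locale}/${relativePath}`;
    if (existingFiles.has(localized)) {
      // A translated copy exists: it takes precedence over the source file.
      return localized;
    }
  }
  // No translation found (or default locale): use the untranslated source file.
  return relativePath;
}

const files = new Set<string>([
  "docs/myDoc.md",
  "docs/other.md",
  "i18n/fr/docs/myDoc.md",
]);

const translated = resolveContentPath("docs/myDoc.md", "fr", "en", files);
const untranslated = resolveContentPath("docs/other.md", "fr", "en", files);
console.log(translated, untranslated);
```

Keeping the lookup a pure path computation means a git workflow and a SaaS sync script only need to agree on where files land on disk.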

I think ./i18n is a good default path to put the translated content, but the paths of such system should be flexible enough so that you can adopt the workflow of your choice, but I thin

So, the first step is to support the first case where you just put the translations in a folder of your site. I'm going to experiment with this on Docusaurus 2 website and try to see if I can provide a french translation.

It's unlikely we'll be able to provide integrations with all the existing translation SaaS, but a 2nd step would be to write integration scripts with Crowdin, so that v1 users can keep using it.

Translation runtime lib

It's likely we'll try to use FBT, a translation tool from Facebook.

I personally have good experience with react-intl as well, and prefer it over many React alternatives.

Translated URLs

Supposing en is the "main" language.

Does https://myDomain.com/en/myDoc exist?

What should be the behavior of the site if the URL does not contain a language, like https://myDomain.com/myDoc ? Is it the English language? Or do we add code to redirect to the most suitable language?
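If we went the redirect route, the "most suitable language" choice could look roughly like the sketch below. `pickLocale` is a hypothetical name; the preference list is assumed to come from something like the browser's `navigator.languages`, and the matching rule (exact tag first, then bare language) is just one possible policy.

```typescript
// Hypothetical helper: choose a site locale from the visitor's ordered preferences.
function pickLocale(
  preferred: readonly string[], // e.g. ["fr-CA", "en"], most preferred first
  supported: readonly string[], // locales the site was built for
  defaultLocale: string,
): string {
  for (const tag of preferred) {
    const lower = tag.toLowerCase();
    // 1. Exact match, ignoring case (e.g. "pt-br" matches "pt-BR").
    const exact = supported.find((s) => s.toLowerCase() === lower);
    if (exact !== undefined) return exact;
    // 2. Otherwise match on the bare language subtag (e.g. "fr-CA" matches "fr").
    const base = lower.split("-")[0];
    const partial = supported.find((s) => s.toLowerCase() === base);
    if (partial !== undefined) return partial;
  }
  // Nothing matched: stay on the default locale.
  return defaultLocale;
}

console.log(pickLocale(["fr-CA", "en"], ["en", "fr"], "en"));
```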

Is it ok for SEO to have a homepage that just redirects? Or is the homepage english? Then which page is the canonical one?

Note: v1 redirects docs, but not the homepage: https://docusaurus.io/ & https://docusaurus.io/docs/installation

Interesting comment (point 5): https://github.com/facebook/docusaurus/issues/2651#issuecomment-660792635

Let's not forget to add the proper page meta tags such as:

<html lang="en">
<link rel="alternate" href="https://myDomain.com/fr/myDoc" hrefLang="fr-FR"/>

See also https://github.com/facebook/docusaurus/issues/2471

(I think if we have this header in pages, it's not needed to add it in sitemaps)
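A minimal sketch of generating those alternate links for one page is shown below. It assumes every locale (including the main one) is served under a `/<locale>/` path prefix and that slugs are not localized; `alternateLinks` is a hypothetical helper, not part of Docusaurus.

```typescript
// Hypothetical helper: build <link rel="alternate"> tags for each locale of a page.
// Assumes URLs of the form <siteUrl>/<locale>/<pagePath> and identical slugs everywhere.
function alternateLinks(
  siteUrl: string,
  pagePath: string,
  locales: readonly string[],
): string[] {
  const base = siteUrl.replace(/\/+$/, ""); // drop trailing slashes
  const path = pagePath.replace(/^\/+/, ""); // drop leading slashes
  return locales.map(
    (locale) =>
      `<link rel="alternate" href="${base}/${locale}/${path}" hreflang="${locale}" />`,
  );
}

const links = alternateLinks("https://myDomain.com/", "/myDoc", ["en", "fr"]);
console.log(links.join("\n"));
```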

Translated URL schemes

There are multiple ways to handle the URLs of translated pages

https://fr.myDomain.com/myDoc

Using a custom subdomain does not seem like a very good fit, as it would require one separate deployment per language (or you'd need some custom reverse proxy logic to handle that?).

I don't think this is the workflow we'll encourage, but we could still support this if people really want it. Maybe with an option like docusaurus build --fr, so that it builds a single language site.

Note: this can't be done on simple hosting solutions like Github Pages

https://myDomain.com/fr/myDoc

I think having a path language prefix is a simpler option, and can be easily done with a single deployment.

There's still a choice to be made here:

Both solutions have cons:

For now I think 2 is a better solution

Details and problems to consider

1 SPA per language, dev experience?

As we have seen above, it may be a good idea for performance to split the site into multiple smaller SPAs.

But building the SPAs independently raises a question: what would the dev experience be when you run docusaurus start?

Do we code something completely different in dev so that the routes of all languages are accessible as a single SPA? Or do we instead provide a docusaurus start --lang fr to only run the "french SPA"? I think it's an acceptable tradeoff and has some advantages, but it can also be annoying for some users.

Anchor links

Auto-generated ids are a problem for anchor links. When a translator changes a heading in a translated Markdown file, the id changes too, and links from other files have to change as well. We should provide an easy way to keep anchors stable across translations.

https://github.com/reactjs/reactjs.org/issues/1605#issuecomment-458816106 https://github.com/reactjs/reactjs.org/issues/1605#issuecomment-458819231 https://github.com/ethereum/ethereum-org-website/issues/272 https://github.com/reactjs/reactjs.org/pull/1636/files https://github.com/mdx-js/mdx/issues/810
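One way to keep anchors stable is to let authors pin an explicit id on the heading, so translating the text no longer changes the id. The sketch below assumes a `{#custom-id}` suffix syntax and a deliberately simplistic slugifier; both are illustrative assumptions, and `headingAnchor` is a hypothetical helper.

```typescript
// Hypothetical helper: derive the anchor id of a heading.
// An explicit "{#some-id}" suffix wins over the auto-generated slug.
function headingAnchor(headingText: string): { text: string; id: string } {
  const explicit = headingText.match(/^(.*?)\s*\{#([A-Za-z0-9_-]+)\}\s*$/);
  if (explicit !== null) {
    return { text: explicit[1], id: explicit[2] };
  }
  // Simplistic slug: lowercase, whitespace and common punctuation become hyphens.
  const text = headingText.trim();
  const id = text
    .toLowerCase()
    .replace(/[\s.,;:!?'"()]+/g, "-")
    .replace(/^-+|-+$/g, "");
  return { text, id };
}

// The English source and the French translation share the same pinned id.
const enHeading = headingAnchor("Getting started {#getting-started}");
const frHeading = headingAnchor("Démarrage rapide {#getting-started}");
// Without a pinned id, the translated heading gets a different anchor.
const frAuto = headingAnchor("Démarrage rapide");
console.log(enHeading.id, frHeading.id, frAuto.id);
```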

Right-to-Left support

Support RTL in themes?

Plugin integration

TODO

Doc edit button

If a user is browsing a French doc and presses "edit", they should land on the correct URL (git or Crowdin), so we should make this configurable.
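"Configurable" could mean letting the site provide a function that computes the edit URL from the locale and doc path. The sketch below is only an illustration: `editUrlFor`, its parameter shape, and both URLs (the GitHub repo and the Crowdin project) are made-up placeholders.

```typescript
// Hypothetical shape of the information passed to a site-provided edit URL function.
type EditUrlParams = { locale: string; docPath: string };

// Hypothetical site-side implementation: default-locale docs are edited on GitHub,
// translated docs are edited on the translation platform. Both URLs are placeholders.
function editUrlFor(params: EditUrlParams, defaultLocale: string): string {
  if (params.locale === defaultLocale) {
    return `https://github.com/my-org/my-site/edit/main/docs/${params.docPath}`;
  }
  return `https://crowdin.com/project/my-site/${params.locale}`;
}

console.log(editUrlFor({ locale: "fr", docPath: "myDoc.md" }, "en"));
```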

Related: https://github.com/facebook/docusaurus/issues/648

Default language

Unlike v1, we should not assume English is the default language.

https://github.com/facebook/docusaurus/issues/3317

Scalability

The build time mostly depends on 3 factors:

To decrease build time and make it sustainable, you can remove older versions from the SPA part, and make them available as a standalone, single version deployment.

We'll work on a cli feature to "archive" older versions more easily: https://github.com/facebook/docusaurus/issues/3286

Fallback

A missing page/translation should be allowed; in that case we'd fall back to the default language and could show a warning.

See 6: https://github.com/facebook/docusaurus/issues/2651#issuecomment-660792635

Creating a language

We need a cli to init a language folder based on current language/versions

Creating a version

See proposal here: https://github.com/facebook/docusaurus/issues/2651#issuecomment-660792635

We'll have to snapshot each localized folder too

Asset colocation

It's possible to colocate assets next to the docs, which makes it possible to use a different image per version. What's the story for i18n? A colocated image would likely end up being copied into the language folders too, so it might be duplicated along multiple axes (version/lang). Is that a good thing? At the same time, if an image contains text, that text could be translated differently, so it still makes sense...

Slugs

Should we allow custom slugs per language?

If we do that, to be able to switch from one language to another without losing context (the doc you are currently reading), one version would have to be aware of the slugs of all the other language versions, which might be quite a lot of data. How do we access such data in a performant way?

To me, being able to switch language while preserving context does not look so critical. If users want to browse docs in French, they can go through the French home and browse from there, and it's likely Google gives them the docs in the correct language in the first place.

We should try to find a solution though, but this can probably be done later, with some code that would, on language switch request, read some json file emitted by the other language, and then obtain a mapping from document id to slug of the other language.
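A sketch of that lookup, assuming each locale build emits a JSON object mapping document ids to slugs: the `SlugMap` type, the `switchLocaleUrl` function, and the "fall back to the locale home" policy are all hypothetical.

```typescript
// Hypothetical content of the JSON file emitted by each locale build:
// document id -> slug in that locale.
type SlugMap = Record<string, string>;

// Hypothetical helper: compute the URL to navigate to when switching language.
function switchLocaleUrl(
  currentSlug: string,
  currentMap: SlugMap,
  targetLocale: string,
  targetMap: SlugMap, // in practice fetched lazily, only when the user switches
): string {
  // Reverse lookup: which document are we currently reading?
  const docId = Object.keys(currentMap).find((id) => currentMap[id] === currentSlug);
  const targetSlug = docId !== undefined ? targetMap[docId] : undefined;
  // Unknown doc or missing translation: land on the target locale's home page.
  return targetSlug !== undefined ? `/${targetLocale}${targetSlug}` : `/${targetLocale}/`;
}

const enSlugs: SlugMap = { intro: "/docs/hello" };
const frSlugs: SlugMap = { intro: "/docs/bonjour" };
console.log(switchLocaleUrl("/docs/hello", enSlugs, "fr", frSlugs));
```

Because the other locale's map is only read on a switch request, no locale has to bundle the slugs of every other locale up front.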

Note: Yarn 1/classic (Jekyll based?) can switch language and preserve context when doing so, but the slugs are not localized: https://classic.yarnpkg.com/es-ES/docs/usage

Translation mode

If you add the ?translate=true querystring, it could enhance the UI so that we add in-place translation features. It could be possible to integrate with the translation API of a SaaS like crowdin. This is mostly for key/value translations, as markdown docs will be translated as a whole and there's already the editUrl on the docs plugin.


TODO ...

Ongoing PR: https://github.com/facebook/docusaurus/pull/3325

slorber commented 3 years ago

Worth studying the NextJS i18n routing RFC: https://github.com/vercel/next.js/discussions/17078

clairefro commented 3 years ago

Hi there, eavesdropping as I've also been grappling with translation approaches for another project.

Regarding translation management in a single repo scenario, are you aware of GitLocalize? A system that incorporates or mimics this could make for happy devs.

https://gitlocalize.com/

slorber commented 3 years ago

thanks @clairefro , didn't know about this one, will take a look :)

slorber commented 3 years ago

Here is some news about i18n support.

You'll find the i18n RFC here: https://github.com/facebook/docusaurus/issues/3317

The i18n core PR has already been merged but it is not officially released yet. https://github.com/facebook/docusaurus/pull/3325

However, you can test it using the @canary npm dist tag (yarn add @docusaurus/core@canary etc.) and following the instructions in that PR.

We are in the dogfooding phase to see if the i18n API and system work fine, and if we need some breaking changes.

We dogfood this on 2 sites:

When tests are ok for these 2 sites, we'll release i18n with proper documentation, hopefully before the end of the year.

The Jest v2 + i18n migration is in progress and can be tracked here: https://github.com/jest-website-migration/jest/issues/2

tomchen commented 3 years ago

Hello, good job, I'm currently using it. And here are my suggestions:

  1. I can add className to any other navbar items but {type: 'localeDropdown'}. Please support className so user can more easily write CSS for it.

  2. {type: 'localeDropdown'} should have an icon by default, just like v1 had (screenshot: 2021-01-05_142752). You can imagine a visitor, who does not speak a language at all, lands on a page in that language, and can't find the lang dropdown menu because the title of the lang menu is also in that language...

btw, in case anyone wants, currently I use something like this in Sass to insert the aforementioned lang icon:

```scss
// language bar
div.navbar__items.navbar__items--right > div:nth-child(1) > a {
  padding-left: 40px;
  &, &:hover {
    background: url("data:image/svg+xml,%3Csvg viewBox='0 0 24 24' xmlns='http://www.w3.org/2000/svg'%3E%3Cpath d='m0.35579 8.7757v-7.824h1.764v6.336h3.096v1.488z'/%3E%3Cpath d='m7.3323 4.8159-0.192 0.72h1.668l-0.18-0.72q-0.168-0.58802-0.324-1.248-0.156-0.65998-0.312-1.272h-0.048q-0.144 0.624-0.3 1.284-0.144 0.648-0.312 1.236zm-2.832 3.9602 2.448-7.824h2.124l2.448 7.824h-1.872l-0.48-1.86h-2.388l-0.48 1.86z'/%3E%3Cpath d='m11.364 8.7757v-7.824h1.812l2.04 3.888 0.768 1.728h0.048q-0.06-0.624-0.144-1.392-0.072-0.768-0.072-1.464v-2.76h1.68v7.824h-1.812l-2.04-3.9001-0.768-1.704h-0.048q0.06 0.648 0.132 1.392 0.084 0.744 0.084 1.44v2.772z'/%3E%3Cpath d='m21.139 8.9197q-0.80398 0-1.512-0.252-0.696-0.264-1.212-0.768t-0.816-1.248q-0.288-0.75598-0.288-1.74 0-0.97202 0.3-1.728 0.3-0.768 0.816-1.296 0.528-0.528 1.224-0.80398 0.696-0.27598 1.476-0.276 0.85198 0 1.464 0.312 0.61202 0.312 0.99598 0.70798l-0.92402 1.128q-0.3-0.264-0.63602-0.44402-0.33602-0.18002-0.84-0.18-0.456 0-0.84 0.18-0.372 0.168-0.648 0.49202t-0.432 0.792q-0.144 0.46798-0.144 1.056 0 1.212 0.54002 1.884 0.552 0.65998 1.656 0.65998 0.24 0 0.46798-0.06 0.22798-0.06 0.372-0.18v-1.344h-1.296v-1.44h2.856v3.6q-0.408 0.39602-1.08 0.672t-1.5 0.276z'/%3E%3Cpath d='m11.348 12.09h-2.82l1.095-0.615c-0.24-0.54002-0.705-1.335-1.11-1.95l-1.095 0.54002c0.40501 0.615 0.855 1.47 1.08 2.025h-2.715v1.11h5.565zm-0.52502 1.995h-4.4251v1.095h4.4251zm-4.4251 3.06h4.4251v-1.065h-4.4251zm3.315 2.07v2.145h-2.13v-2.145zm1.215-1.125h-4.5449v5.0699h1.2v-0.67498h3.345zm6.3602 1.02v2.43h-4.035v-2.43zm-5.325 4.2452h1.29v-0.6h4.035v0.54002h1.35v-5.4002h-6.675zm4.9051-9.5252v1.995h-2.46c0.12-0.585 0.255-1.275 0.39001-1.995zm1.29 1.995v-3.135h-3.135c0.06-0.44999 0.15-0.91499 0.225-1.335h3.9148v-1.17h-7.6199v1.17h2.325c-0.06 0.41998-0.135 0.88502-0.225 1.335h-1.74v1.14h1.545c-0.135 0.72-0.27 1.41-0.40501 1.995h-1.89v1.155h8.3399v-1.155z'/%3E%3C/svg%3E") no-repeat;
  }
  &:hover {
    color: var(--ifm-navbar-link-color);
    opacity: 0.6;
  }
  html[data-theme='dark'] & {
    background: url("data:image/svg+xml,%3Csvg viewBox='0 0 24 24' xmlns='http://www.w3.org/2000/svg'%3E%3Cg fill='%23fff'%3E%3Cpath d='m0.35579 8.7757v-7.824h1.764v6.336h3.096v1.488z'/%3E%3Cpath d='m7.3323 4.8159-0.192 0.72h1.668l-0.18-0.72q-0.168-0.58802-0.324-1.248-0.156-0.65998-0.312-1.272h-0.048q-0.144 0.624-0.3 1.284-0.144 0.648-0.312 1.236zm-2.832 3.9602 2.448-7.824h2.124l2.448 7.824h-1.872l-0.48-1.86h-2.388l-0.48 1.86z'/%3E%3Cpath d='m11.364 8.7757v-7.824h1.812l2.04 3.888 0.768 1.728h0.048q-0.06-0.624-0.144-1.392-0.072-0.768-0.072-1.464v-2.76h1.68v7.824h-1.812l-2.04-3.9001-0.768-1.704h-0.048q0.06 0.648 0.132 1.392 0.084 0.744 0.084 1.44v2.772z'/%3E%3Cpath d='m21.139 8.9197q-0.80398 0-1.512-0.252-0.696-0.264-1.212-0.768t-0.816-1.248q-0.288-0.75598-0.288-1.74 0-0.97202 0.3-1.728 0.3-0.768 0.816-1.296 0.528-0.528 1.224-0.80398 0.696-0.27598 1.476-0.276 0.85198 0 1.464 0.312 0.61202 0.312 0.99598 0.70798l-0.92402 1.128q-0.3-0.264-0.63602-0.44402-0.33602-0.18002-0.84-0.18-0.456 0-0.84 0.18-0.372 0.168-0.648 0.49202t-0.432 0.792q-0.144 0.46798-0.144 1.056 0 1.212 0.54002 1.884 0.552 0.65998 1.656 0.65998 0.24 0 0.46798-0.06 0.22798-0.06 0.372-0.18v-1.344h-1.296v-1.44h2.856v3.6q-0.408 0.39602-1.08 0.672t-1.5 0.276z'/%3E%3Cpath d='m11.348 12.09h-2.82l1.095-0.615c-0.24-0.54002-0.705-1.335-1.11-1.95l-1.095 0.54002c0.40501 0.615 0.855 1.47 1.08 2.025h-2.715v1.11h5.565zm-0.52502 1.995h-4.4251v1.095h4.4251zm-4.4251 3.06h4.4251v-1.065h-4.4251zm3.315 2.07v2.145h-2.13v-2.145zm1.215-1.125h-4.5449v5.0699h1.2v-0.67498h3.345zm6.3602 1.02v2.43h-4.035v-2.43zm-5.325 4.2452h1.29v-0.6h4.035v0.54002h1.35v-5.4002h-6.675zm4.9051-9.5252v1.995h-2.46c0.12-0.585 0.255-1.275 0.39001-1.995zm1.29 1.995v-3.135h-3.135c0.06-0.44999 0.15-0.91499 0.225-1.335h3.9148v-1.17h-7.6199v1.17h2.325c-0.06 0.41998-0.135 0.88502-0.225 1.335h-1.74v1.14h1.545c-0.135 0.72-0.27 1.41-0.40501 1.995h-1.89v1.155h8.3399v-1.155z'/%3E%3C/g%3E%3C/svg%3E") no-repeat;
  }
}
```

  3. Use this in <head> for better SEO:
<link rel="alternate" href="https://docusaurus.io/en/" hreflang="en" />
<link rel="alternate" href="https://docusaurus.io/fr/" hreflang="fr" />
  4. For lang global attribute (<html lang="">), you currently trim or simplify zh-Hant / zh-Hans to zh, or pt-BR to pt, etc. You should not. They are valid lang tags per IANA assignment (this one is the real official list according to W3C).
    Changing zh-Hant / zh-Hans to zh actually causes font rendering problems (depending on your OS and your OS configurations, <html lang="zh-Hant"> pages could use fonts designed for zh-Hant, but <html lang="zh"> pages could default to zh-Hans fonts)

  5. You said in #3325 the command is docusaurus write-translations --locales all, but:

First, it's actually --locale, without 's'. It's not only the text in PR #3325 that's wrong, I saw warnings in terminal like Available locales=, so you might need to check everything in the code, replacing locales by locale.

Second, I run docusaurus write-translations --locale all, it shows:

Error: Can't write-translation for locale that is not in the locale configuration file.
Unknown locale=[all].
Available locales=[en,fr,zh]

Well, I don't quite understand. I have the locale list in my docusaurus.config.js.

docusaurus.config.js:

```js
i18n: {
  defaultLocale: 'en',
  locales: [
    'en',
    ...
  ],
  localeConfigs: {
    en: {
      label: 'English',
    },
    ...
```

I had to run docusaurus write-translations --locale=en and manually copy the output into the other lang folders. And it works fine.


This one is just a question, not really an issue or a suggestion: how to check / get the language of the current page, or, how to modify (e.g. insert tags into the <head>) all pages from a specific language?

slorber commented 3 years ago

Thanks for the feedback @tomchen

  1. I can add className to any other navbar items but {type: 'localeDropdown'}. Please support className so user can more easily write CSS for it.

Agree

  2. {type: 'localeDropdown'} should have an icon by default, just like v1 had (screenshot: 2021-01-05_142752). You can imagine a visitor, who does not speak a language at all, lands on a page in that language, and can't find the lang dropdown menu because the title of the lang menu is also in that language...

Agree

  3. Use this in <head> for better SEO:
<link rel="alternate" href="https://docusaurus.io/en/" hreflang="en" />
<link rel="alternate" href="https://docusaurus.io/fr/" hreflang="fr" />

About these meta tags, I didn't think they were 100% required for the initial i18n release (as v1 does not have them).

I'm still trying to figure some things out. What I understand is that we can't simply put the "root" of the localized site here but need to link to the exact same page in the localized site. And then we also want at the same time to localize the slugs (probably for SEO reasons too), but this complicates things, as now /hello needs to know about the French URL /bonjour.

My idea was to create a /hello page (the original slug) on the localized sites that redirects to /bonjour with JS, but I'm not sure how happy Google will be with that client-side redirect.

  4. For lang global attribute (<html lang="">), you currently trim or simplify zh-Hant / zh-Hans to zh, or pt-BR to pt, etc. You should not. They are valid lang tags per IANA assignment (this one is the real official list according to W3C). Changing zh-Hant / zh-Hans to zh actually causes font rendering problems (depending on your OS and your OS configurations, <html lang="zh-Hant"> pages could use fonts designed for zh-Hant, but <html lang="zh"> pages could default to zh-Hans fonts)

Thanks, didn't know, will see what I can do.
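For reference, a minimal sketch of producing the lang attribute value without dropping script or region subtags. `toHtmlLang` is a hypothetical helper; it only normalizes separators and casing, keeps every subtag, and does not validate against the IANA registry.

```typescript
// Hypothetical helper: turn a locale name into an <html lang> value
// while preserving script (zh-Hant) and region (pt-BR) subtags.
function toHtmlLang(locale: string): string {
  return locale
    .split(/[-_]/)
    .map((part, index) => {
      if (index === 0) return part.toLowerCase(); // language: "zh"
      if (part.length === 4) {
        // script: "Hant", "Hans"
        return part.charAt(0).toUpperCase() + part.slice(1).toLowerCase();
      }
      if (part.length === 2) return part.toUpperCase(); // region: "BR"
      return part; // anything else is passed through untouched
    })
    .join("-");
}

console.log(toHtmlLang("zh-hant"), toHtmlLang("pt_BR"));
```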

  5. You said in #3325 the command is docusaurus write-translations --locales all, but:

First, it's actually --locale, without 's'. It's not only the text in PR #3325 that's wrong, I saw warnings in terminal like Available locales=, so you might need to check everything in the code, replacing locales by locale.

Second, I run docusaurus write-translations --locale all, it shows:

Error: Can't write-translation for locale that is not in the locale configuration file.
Unknown locale=[all].
Available locales=[en,fr,zh]

Well, I don't quite understand. I have the locale list in my docusaurus.config.js.

I had to run docusaurus write-translations --locale=en and manually copy the output into the other lang folders. And it works fine.

The PR might have been changed a bit and the PR doc is not up to date. The up to date doc is the cli --help flag. Currently writing the i18n doc, will make sure the published doc is correct regarding this.

I removed the --locales all option as I thought it was a bit messy and overkill (as it's possible to run manually the cli command for each locale one after the other). Do you have a usecase for it? Is your site repo public?

This one is just a question, not really an issue or a suggestion: how to check / get the language of the current page, or, how to modify (e.g. insert tags into the <head>) all pages from a specific language?

You can access the i18n config and current locale using the useDocusaurusContext() hook.

You can wrap your site by using a custom Root component: https://v2.docusaurus.io/docs/next/using-themes/#wrapper-your-site-with-root

Or by using a custom Layout component:

https://v2.docusaurus.io/docs/next/using-themes/#for-site-owners

This gives you the opportunity to use a <Head> (react-helmet) to add additional metadata on a per-locale basis:

https://v2.docusaurus.io/docs/next/docusaurus-core#head

This is not the most convenient API to do that. I'd like to make the current locale available in the configuration file, I'm just not sure how to do this properly yet. Also, we currently build one SPA per locale instead of one SPA for all locales. I'd like to avoid creating an API surface that would prevent us from building all localized sites as a single SPA in the future.


In general, the initial i18n release will not be exhaustive.

It is more the core system we will build on top of. I'd like to avoid rushing out a hasty API surface, and would rather gather some initial feedback and properly design solutions for the issues users have.

thibaudcolas commented 3 years ago

It’s great to see the progress on this – just chiming in with a few things I think I can help with.

What I understand is that we can't simply put the "root" of the localized site here but need to link to the exact same page in the localized site.

Yes, each page should have bidirectional links to its equivalent in other languages there are translations for. In their official guidance, Google emphasizes it’s important for the links to be bidirectional at least between the "main" language of each page, and its translations (but links between different translations – not as much).

And then we also want at the same time to localize the slugs (probably for SEO reasons too), but this complicates things, as now /hello needs to know about the French URL /bonjour.

For the validity of hreflang – there is no need to localize the slugs. From my experience I don’t see this done very often if at all, as it leads to the complexities you mention when querying pages by slug. So /fr/hello is perfectly fine. Looking around, I couldn’t find examples of localized slugs on the multilingual sites I’ve been involved with recently.

slorber commented 3 years ago

Thanks @thibaudcolas

Also something to consider is that a Docusaurus site may decide to exclude some parts (like the blog) in localized sites. This is also the reason nextjs does not include hreflang headers by default (https://github.com/vercel/next.js/discussions/17078)

In any case, it's possible for you to insert the hreflang by creating your own <Root> component, including the conditional logic of your choice.

So I guess we could simply assume that we could generate hreflang for all languages and all pages of the site, assuming slugs won't be translated, and site translations will be "complete".

But we'll leave an option to disable this hreflang generation in case users want to translate slugs or create partially translated sites; in that case they'll have to figure out how to generate the appropriate headers themselves.

tomchen commented 3 years ago

@slorber No I don't have a usecase for --locale all. Not having --locale all is fine for me. I use docusaurus for some small personal py/js library documentation and hobby websites, my usecases are perhaps the simplest. I didn't even look at Crowdin and other i18n tools because I don't think I need them - I just generate i18n/<LANG>/ and edit these current.json files.

What makes my usecase a little special is I have zh-Hans (Simplified Chinese) and zh-Hant (Traditional Chinese) (I even wrote a simple script to convert everything in i18n/zh-Hans/ to i18n/zh-Hant/ before building). It's nice to write in a variant and automatically generate another variant. But a small SEO concern is that Google might think zh-Hans and zh-Hant pages are duplicate content, so declaring <link rel="alternate" hreflang="" /> in <head>, using the correct lang code in <html lang="zh-Hant">, and optionally inserting a <link rel="canonical" href="<zh-Hans_URL>" /> on zh-Hant pages might be important here.

(Ideally the zh-Hans->zh-Hant conversion, or even any page creation from user data, should be done programmatically when building, but it seems currently users can't do this very easily in Docusaurus like we do it in Gatsby. And for a small website, there's nothing wrong in not doing it programmatically when building, but auto-generating lots of .md files from the data)

Oh, there's another i18n issue I forgot to mention: dropdown menu items in the top nav can be exported into current.json files using docusaurus write-translations --locale <LANG>, but after you translate them in current.json files, you will find they are still in default language and not correctly localized on the website. Non-dropdown top nav items work fine.

slorber commented 3 years ago

Going to close this now but feel free to continue the discussions on i18n and providing feedback.

I'll handle that feedback very soon @tomchen. Btw the translation of the theme is not yet 100% exhaustive: I mostly translated the low-hanging fruit first, but some small, less visible labels remain.

I've created an issue for any discussion related to Crowdin + Docusaurus here: https://github.com/facebook/docusaurus/discussions/4052

slorber commented 3 years ago

FYI I've handled your feedback @tomchen, let me know if this looks good to you.

Also worth mentioning that some i18n doc is online here: https://v2.docusaurus.io/docs/next/i18n/introduction

hyochan commented 3 years ago

Thanks a lot for this! Is there any plan to make FBT usable as an option?

slorber commented 3 years ago

Thanks @hyochan. We don't plan to have first-class support for FBT, but I clearly want to make the integration possible. That goal will be reached when we are able to integrate FBT demos directly in its documentation here: https://facebook.github.io/fbt/ We probably need new lifecycle hooks for that (which will also unlock some CSS-in-JS integrations).

renatodex commented 3 years ago

Just heard that the Docusaurus team has made some progress with the i18n feature! That's awesome! I'm the owner of Fables & Goblins, an Open Source RPG System set in a Goblin world. And we are using Docusaurus 2.0 to host the Online Book documentation: https://www.fabulasegoblins.com.br

Currently, it's entirely written in Brazilian Portuguese, but I would like to try the latest non-stable version to see if I could start to add English as an option. Do you guys think it's reasonably safe to try?

Or would you recommend waiting for a few weeks/months?

slorber commented 3 years ago

@renatodex we'd appreciate if you try it and give us feedback, but in my opinion, it's already working fine and you should try it. Just use the @canary npm dist tag

renatodex commented 3 years ago

Thank you @slorber, I really appreciate all of this awesome work!

mem212 commented 2 years ago

But if you have only one locale (en), Docusaurus will generate hreflang tags for both x-default and en with the same URL, and the canonical tag with that same URL as well. This can be considered a conflict in terms of SEO and make search engines like Google ignore the docs.

slorber commented 2 years ago

@mxhdx I'm open to suggestions and not an ultimate i18n/SEO expert, however it's important for me that you link to some authoritative website (like the Google SEO documentation) to back your claims.

I think it's reasonable to remove hreflang headers for sites using a single locale, but where have you read that not doing so leads to search engines ignoring the docs?
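A sketch of what skipping the tags for single-locale sites could look like is below. `hreflangEntries` and its input shape are hypothetical; pointing x-default at the default locale's URL is an assumption about the desired behavior, not a statement of what Docusaurus currently emits.

```typescript
type HreflangEntry = { hreflang: string; href: string };

// Hypothetical helper: compute hreflang entries for one page.
// `pageUrlByLocale` maps each locale to the absolute URL of this page in that locale.
function hreflangEntries(
  pageUrlByLocale: Record<string, string>,
  defaultLocale: string,
): HreflangEntry[] {
  const locales = Object.keys(pageUrlByLocale);
  // Single-locale site: there are no alternates to advertise, emit nothing.
  if (locales.length <= 1) return [];

  const entries: HreflangEntry[] = locales.map((locale) => ({
    hreflang: locale,
    href: pageUrlByLocale[locale],
  }));
  // Assumption: x-default points at the default locale's version of the page.
  const fallbackHref = pageUrlByLocale[defaultLocale];
  if (fallbackHref !== undefined) {
    entries.push({ hreflang: "x-default", href: fallbackHref });
  }
  return entries;
}

const single = hreflangEntries({ en: "https://example.com/docs/intro" }, "en");
const multi = hreflangEntries(
  { en: "https://example.com/docs/intro", fr: "https://example.com/fr/docs/intro" },
  "en",
);
console.log(single.length, multi.length);
```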

thadguidry commented 2 years ago

General targeting with x-default

If your page serves up content in a variety of languages or just asks a user to select a preferred page, you can use x-default to show that the page is not specifically targeted. That looks like this:

<link rel="alternate" href="http://example.com/" hreflang="x-default" />

Hreflang's effect on rankings

Hreflang attributes may not help you increase traffic; instead, the goal of using them is to serve the right content to the right users. They help search engines swap the correct version of the page into the SERP based on a user's location and language preferences.

The difference between hreflang and canonicalization


Canonicalization is a tool for showing search engines which version of a URL (each with the same content) is the dominant one to avoid duplicate content issues. Hreflang, on the other hand, is a tool to show which of the different (but often similar) pages (based on language or region) should show up in a search.

Google recommends not using rel="canonical" across country or language versions of your site. But you can use it within a country or language version.

https://moz.com/learn/seo/hreflang-tag https://developers.google.com/search/docs/advanced/crawling/localized-versions

slorber commented 2 years ago

Thanks

@mxhdx @thadguidry I created a dedicated issue for these SEO problems: https://github.com/facebook/docusaurus/issues/6075

I read those pages when I implemented i18n but it's not so easy to interpret for me. Please help me figure out the bad behaviors of Docusaurus with more concrete examples.

I don't remember reading anywhere that using x-default + hreflang en on the same page can lead to Google not indexing a page 🤷‍♂️ But I do agree it may be un-necessarily pre-emptive for unlocalized sites.

thadguidry commented 2 years ago

@slorber oh I was only giving that context and not making a judgement one way or another. If I were to make a judgement then I would say you are right that it is probably unnecessary but not harmful. That's what Google's docs are alluding to.

jovezhong commented 1 year ago

Is it possible to load a different font for each locale?

For example, by default I'd like to use Inter font, when the user switched to Chinese locale, I'd like to use a different font, say noto-sans-sc

I cannot find a proper way to do so. I run yarn add @fontsource/inter and set --ifm-font-family-base: 'Inter'; in custom.css. Also import "@fontsource/inter"; in pages/index.js

It'll be great to have per locale font settings

slorber commented 1 year ago

@jovezhong it's not possible, but indeed it makes sense to have per-locale stylesheets in general (allowing you to load fonts and define per-locale CSS variables).

I think I'll provide an API so that you can customize your site config on a per locale basis: this solves many use-cases we have like translating site titles etc... The bad news is that such an API is hard to design correctly, so it will likely be an experimental API that will be used as a temporary workaround

jovezhong commented 1 year ago

Thanks @slorber

Looking forward to such an experimental API. Yes, it'll be nice to customize the title/tagline per locale.

Sorry I am not familiar with how docusaurus works. Is there a javascript file each page will load? Maybe I can import the custom font in that JS per locale and set font for --ifm-font-family-base

This is not a blocking issue for me.

slorber commented 1 year ago

The way I see it is to enable config to receive the locale being currently built so you can register a per-locale stylesheet:

module.exports = function configCreator({currentLocale}) {
  return {
    // ...
    stylesheets: ["/locale-" + currentLocale + ".css"]
  };
};

This is a bit weird as an API unfortunately because we would have to run this function once initially with currentLocale: undefined just to be able to retrieve the list of locales your site supports.
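To make the two-pass idea concrete, here is a minimal TypeScript sketch. The `SiteConfig` and `ConfigCreator` types and the `loadConfigs` function are hypothetical simplifications, and the locale list is made up for the example.

```typescript
// Hypothetical, heavily simplified config shape.
type SiteConfig = {
  i18n: { defaultLocale: string; locales: string[] };
  stylesheets: string[];
};
type ConfigCreator = (context: { currentLocale?: string }) => SiteConfig;

// Site-side config function, in the spirit of the snippet above.
const configCreator: ConfigCreator = ({ currentLocale }) => ({
  i18n: { defaultLocale: "en", locales: ["en", "zh-Hans"] },
  // No per-locale stylesheet while the locale is still unknown (first pass).
  stylesheets: currentLocale !== undefined ? [`/locale-${currentLocale}.css`] : [],
});

// Hypothetical framework-side loader.
function loadConfigs(create: ConfigCreator): Record<string, SiteConfig> {
  // First pass: no locale yet, only used to discover which locales exist.
  const { locales } = create({}).i18n;
  // Second pass: evaluate the config once per locale being built.
  const perLocale: Record<string, SiteConfig> = {};
  for (const locale of locales) {
    perLocale[locale] = create({ currentLocale: locale });
  }
  return perLocale;
}

const configs = loadConfigs(configCreator);
console.log(configs["zh-Hans"].stylesheets);
```

The first call is the awkward part: the config function has to tolerate a missing locale, which is easy to get wrong in user code.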