sveltejs / kit

web development, streamlined
https://svelte.dev/docs/kit
MIT License

i18n brainstorming #553

Open Rich-Harris opened 5 years ago

Rich-Harris commented 5 years ago

We've somewhat glossed over the problem of internationalisation up till now. Frankly this is something SvelteKit isn't currently very good at. I'm starting to think about how to internationalise/localise https://svelte.dev, to see which parts can be solved in userland and which can't.

(For anyone unfamiliar: 'Internationalisation' or i18n refers to the process of making an app language agnostic; 'localisation' or l10n refers to the process of creating individual translations.)

This isn't an area I have a lot of experience in, so if anyone wants to chime in — particularly non-native English speakers and people who have dealt with these problems! — please do.

Where we're currently at: the best we can really do is put everything inside src/routes/[lang] and use the lang param in preload to load localisations (an exercise left to the reader, albeit a fairly straightforward one). This works, but leaves a few problems unsolved.
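As a rough illustration of that exercise, the lookup behind a [lang] route might look like the sketch below. The `dictionaries` object and the `loadTranslations` helper are hypothetical stand-ins for whatever would actually read the per-language translation files; nothing here is SvelteKit API.

```javascript
// Hypothetical in-memory stand-in for per-language translation files.
const dictionaries = {
  en: { greeting: 'Hello' },
  fr: { greeting: 'Bonjour' }
};

// Return the dictionary for the `lang` route param, or null if the language
// is not supported (the caller would then respond with a 404 or a redirect).
function loadTranslations(lang) {
  return Object.prototype.hasOwnProperty.call(dictionaries, lang)
    ? dictionaries[lang]
    : null;
}
```

A preload function would call `loadTranslations(params.lang)` and pass the result down to the page as props.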

I think we can do a lot better. I'm prepared to suggest that SvelteKit should be a little opinionated here rather than abdicating responsibility to things like i18next, since we can make guarantees that a general-purpose framework can't, and can potentially do interesting compile-time things that are out of reach for other projects. But I'm under no illusions about how complex i18n can be (I recently discovered that a file modified two days ago will be labeled 'avant-hier' on MacOS if your language is set to French; most languages don't even have a comparable phrase. How on earth do you do that sort of thing programmatically?!) which is why I'm anxious for community input.


Language detection/URL structure

Some websites make the current language explicit in the pathname, e.g. https://example.com/es/foo or https://example.com/zh/foo. Sometimes the default is explicit (https://example.com/en/foo), sometimes it's implicit (https://example.com/foo). Others (e.g. Wikipedia) use a subdomain, like https://cy.example.com. Still others (Amazon) don't make the language visible, but store it in a cookie.

Having the language expressed in the URL seems like the best way to make the user's preference unambiguous. I prefer /en/foo to /foo since it's explicit, easier to implement, and doesn't make other languages second-class citizens. If you're using subdomains then you're probably running separate instances of an app, which means it's not SvelteKit's problem.

There still needs to be a way to detect language if someone lands on /. I believe the most reliable way to detect a user's language preference on the server is the Accept-Language header (please correct me if necessary). Maybe this could automatically redirect to a supported localisation (see next section).
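A minimal sketch of that detection step, assuming the supported locales are known as a list of lowercase codes. The helper names `parseAcceptLanguage` and `pickLocale` are made up for illustration, and the matching is deliberately simple (exact tag first, then primary subtag).

```javascript
// Parse an Accept-Language header into language tags ordered by q-value.
function parseAcceptLanguage(header) {
  return String(header || '')
    .split(',')
    .map((part) => {
      const [tag, ...params] = part.trim().split(';');
      const q = params.map((p) => p.trim()).find((p) => p.startsWith('q='));
      const quality = q ? parseFloat(q.slice(2)) : 1;
      return {
        tag: tag.trim().toLowerCase(),
        quality: Number.isNaN(quality) ? 0 : quality
      };
    })
    // Drop empty entries, the "*" wildcard, and anything explicitly refused (q=0).
    .filter((entry) => entry.tag && entry.tag !== '*' && entry.quality > 0)
    .sort((a, b) => b.quality - a.quality)
    .map((entry) => entry.tag);
}

// Pick the first supported locale, letting "fr-CH" fall back to "fr".
function pickLocale(header, supported, fallback) {
  for (const tag of parseAcceptLanguage(header)) {
    if (supported.includes(tag)) return tag;
    const primary = tag.split('-')[0];
    if (supported.includes(primary)) return primary;
  }
  return fallback;
}
```

A request for / could then be redirected to `'/' + pickLocale(header, supported, 'en')`.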

Supported localisations

It's useful for SvelteKit to know at build time which localisations are supported. This could perhaps be achieved by having a locales folder (configurable, obviously) in the project root:

locales
|- de.json
|- en.json
|- fr.json
|- ru.json
src
|- routes
|- ...

Single-language apps could simply omit this folder, and behave as they currently do.

lang attribute

The <html> element should ideally have a lang attribute. If SvelteKit has i18n built in, we could achieve this the same way we inject other variables into src/template.html:

<html lang="%svelte.lang%">

Localised URLs

If we have localisations available at build time, we can localise URLs themselves. For example, you could have /en/meet-the-team and /de/triff-das-team without having to use a [parameter] in the route filename. One way we could do this is by encasing localisation keys in curlies:

src
|- routes
   |- index.svelte
   |- {meet_the_team}.svelte

In theory, we could generate a different route manifest for each supported language, so that English-speaking users would get a manifest with this...

{
  // index.svelte
  pattern: /^\/en\/?$/,
  parts: [...]
},

{
  // {meet_the_team}.svelte
  pattern: /^\/en\/meet-the-team\/?$/,
  parts: [...]
}

...while German-speaking users download this instead:

{
  // index.svelte
  pattern: /^\/de\/?$/,
  parts: [...]
},

{
  // {meet_the_team}.svelte
  pattern: /^\/de\/triff-das-team\/?$/,
  parts: [...]
}
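How such per-language patterns could be derived from the route filenames is sketched below. This is only an illustration: the `slugs` table, `routePattern` and `escapeRegExp` are hypothetical names, and the sketch only handles a top-level index.svelte plus plain or {key} segments.

```javascript
// Hypothetical localisation data: keys map to localised URL segments.
const slugs = {
  en: { meet_the_team: 'meet-the-team' },
  de: { meet_the_team: 'triff-das-team' }
};

// Escape characters that are special inside a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Turn a route file such as "{meet_the_team}.svelte" into a pattern for one locale.
function routePattern(file, locale) {
  const name = file.replace(/\.svelte$/, '');
  const segments = name === 'index' ? [] : name.split('/');
  const localised = segments.map((segment) => {
    const match = /^\{(\w+)\}$/.exec(segment);
    if (!match) return segment; // ordinary segment, used as-is
    const slug = slugs[locale][match[1]];
    if (slug === undefined) {
      throw new Error(`Missing "${match[1]}" for locale "${locale}"`);
    }
    return slug;
  });
  const path = [locale, ...localised].map(escapeRegExp).join('\\/');
  return new RegExp(`^\\/${path}\\/?$`);
}
```

Throwing on a missing key means an incomplete localisation would fail the build instead of silently producing a route nobody can reach.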

Localisation in components

I think the best way to make the translations themselves available inside components is to use a store:

<script>
  import { t } from '$app/stores';
</script>

<h1>{$t.hello_world}</h1>

Then, if you've got files like these...

// locales/en.json
{ "hello_world": "Hello world" }
// locales/fr.json
{ "hello_world": "Bonjour le monde" }

...SvelteKit can load them as necessary and coordinate everything. There's probably a commonly-used format for things like this as well — something like "Willkommen zurück, $1":

<p>{$t.welcome_back(name)}</p>
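One way the "$1" placeholders could be turned into callable translations is sketched here. The `interpolate` and `compile` helpers are invented for this example, and the "Hallo Welt" entry is added purely for contrast; a real implementation would more likely adopt an established message format.

```javascript
// Replace $1, $2, ... in a template with the corresponding argument.
// Placeholders without a matching argument are left untouched.
function interpolate(template, ...args) {
  return template.replace(/\$(\d+)/g, (match, index) => {
    const value = args[Number(index) - 1];
    return value === undefined ? match : String(value);
  });
}

// Wrap strings that contain placeholders in a function; keep plain strings as they are.
function compile(dictionary) {
  const compiled = {};
  for (const [key, template] of Object.entries(dictionary)) {
    compiled[key] = /\$\d+/.test(template)
      ? (...args) => interpolate(template, ...args)
      : template;
  }
  return compiled;
}

const t = compile({
  hello_world: 'Hallo Welt',
  welcome_back: 'Willkommen zurück, $1'
});
```

With this shape, `t.hello_world` stays a plain string while `t.welcome_back(name)` is a function call, which matches the two usages shown above.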

(In development, we could potentially do all sorts of fun stuff like making $t be a proxy that warns us if a particular translation is missing, or tracks which translations are unused.)
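A small sketch of that development-time proxy idea, assuming translations are a flat object. `trackTranslations` is a hypothetical helper, and returning the key itself for a missing translation is just one possible fallback.

```javascript
// Wrap a dictionary so development builds can report missing and unused keys.
function trackTranslations(dictionary, warn = console.warn) {
  const used = new Set();
  const has = (key) => Object.prototype.hasOwnProperty.call(dictionary, key);

  const t = new Proxy(dictionary, {
    get(target, key) {
      // Symbols (used by inspection and iteration) pass through untouched.
      if (typeof key !== 'string') return Reflect.get(target, key);
      if (!has(key)) {
        warn(`Missing translation: ${key}`);
        return key; // fall back to showing the key
      }
      used.add(key);
      return target[key];
    }
  });

  // Keys that were never read through the proxy.
  const unused = () => Object.keys(dictionary).filter((key) => !used.has(key));

  return { t, unused };
}
```

Calling `unused()` at the end of a test run or dev session would list candidates for removal from the localisation files.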

Route-scoped localisations

We probably wouldn't want to put all the localisations in locales/xx.json — just the stuff that's needed globally. Perhaps we could have something like this:

locales
|- de.json
|- en.json
|- fr.json
|- ru.json
src
|- routes
   |- settings
      |- _locales
         |- de.json
         |- en.json
         |- fr.json
         |- ru.json
      |- index.svelte

Again, we're in the fortunate position that SvelteKit can easily coordinate all the loading for us, including any necessary build-time preparation. Here, any keys in src/routes/settings/_locales/en.json would take precedence over the global keys in locales/en.json.
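The precedence rule could be as simple as a shallow merge, sketched below with made-up keys. `mergeLocales` is a hypothetical name, and nested keys would need a deep merge instead.

```javascript
// Later arguments win, so route-scoped keys override global ones.
function mergeLocales(globalKeys, ...scoped) {
  return Object.assign({}, globalKeys, ...scoped);
}

// Hypothetical contents of locales/en.json and src/routes/settings/_locales/en.json.
const globalEn = { title: 'My site', save: 'Save' };
const settingsEn = { title: 'Settings', dark_mode: 'Dark mode' };

const merged = mergeLocales(globalEn, settingsEn);
```

Here `merged.title` is 'Settings' on the settings route, while `merged.save` still comes from the global file.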

Translating content

It's probably best if SvelteKit doesn't have too many opinions about how content (like blog posts) should be translated, since this is an area where you're far more likely to need to e.g. talk to a database, or otherwise do something that doesn't fit neatly into the structure we've outlined. Here again, there's an advantage to having the current language preference expressed in the URL, since userland middleware can easily extract that from req.path and use that to fetch appropriate content. (I guess we could also set a req.lang property or something if we wanted?)

Base URLs

Sapper (ab)used the <base> element to make it easy to mount apps on a path other than /. <base> could also include the language prefix so that we don't need to worry about it when creating links:

<!-- with <base href="de">, this would link to `/de/triff-das-team` -->
<a href={$t.meet_the_team}>{$t.text.meet_the_team}</a>

Base URLs haven't been entirely pain-free though, so this might warrant further thought.


Having gone through this thought process I'm more convinced than ever that SvelteKit should have i18n built in. We can make it so much easier to do i18n than is currently possible with libraries, with zero boilerplate. But this could just be arrogance and naivety from someone who hasn't really done this stuff before, so please do help fill in the missing pieces.

andreasnuesslein commented 2 years ago

Hi guys, I've been following this thread for a while, please consider allowing extraction of strings to .po files. Please think about the translators too, it would be great to be able to take advantage of the tools like Transifex, otherwise your translation strings will be inaccessible to the translators.

Just so you know, there are good alternatives too, like https://weblate.org/, and they support more than just PO. Either way, in my opinion converting one format into another is not the core point of this thread, and there are already tools for this that you could just include in your build process.

benmccann commented 2 years ago

Cool, but how would you go with not having the language code in the URL for the default language? Looks like all Svelte i18n projects struggle with this same problem :) It's not really the fault of the i18n library. It's more the current limitation of SvelteKit that does not support it in an easy way.

@Catsvilles @ivanhofer I think you can use rest parameters like [...locale] to add an optional route parameter.

AlexxNB commented 2 years ago

It is possible to retrieve the locale from event.url.pathname in the handle hook and remove it from the URL before resolving.
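The path-splitting part of that approach might look like the following sketch. `splitLocale` is a hypothetical helper; how its result is fed back into the handle hook is left out here, since that depends on the SvelteKit version.

```javascript
// Split "/de/foo" into { locale: "de", pathname: "/foo" }.
// Paths without a known prefix keep their pathname and get the default locale.
function splitLocale(pathname, supported, defaultLocale) {
  const [, first = '', ...rest] = pathname.split('/');
  if (supported.includes(first)) {
    return { locale: first, pathname: '/' + rest.join('/') };
  }
  return { locale: defaultLocale, pathname };
}
```

Because unprefixed paths fall through to the default locale, this also covers the case where the default language has no code in the URL.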

ivanhofer commented 2 years ago

I get your point, however Accept-Language is a value that can be controlled by the user (through a browser setting) in contrast to the IP address, where the region of the user is just guessed and which has probably no relation to the language the user is speaking. But I believe we're getting off topic here 😉

A traditional language picker should also be offered. Maybe you usually browse the web in German, but a website is translated badly, so you want to take a look at the English version ;)

> @Catsvilles @ivanhofer I think you can use rest parameters like [...locale] to add an optional route parameter.

@benmccann isn't this the same as what I was referring to?

benmccann commented 2 years ago

> @benmccann isn't this the same as what I was referring to?

I'm not sure. It sounded like you were saying you should just have a single route responsible for all pages, especially since you said SvelteKit does not have easy support for omitting the default lang from the route. Perhaps I misunderstood you. What I was suggesting is that it shouldn't be too hard to do something like src/routes/[...locale]/the/rest/of/your/path.svelte, where locale can be either included or omitted.

ivanhofer commented 2 years ago

@benmccann thanks for the clarification and the detailed example. I actually didn't know this was possible! Thanks

adiguba commented 2 years ago

Hello,

I'm new here and I'm just discovering Svelte/Sveltekit. However I maintain a website in 7 languages and I have some remarks related to this...

The discussion is already long and I tried not to repeat too much.

Language detection/URL structure

> Having the language expressed in the URL seems like the best way to make the user's preference unambiguous. I prefer /en/foo to /foo since it's explicit, easier to implement, and doesn't make other languages second-class citizens. If you're using subdomains then you're probably running separate instances of an app, which means it's not SvelteKit's problem.

I don't agree with this: different domains can be used for various reasons (SEO, marketing, etc.) and can still point to the same instance. In my organization we used to have different domains that pointed to the same apps with different locale settings. We also used all 3 distinct configurations (cookie, domain and path) for different sites.

So I think SvelteKit should not impose a specific URL scheme, but let everyone choose the most suitable solution (using configuration or implementing some methods).

For example, I think we could use configurations like these:

  1. Fixed locale:

    i18n: {
        locales: "en"   // only one locale
    },

  2. Cookie-based locale:

    i18n: {
        locales: ["en", "fr", "it"],    // 3 locales
        defaultLocale: "en",            // The default locale (string or function)
        detect: "cookie",               // Detection based on cookie
        // Optional:
        cookie: "lang"                  // Name of the cookie
    },

  3. Locale based on path prefix (access to "/" will redirect to defaultLocale):

    i18n: {
        locales: ["en", "fr", "it"],    // 3 locales
        defaultLocale: "en",            // The default locale (string or function)
        detect: "path",                 // Detection based on a path prefix
        // Optional: specific path name (use locale if missing)
        paths: {
            "en": "english",
            "fr": "francais",
            "it": "italiano"
        },
    },

  4. Locale based on domain:

    i18n: {
        locales: ["en", "fr", "it"],    // 3 locales
        detect: "domain",               // Detection based on domain
        // Required: map domain to locale
        domains: {
            "www.mysite.com": "en",
            "www.mysite.fr": "fr",
            "it.mysite.com": "it"
        }
    },

  5. Custom function for detecting language:

    i18n: {
        locales: ["en", "fr", "it"],    // 3 locales
        detect: function(event) {
            let lang;
            // ... JS code to detect lang based on any criteria
            return lang;
        }
    },
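To make the proposal more concrete, here is a sketch of how a framework might interpret such a config. None of this is existing SvelteKit API: `detectLocale`, `readCookie`, the shape of the `request` object and the default cookie name "lang" are all assumptions made for the example.

```javascript
// Read one cookie value from a Cookie header, or undefined if absent.
function readCookie(header, name) {
  for (const pair of String(header || '').split(';')) {
    const index = pair.indexOf('=');
    if (index !== -1 && pair.slice(0, index).trim() === name) {
      return decodeURIComponent(pair.slice(index + 1).trim());
    }
  }
  return undefined;
}

// Resolve the locale for a request according to the proposed `detect` option.
// `request` is a hypothetical plain object: { hostname, pathname, cookie }.
function detectLocale(i18n, request) {
  const locales = [].concat(i18n.locales); // accepts "en" or ["en", "fr"]
  let candidate;

  if (typeof i18n.detect === 'function') {
    candidate = i18n.detect(request);
  } else if (i18n.detect === 'cookie') {
    candidate = readCookie(request.cookie, i18n.cookie || 'lang');
  } else if (i18n.detect === 'path') {
    const prefix = request.pathname.split('/')[1];
    const paths = i18n.paths || {};
    candidate = locales.find((locale) => (paths[locale] || locale) === prefix);
  } else if (i18n.detect === 'domain') {
    candidate = (i18n.domains || {})[request.hostname];
  }

  if (locales.includes(candidate)) return candidate;

  // defaultLocale may be a string or a function; fall back to the first locale.
  const fallback = typeof i18n.defaultLocale === 'function'
    ? i18n.defaultLocale(request)
    : i18n.defaultLocale;
  return fallback || locales[0];
}
```

Validating the candidate against `locales` in one place keeps every strategy, including the custom function, from returning an unsupported language.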

Note: in any case, the static site generator should produce a separate version for each language.

Localisation in components

> I think the best way to make the translations themselves available inside components is to use a store:

Yes and no.

Stores are great for localisation in an "app-like" site: you change the language and the page is updated automatically. But that makes the components more complex, since they subscribe to the store and generate more code in the update() method...

For example, this simple code:

<h1>{$M.hello}</h1>
<p>{$M.howareyou}</p>

will generate an update method like this :

    p(ctx, [dirty]) {
        if (dirty & /*$M*/ 1 && t0_value !== (t0_value = /*$M*/ ctx[0].hello + "")) set_data(t0, t0_value);
        if (dirty & /*$M*/ 1 && t2_value !== (t2_value = /*$M*/ ctx[0].howareyou + "")) set_data(t2, t2_value);
    },

But changing the locale is a "rare" event, and on most websites it will be acceptable to reload the page... So using a simple JavaScript object should be a good alternative.

Maybe we could use a config option for that:

i18n: {
    use_store: false
}

With use_store=false, changing the language will cause the new page to (re)load with the correct URL. With use_store=true, the language file is loaded and the app is updated without a browser reload (and of course this is incompatible with domain-based locale).

Localised URLs

With localised URLs we need a way to localize links.

Ex :

<a href="/meet-the-team">Meet the team</a>

Should be replaced by something like this:

<script>
import { t, p } from '$app/stores';
</script>
<a href={$p('/{meet-the-team}')}>{$t.meet_the_team}</a>

We also need a way to get the locations of the same page in each of the other locales.

=> On the server, in order to generate the alternate links:

<link rel="alternate" hreflang="en" href="https://www.mysite.com/en/meet-the-team/">
<link rel="alternate" hreflang="de" href="https://www.mysite.com/de/triff-das-team/">

=> On the client in order to update the link correctly when changing the locale.
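Both needs could be served by the same lookup, as in this sketch. The `segments` table, `localisePath` and `alternateLinks` are hypothetical names, with keys written as in the '/{meet-the-team}' example above.

```javascript
// Hypothetical per-locale URL segments.
const segments = {
  en: { 'meet-the-team': 'meet-the-team' },
  de: { 'meet-the-team': 'triff-das-team' }
};

// Expand '{key}' placeholders and add the locale prefix.
function localisePath(path, locale) {
  const expanded = path.replace(/\{([^}]+)\}/g, (match, key) => {
    const segment = segments[locale] && segments[locale][key];
    if (segment === undefined) {
      throw new Error(`No "${key}" segment for locale "${locale}"`);
    }
    return segment;
  });
  return `/${locale}${expanded}`;
}

// Build one <link rel="alternate"> tag per locale for the same page.
function alternateLinks(path, origin) {
  return Object.keys(segments).map(
    (locale) =>
      `<link rel="alternate" hreflang="${locale}" href="${origin}${localisePath(path, locale)}/">`
  );
}
```

On the client, a language switcher could call `localisePath` with the current route's template and the newly selected locale to find where to navigate.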

404 and redirection

It would be useful to be able to manage 404 errors, for example to generate the correct redirections. Ex: instead of rendering a 404 error for /de/meet-the-team/, we want to generate a redirect to /de/triff-das-team/

I think we could use a global hook for that, executed before each 404 error:

export async function handle404(event) {

    if ( /* I found the correct URL */ ) {
        // make a redirection:
        return {
            status: 302,
            redirect: "/the-new-uri"
        };
    }

    // we continue on the 404 error
    return null;
}

This may also have other usage unrelated to locale...
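For the locale case, the "I found the correct URL" step could be a reverse lookup across the other languages' slugs. The sketch below is an assumption-laden illustration: `segments` and `findLocalisedRedirect` are hypothetical, and only single-segment paths such as /de/meet-the-team/ are handled.

```javascript
// Hypothetical per-locale URL segments.
const segments = {
  en: { meet_the_team: 'meet-the-team' },
  de: { meet_the_team: 'triff-das-team' }
};

// Given a path that produced a 404, check whether its slug belongs to another
// locale and, if so, redirect to the slug of the locale named in the prefix.
function findLocalisedRedirect(pathname) {
  const [, locale, slug, ...rest] = pathname.split('/');
  const own = segments[locale];
  if (!own || !slug || rest.some(Boolean)) return null;

  for (const other of Object.values(segments)) {
    for (const [key, value] of Object.entries(other)) {
      if (value === slug && own[key] && own[key] !== slug) {
        return { status: 302, redirect: `/${locale}/${own[key]}/` };
      }
    }
  }
  return null; // nothing found: continue with the 404
}
```

The return value has the same shape as in the handle404 proposal, so the hook body could simply return `findLocalisedRedirect(event.url.pathname)`.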

Translation file format.

Of course, Sveltekit will have to come up with a fixed format for the translation files. But it would be desirable to be able to use other formats, for example by offering a conversion function in the configuration.

i18n: {
    file_formats: [
        {
            ext: "po", // file with .po extension
            // conversion method:
            conv: function(file) {
                let messages = {};
                // ... transform file into a JS object
                return messages;
            }
        }
    ]
}

This could also simplify migration from another environment (by keeping a format known to translators).
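As an idea of what such a conversion function might contain, here is a deliberately minimal PO reader. It is a sketch, not a full gettext parser: `parsePo` and `unquote` are made-up names, and contexts (msgctxt) and plural forms are ignored.

```javascript
// Strip the surrounding quotes and decode common C-style escapes.
function unquote(text) {
  return text
    .slice(1, -1)
    .replace(/\\(.)/g, (match, ch) => (ch === 'n' ? '\n' : ch === 't' ? '\t' : ch));
}

// Convert PO source text into a flat { msgid: msgstr } object.
function parsePo(source) {
  const messages = {};
  let id = null;
  let str = null;
  let current = null; // field that continuation lines belong to

  const flush = () => {
    // Skips the header entry (empty msgid) and untranslated entries.
    if (id && str) messages[id] = str;
    id = str = current = null;
  };

  for (const raw of source.split(/\r?\n/)) {
    const line = raw.trim();
    if (line.startsWith('msgid ')) {
      flush();
      id = unquote(line.slice(6).trim());
      current = 'id';
    } else if (line.startsWith('msgstr ')) {
      str = unquote(line.slice(7).trim());
      current = 'str';
    } else if (line.startsWith('"')) {
      // Continuation of a multi-line string.
      if (current === 'id') id += unquote(line);
      else if (current === 'str') str += unquote(line);
    } else if (line === '') {
      flush();
    } else {
      current = null; // comments and unsupported keywords are ignored
    }
  }
  flush();
  return messages;
}
```

The result is a plain object, so it could be returned from `conv` and then treated like any other translation file.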

Localized template

Another interesting feature would be localized Svelte files, in order to have a different template for some specific pages.

To take the example of {meet_the_team}.svelte, by default this would generate the list of all team members. But the French team is very small and we would like to use a more suitable template.

We could use a specific suffix for that, for example:

src
|- routes
   |- {meet_the_team}.svelte
   |- {meet_the_team}@fr.svelte

This would also be useful for layouts, for example to have more specific headers/footers :

src
|- routes
   |- __layout.svelte
   |- __layout@fr.svelte
   |- __layout@it.svelte
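The lookup rule behind this suffix convention might be as small as the sketch below, assuming the build already has the list of files in a directory. `resolveTemplate` is a hypothetical helper.

```javascript
// Choose "name@locale.svelte" when it exists, otherwise fall back to "name.svelte".
function resolveTemplate(files, name, locale) {
  const specific = `${name}@${locale}.svelte`;
  if (files.includes(specific)) return specific;
  const generic = `${name}.svelte`;
  return files.includes(generic) ? generic : null;
}
```

With the layout files listed above, French and Italian would get their own layouts and every other locale would share __layout.svelte.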

My 2 cents. Thanks for reading...

lsabi commented 2 years ago

Seeing all these different experiences with multi-language sites and different use cases, I would say that the best shot for kit is to aim at creating a flexible API for developing libraries, like the ones that have already been developed.

Supporting every use case is impossible for kit and supporting all of them in one library is, in my opinion, impossible as well. Thus, exposing a rich API could be the best solution. Libraries with specific features and goals will arrive from the community.

madeleineostoja commented 2 years ago

Agree, low-level API is always the best first step regardless

vekunz commented 2 years ago

I think a key point is that the language detection mechanism must remain dynamic. So the user should be able to decide how the language is detected. Basically, there are four different strategies for language detection: header-based, path-based, query-parameter-based, and cookie-based. Sometimes a mixture is needed. All strategies should be possible, to cover all or at least most use-cases.

A bad example is Next.js in my opinion (React world). Next.js only supports path-based language detection (officially), and many users are very disappointed with this decision because they either need header-based or cookie-based language detection.

bitdom8 commented 2 years ago

if we can pass page.params to after handling resolve or in `
