twostraws / Ignite

A static site generator for Swift developers.
MIT License

Adding localization support to improve accessibility #76

Open Jeehut opened 2 weeks ago

Jeehut commented 2 weeks ago

While accessibility labels and ARIA support seem to exist, many statically generated sites like app landing pages or documentation sites would be vastly more accessible if they were available in multiple languages. From what I understand, it would already be possible to create a view with buttons that link to separate subfolders of articles (such as en, de-DE etc.). But there are a few other things that need to be considered for a website with multi-language support:

  1. A way to adjust the language value of the Site based on the current language settings (this line)
  2. A way to provide different values for String parameters, such as a site's description or a button's label
  3. A built-in way to know/determine the supported languages of a site during build
  4. A built-in language switcher component that automatically lists the supported languages & sets it on click
  5. Some kind of string table where users can provide different values for different languages

I know that this sounds like adding a dynamic layer to this "static" site generator. Localization in my opinion is for sure a "dynamic" aspect worth providing, but if implemented in a thoughtful manner, we can retain the static aspect by simply creating a duplicate of each page with only the localized texts changed. The language could be provided at root path level via the Language enum's raw value, leading to URLs like https://ignitesamples.hackingwithswift.com/de/grid-examples/. We could generate variants of all pages for each supported language and place them in the respective folder (e.g. de) so everything is still static.

But we would need some JavaScript to set the language value based on the path during page load, so things like the metadata have the correct language set. For things like the usage of localizedContains, while building the different language subfolders, we would need to set a custom locale with the current language being generated for accurate results.

As for the string table, I would suggest we use a subset of String Catalogs as we don't need all features, such as "Vary by device". We could even start without pluralization support to keep things simple. I'm happy to donate my Codable struct to parse them, which I have already implemented for one of my apps. Updating that is simple enough in case Apple makes changes. The format is versioned as well, so we should be able to notice changes easily. This year, they didn't change anything in the structure, so it should be pretty stable. But I'm happy to adjust the code whenever needed.
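To make the "subset of String Catalogs" idea concrete, here is a minimal sketch of what such a Codable type could look like. It is not the struct offered above; the type and member names (`StringCatalog`, `Entry`, `value(for:language:)`) are hypothetical, and the JSON layout is assumed from what Xcode-written .xcstrings files commonly look like. Device variations and plurals are left out, and unknown keys are simply ignored by the decoder.

```swift
import Foundation

// Hypothetical subset of the String Catalog (.xcstrings) JSON layout.
// Only plain string units are modelled; plural and device variations are skipped.
struct StringCatalog: Codable {
    struct Entry: Codable {
        struct Localization: Codable {
            struct StringUnit: Codable {
                var state: String
                var value: String
            }
            // Optional because entries that use variations have no plain string unit.
            var stringUnit: StringUnit?
        }
        var comment: String?
        var localizations: [String: Localization]?
    }

    var sourceLanguage: String
    var version: String
    var strings: [String: Entry]

    // Returns the translation for `key`, falling back to the key itself
    // when the requested language has no value.
    func value(for key: String, language: String) -> String {
        strings[key]?.localizations?[language]?.stringUnit?.value ?? key
    }
}

// Sample catalog content, made up for illustration.
let json = """
{
  "sourceLanguage": "en",
  "version": "1.0",
  "strings": {
    "Download": {
      "localizations": {
        "de": { "stringUnit": { "state": "translated", "value": "Herunterladen" } }
      }
    },
    "Welcome": {}
  }
}
"""

let catalog = try JSONDecoder().decode(StringCatalog.self, from: Data(json.utf8))
print(catalog.value(for: "Download", language: "de"))
print(catalog.value(for: "Welcome", language: "de"))
```

Because this only relies on Foundation's JSONDecoder, the same code should also build on Linux.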

Using String Catalogs rather than a custom format has great advantages:

As not everyone will need or want to localize their sites, we should keep this as opt-in and default to exactly the same behavior without any subpaths like /en. Also, a default language should be set and used as a fallback for when a domain is called without any language subpath. I'm not sure if a safe redirect is even possible on static site calls, so the default locale files should just live in the root directory.

Please let me know what you think about adding localization support and also about my suggestion. I'm happy to provide an initial version of localization support once you've confirmed that what I outlined makes sense to you.

twostraws commented 2 weeks ago

Someone else asked me about this by email, and I think it's a great idea – I'd love for Ignite sites to be localized, where possible!

But we would need some JavaScript to set the language value based on the path during page load, so things like the metadata have the correct language set.

I don't fully see why JavaScript is needed. This should all be done at build time, no?

Using String Catalogs rather than a custom format has great advantages:

String catalogs would be a great choice, as long as they are fully supported outside of macOS; people have already started building Ignite sites on Linux.

As not everyone will need or want to localize their sites, we should keep this as opt-in and default to exactly the same behavior without any subpaths like /en. Also, a default language should be set and used as a fallback for when a domain is called without any language subpath. I'm not sure if a safe redirect is even possible on static site calls, so the default locale files should just live in the root directory.

Everything should be flattened at build time, so the whole thing remains static. That would mean using the base language by default (perhaps by changing languages to an array, then using whichever one is listed first?), then pushing the rest to /es/original/path, /fr/original/path, etc. This should mean no redirect is needed, and there would be no opting in or out. That might be overly simplistic, though, so I'd be keen to hear more on why you want to introduce JavaScript here.

In my head, the greatest challenge would be around making sure all a site's content is localized to the supported languages, or if we even want to push for that. For example, if I say my site is English first then French second, what happens if one of my Markdown posts is missing in French? Perhaps this is just a configuration option, but it could get rather messy with things like tag pages.

Jeehut commented 2 weeks ago

String catalogs would be a great choice, as long as they are fully supported outside of macOS; people have already started building Ignite sites on Linux.

When you say "started building on Linux", do you mean they do the entire editing part there? So no Xcode available? If that's what you mean, then I understand that it might not be optimal to use String Catalogs as a format due to the lack of a GUI editor. But I'm also not sure if any other localization file format has an editor there, and String Catalogs are quite readable JSON at the core, so it might still be a viable option. Building and extracting localizations should be fully supported on Linux even with String Catalogs, as we would do all the parsing in Swift code. So if general editing and hosting on Linux is your only concern, that shouldn't be a problem.

Everything should be flattened at build time, so the whole thing remains static. That would mean using the base language by default (perhaps by changing languages to an array, then using whichever one is listed first?), then pushing the rest to /es/original/path, /fr/original/path, etc. This should mean no redirect is needed, and there would be no opting in or out. That might be overly simplistic, though, so I'd be keen to hear more on why you want to introduce JavaScript here.

I agree in general with flattening everything at build time. But the reason I would still add some JavaScript is for automatic browser language preference detection. So when a user who prefers French visits the site, it automatically shows the French site, not the English one. This could not be achieved without some JavaScript AFAIK, but it would be just a couple of lines of code, and we could make it opt-in and not render it at all if only one language is supported. It should also only be there for the default path without a locale inside, so when users call /fr/original/path then it stays in French, no JavaScript included.
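As a rough illustration of the "couple of lines" being proposed, the sketch below generates such a redirect script at build time from Swift. The function name `languageRedirectScript` is hypothetical and this is not an Ignite API; the embedded JavaScript relies only on the standard `navigator.languages`, `navigator.language`, and `window.location` browser properties, and it assumes localized pages live at /<language>/<original path> as described earlier in the thread. It would only be emitted into pages of the default language.

```swift
import Foundation

// Hypothetical helper: builds the JavaScript for pages at the default (unprefixed) path.
// `otherLanguages` are the language codes that have their own subfolder.
func languageRedirectScript(defaultLanguage: String, otherLanguages: [String]) -> String {
    let list = otherLanguages.map { "\"\($0)\"" }.joined(separator: ", ")
    return """
    (function () {
      var supported = [\(list)];
      var preferred = (navigator.languages || [navigator.language]).map(function (l) {
        return l.split("-")[0].toLowerCase();
      });
      for (var i = 0; i < preferred.length; i++) {
        // The visitor prefers the default language: stay on this page.
        if (preferred[i] === "\(defaultLanguage)") { return; }
        // First supported preference wins: go to the prefixed copy of this page.
        if (supported.indexOf(preferred[i]) !== -1) {
          window.location.replace("/" + preferred[i] + window.location.pathname);
          return;
        }
      }
    })();
    """
}

let script = languageRedirectScript(defaultLanguage: "en", otherLanguages: ["de", "fr"])
print(script)
```

With a single-language site the generator would simply not be called, which matches the "not render it at all" idea.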

This could be important for many use cases where people share links to their app landing page, for example. While they could share different language links with different audiences, at least for the home page of a domain you don't usually do that, you just call the domain and expect it to be in the language you need, like how it works on App Store Connect.

Of course, we could start without this feature. But as an opt-in one I don't see a problem and I would be up to implement it, as I need it for my app landing pages that I'm planning to build with Ignite.

In my head, the greatest challenge would be around making sure all a site's content is localized to the supported languages, or if we even want to push for that. For example, if I say my site is English first then French second, what happens if one of my Markdown posts is missing in French? Perhaps this is just a configuration option, but it could get rather messy with things like tag pages.

Not sure what you mean by tag pages, but if we use String Catalogs and localization markers like String(localized:) and LocalizedStringKey, the String Catalog would show exactly which languages have missing text translations. If someone was looking for a 100% localized site, this would be very helpful and reduce the chance of overlooked translations. For users who only need a single page localized and nothing else, I would argue they could just create a file at /fr/original/path and manually link to it without using the official localization feature. I don't see an easy way to support partial localization without creating all kinds of problems.
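The same completeness check can be done at build time without Xcode. The sketch below assumes the catalog has already been flattened into a plain dictionary of key to per-language values; the function name and the sample data are made up for illustration.

```swift
// Hypothetical build-time check: which keys lack a translation in which language?
// `table` maps each string key to its translations, keyed by language code.
func missingTranslations(in table: [String: [String: String]],
                         languages: [String]) -> [String: [String]] {
    var missing: [String: [String]] = [:]
    for language in languages {
        // A key counts as missing when the language has no entry or an empty value.
        let keys = table
            .filter { $0.value[language, default: ""].isEmpty }
            .keys
            .sorted()
        if !keys.isEmpty {
            missing[language] = keys
        }
    }
    return missing
}

let table: [String: [String: String]] = [
    "Download": ["de": "Herunterladen", "fr": "Télécharger"],
    "Welcome": ["de": "Willkommen"]
]

let report = missingTranslations(in: table, languages: ["de", "fr"])
print(report)
```

A generator could print this report as a warning, or treat a non-empty result as a build error for sites that want to be fully localized.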

twostraws commented 2 weeks ago

  1. They build on Linux.
  2. No automatic language selection; that would be unwelcome.
  3. Tag pages are used in Ignite to show all content that matches a certain tag. String catalogs would not apply here.

Jeehut commented 2 weeks ago

Without automatic language selection, I can't use Ignite for any of my planned use cases as my apps are multi-language and my app landing pages should be as well. If that's a direction you don't want to go (which I totally respect, of course), I might have to create a fork and maintain that with the dynamic features I need, if that's okay with you.

For the tag pages, aren't they generated like any other page? Why couldn't they be localized as well with links to all localized pages that have a certain tag? Even the tags themselves could be localized, internally they would keep the default locale name but when shown to viewers they could be localized like any other content. I don't see how String Catalogs are a problem there.

twostraws commented 2 weeks ago

It's an open-source project; you can fork it whenever you want.

Automatic language selection will not happen. If you want to direct folks to a particular part of your site, that should be done in your app, in your App Store Connect settings, or similar – you know what language they actually want there, because it's the setting on their phone or overridden on a per-app basis in Settings, and correctly takes into account the user's actual preference rather than trying to auto-detect. If you want users of your app who have expressly said "Ignore my phone's setting, use English for this app" to visit your site and suddenly be given French because that's what Safari is set to, that's down to you.

This is, as far as I know, fairly standard behavior for the web: if I enable French as the primary language on my iPhone then visit apple.com, I get the US English site. If I use US English on my iPad and visit apple.com/jp, I get the Japanese site. The same is true for microsoft.com, intel.com, amd.com, and many more. This appears to be common, so it's hardly unprecedented for Ignite to follow suit.

Ultimately, if all you want to do is inject some custom JavaScript, that's something you can do within the current Ignite system: add your own language detection code, then redirect as needed.

For the tag pages: As I said, my concern is when there's a mismatch between pages that are localized and those that aren't; we'd need to be careful to show only the pages that match the tag in that language, and also need to be careful for places where tags exist in one language but not another. Showing some kind of language selection box would need to take into account that any given page might not actually have a translation available, or that a tag might be spelled "Artikel" rather than "Articles". We'd likely need to give users some control here (allow mismatched translations? Force matching translations? Some way to link English tag to German equivalent here?), but it's definitely a non-trivial problem that deserves careful thought.
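To illustrate the mismatch, here is a small sketch of building tag pages per language. The `Page` type, its fields, and the sample pages are hypothetical and do not reflect Ignite's content model; the point is only that grouping by tag within one language yields different tag sets per language.

```swift
// Hypothetical content model: one entry per rendered page.
struct Page {
    var path: String
    var language: String
    var tags: [String]
}

// Groups the paths of all pages in `language` by tag, ignoring other languages.
func tagIndex(for pages: [Page], language: String) -> [String: [String]] {
    var index: [String: [String]] = [:]
    for page in pages where page.language == language {
        for tag in page.tags {
            index[tag, default: []].append(page.path)
        }
    }
    return index
}

let pages = [
    Page(path: "/article/swift-against-humanity", language: "en", tags: ["Articles", "Swift"]),
    Page(path: "/article/english-only-post", language: "en", tags: ["Swift"]),
    Page(path: "/de/article/swift-against-humanity", language: "de", tags: ["Artikel"])
]

let english = tagIndex(for: pages, language: "en")
let german = tagIndex(for: pages, language: "de")
print(english.keys.sorted())
print(german.keys.sorted())
```

Here the German index has no "Swift" tag and the English index has no "Artikel" tag, so a language switcher on a tag page has nothing to link to unless tags are mapped across languages in some way.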

Jeehut commented 2 weeks ago

Alright, I do get your point with websites not always doing the redirection. But Apple, for example, does detect that I'm coming from Germany if I visit apple.com and prominently gives me a button at the top of the page that brings me to apple.de. I also just typed microsoft.com which redirected me to https://www.microsoft.com/de-de/ and amd.com redirected me to https://www.amd.com/de.html – only intel.com resulted in an English page. As you can see, it's also very common to redirect, so at least I think there should be an option.

The use case for me would not be restricted to digital products (where I can link to different languages), but also includes physical prints with a link to the app's website, such as on these cards or any QR code I print on paper / stickers.

But you said I could "add (my) own language detection code, then redirect as needed". That certainly sounds great. But it's one of the things I didn't realize was possible from the current docs. How is it done? Do I add an Include to every single page that I want to localize?

For the tag pages: You are right, giving users some control is not a trivial problem. But for a first version, would you be fine if we only properly supported the "everything is localized" approach? Like I said, the progress percentage indicator in String Catalogs helps detect which languages are missing. For simple websites like app landing pages, which only have a few languages anyway, this could be a totally viable solution. Or do you want a full-fledged solution from the get-go?

twostraws commented 2 weeks ago

I also just typed microsoft.com which redirected me to https://www.microsoft.com/de-de/ and amd.com redirected me to https://www.amd.com/de.html – only intel.com resulted in an English page. As you can see, it's also very common to redirect, so at least I think there should be an option.

Yes, but are they doing that through IP address or JavaScript locale? Lots of sites use GeoIP or similar, which is why if I visit Microsoft from the UK I get the UK site, but if I enable a VPN in Germany I get the German site – despite using the same computer. The same is true for amd.com, and many others: the source IP address is being read, not the computer's settings.

Do I add an Include to every single page that I want to localize?

If you want to include custom JavaScript on every page, the best thing to do is put it directly into a custom theme. This would probably be best expressed as a Script element.

But for a first version, would you be fine if we only properly supported the "everything is localized" approach?

For now, I think switching languages to an array then assuming everything is localized is a good step forward – I don't think it would break anything, because we have no localization at all right now and I suspect almost everyone left the setting as the default.

So, if someone changes languages to be [.en, .de] for example, we would look for the first of those in the existing directory structures, and the second by prefixing "de" to all the URLs. So, for English it would be Content/article/swift-against-humanity.md, but for German it would be Content/de/article/swift-against-humanity.md

How does that sound?
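The prefixing rule described above can be sketched in a few lines. The `Language` enum here is a local stand-in with only a handful of cases, and `localizedPath` is a hypothetical helper, not an existing Ignite function; it assumes the first entry of the array is the base language and lives at the unprefixed path.

```swift
// Stand-in for the site's language type; raw values double as path prefixes.
enum Language: String {
    case en, de, es, fr
}

// Returns `path` unchanged for the base (first) language,
// and prefixed with the language code for every other language.
func localizedPath(_ path: String, for language: Language, in languages: [Language]) -> String {
    guard let base = languages.first, language != base else { return path }
    return "/" + language.rawValue + path
}

let languages: [Language] = [.en, .de]

// Source files under the Content directory.
print("Content" + localizedPath("/article/swift-against-humanity.md", for: .en, in: languages))
print("Content" + localizedPath("/article/swift-against-humanity.md", for: .de, in: languages))

// The same rule applied to output URLs.
print(localizedPath("/grid-examples/", for: .de, in: languages))
```

Keeping the base language unprefixed means existing single-language sites keep their current URLs and directory layout.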

Like I said, the progress percentage indicator in String Catalogs helps detect which languages are missing.

Again, we need to be very careful here to ensure we don't affect building on Linux and other platforms, so whatever we use to handle localization should be able to work on every platform. Apple renewed their cross-platform push in this year's WWDC SOTU, and when dealing with websites Linux is significantly more commonplace; if we don't strive for cross-platform compatibility I think it will come back to bite us.

Jeehut commented 2 weeks ago

So, if someone changes languages to be [.en, .de] for example, we would look for the first of those in the existing directory structures, and the second by prefixing "de" to all the URLs. So, for English it would be Content/article/swift-against-humanity.md, but for German it would be Content/de/article/swift-against-humanity.md

How does that sound?

Sounds good to me. 👍 I'll give it a spin later this summer when I work on my app websites.

Again, we need to be very careful here to ensure we don't affect building on Linux and other platforms, so whatever we use to handle localization should be able to work on every platform. Apple renewed their cross-platform push in this year's WWDC SOTU, and when dealing with websites Linux is significantly more commonplace; if we don't strive for cross-platform compatibility I think it will come back to bite us.

If only there were a SwiftUI-like UI framework for Linux, I would actually build a String Catalog Editor for Linux platforms. I mean, I already have it all in my app and it's also a 100% free feature, so there's no reason I couldn't create an open-source editor for Linux. If you're thinking about this long term, then I believe a native Linux app written in Swift is the best option. I'm happy to provide a first version once we have a Swift-based UI framework for Linux (much like Ignite is a Swift-based UI framework for the web).

That said, "dealing with websites Linux is significantly more commonplace" reminds me of where those sites are usually hosted or CI-built, so I'd like to reiterate that String Catalogs are fully compatible with Linux. It's all just JSON after all, and it's quite human-readable JSON, too. And JSON is editable on Linux without problems. So if I provided a Swift plugin that detected things like String(localized:) or LocalizedStringKey and auto-extracted them into a String Catalog file, it would even work if people want to develop on Linux. I'd be willing to do this work long-term if there is enough interest. I've done something similar with BartyCrouch after all, so I have some experience there.

twostraws commented 2 weeks ago

It seems there are several core tasks here:

  1. Support decoding of strings catalog JSON.
  2. Update all instances of text (including images?) to support internationalization, similar to SwiftUI.
  3. Switch language over to languages, defaulting to [.en]
  4. Add a Text(verbatim:) initializer that skips internationalization.

Once that's done, we could move on to step 5: update content processing to support localized versions. I've separated that off because it's possible to create Ignite sites entirely without Markdown – just a bunch of static pages, which would require only the first four steps above.
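Step 3 could be approached without breaking existing sites by keeping the old property and deriving the new one from it. The sketch below is one possible shape, not Ignite's actual Site protocol: `SiteConfiguration`, `Language`, and the two example sites are hypothetical stand-ins.

```swift
// Stand-in language type with a few sample cases.
enum Language: String {
    case en, de, fr
}

// Hypothetical configuration protocol with both the old and the new property.
protocol SiteConfiguration {
    var language: Language { get }
    var languages: [Language] { get }
}

extension SiteConfiguration {
    // Existing default: sites that specify nothing stay English.
    var language: Language { .en }
    // New property falls back to the single legacy language, so the default is [.en].
    var languages: [Language] { [language] }
}

// An existing site that only sets the old property keeps working unchanged.
struct LegacySite: SiteConfiguration {
    var language: Language { .de }
}

// A new site opts into multiple languages; the first one is the base language.
struct MultiLanguageSite: SiteConfiguration {
    var languages: [Language] { [.en, .de] }
}

struct DefaultSite: SiteConfiguration {}

print(LegacySite().languages)
print(MultiLanguageSite().languages)
print(DefaultSite().languages)
```

The build step would then iterate over `languages` only, treating the first element as the base language.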

Jeehut commented 2 weeks ago

@twostraws Did you outline those steps for clarity or because you prefer a separate PR for each? I prefer feature-complete PRs that I keep small by limiting the scope. But all 4 steps mentioned would be one PR, I think.

I would rephrase the first step to "Support encoding/decoding a subset of the String Catalog JSON format". We don't need to support device-specific variants, I think. (That would require dynamic text loading based on device type.) Likewise, I would also drop pluralization support for version 1. Could be added later if needed.

I'll think about the best way to change language to languages so nothing breaks for existing users. And as a side note, while Text(verbatim:) is available to specify in SwiftUI that a Text view should not be localized, most other views don't have a verbatim initializer overload. One could of course use the label overloads and provide Text inside, but I found that simply passing an explicit String("My unlocalized text") is much simpler to write and easy enough to understand. Because of the explicit type, the unlocalized String APIs are then used and the text doesn't get localized. One could even introduce a typealias named something like Unlocalized for String to make it even more explicit. I just wanted to let you know that this is a trick I'm using all the time and teaching others, so the Ignite views should support that as well.

Currently I'm still focusing on some post-WWDC work, so I won't be able to immediately start on the localization work here. But I'll probably start sometime in July. I'll let you know of any progress. I'd probably create a WIP PR early, so you can correct me if I go in the wrong direction.

twostraws commented 2 weeks ago

One huge PR would be unwelcome. I'll start on this myself.

Jeehut commented 2 weeks ago

Oh, okay, that's awesome, thank you! 👍