Open schlichtanders opened 3 years ago
@tlienart you seem to have used i18n in your projects. Can you share your current approach?
Hello Stephan, the bad: there's no baked in support nor a plugin that you can directly use.
The good: it's reasonably easy to do this manually, and I think I'd like users to try a few versions until we find one that seems to work well that we can make available to other users via some package or baked in functionality.
Below are my thoughts when considering i18n for another user a while back.
Let's say you want your base language to be EN and have some FR for some pages (potentially all, but in some cases people can only be bothered to translate some). Let's say that your base URL is base (e.g. tlienart.github.io or tlienart.github.io/project). What I'm suggesting is that you'd have:
* base/, base/page1/, base/page2/ ...
* base/fr/, base/fr/page1, base/fr/page2/ ...
So far so good. You could also want base/en/... etc., but that's just a small extension of the below. Adding a flag button in your layout that takes the current URL and links to another page with the relevant /fr/ injected is easy as well.
Let's say now that you have the post page1.md. You can already do the above by having a copy page1_fr.md with a slug that indicates the URL for the page, e.g.:
page1.md
+++
author = "The Oracle"
+++
# The Title
This is a sentence.
page1_fr.md
+++
author = "The Oracle"
slug = "fr/page1"
+++
# Le Titre
Ceci est une phrase.
This is fine if you need to do it just for one or two pages and are happy with them potentially diverging over time (e.g. the landing page). Of course there are two disadvantages: (1) you need to maintain the slug if you change the file name, and (2) the translation is not in the same file as the original, so keeping the two files 1-1 is a bit more annoying.
I'm not familiar with the plugin you mention but my impression is that the result would be similar to the above.
It might be easier to keep the original and the translation(s) in the same file to make maintenance easier. The idea here is then that a post consists of blocks of text and that, each time, you'll provide several versions of the text. This would require a bit more work to fully work but the gist should be clear:
page1.md
+++
author = "The Oracle"
title = (en="The Title", fr="Le Titre")
block1 = (en="""
# This is the first block
With a first sentence.
""", fr = """
# Ceci est le premier bloc
Avec une première phrase
""")
block2 = (en="""
...
""", fr="""
...
""")
# more blocks
all_blocks = [block1, block2, ...]
+++
{{generate all_blocks}}
So you have to maintain a bunch of blocks which are close to one another. Using blocks is convenient for maintenance (smaller bits of text), and the only thing left is to have the function generate assemble the full markdown for each language and generate the relevant page. Writing that function is not hard; in fact it could simply generate the files page1_en.md and page1_fr.md itself, each with the relevant slug. This is probably the easiest way: you'd have the "parent" (page1) which generates the child pages.
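A minimal sketch of what such a generate function could do, written in plain Julia without any Franklin calls. The names assemble_markdown and child_page are hypothetical, the fallback to a default language is an assumption, and how the resulting text gets written to disk or hooked into Franklin is left open.

```julia
# Hypothetical helpers, not part of Franklin.

# Each block is a NamedTuple with one entry per language, as in page1.md above.
block1 = (en = "# This is the first block\nWith a first sentence.",
          fr = "# Ceci est le premier bloc\nAvec une première phrase")
block2 = (en = "Only available in English.",)   # no French translation yet
all_blocks = [block1, block2]

# Join the blocks for `lang`, using `default` when a block has no translation.
function assemble_markdown(blocks, lang::Symbol; default::Symbol = :en)
    parts = [haskey(b, lang) ? b[lang] : b[default] for b in blocks]
    return join(parts, "\n\n")
end

# Text of the child page for `lang`, with a slug as in page1_fr.md above.
function child_page(name, blocks, lang::Symbol; default::Symbol = :en)
    body = assemble_markdown(blocks, lang; default = default)
    return "+++\nslug = \"$lang/$name\"\n+++\n" * body * "\n"
end
```

Calling child_page("page1", all_blocks, :fr) would give the text of the French child page, with the untranslated second block falling back to English.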
I have to go, but let me know what you think and whether you end up trying one of the versions or would like something else.
Thank you a lot for your huge help.
I myself would like the second approach, because everything is more self-contained. You could also easily implement a fallback mechanism in the generate function.
Three questions which pop up:
Switching the language
Adding a flag button in your layout that takes the current URL and links to another page with the relevant /fr/ injected is easy as well.
Do you have a snippet illustrating how this could go?
Header and footer
If I understood Franklin correctly, you would have to add a couple of {{ispage en/}} {{insert head_en}} {{end}} in addition to creating the individual language versions.
Interaction with other Franklin.jl features
As this approach kind of uses Julia for everything, the doubt comes up whether all other Franklin features would still work.
I came up with another idea as an evolution of your second suggestion with named tuples (en = "Title", fr = "Le Titre").
It would be nice if Franklin.jl had an option for automatically splitting files with such i18n named tuples out into multiple files. We can easily make the interface stable by providing our own wrapper i18n(en = "Title", fr = "Le Titre").
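A rough sketch of what such a wrapper could look like in plain Julia. Franklin does not define i18n, I18n or resolve; all three names are hypothetical and only illustrate the idea of a stable interface over a named tuple, with plain values passing through untouched.

```julia
# Hypothetical wrapper; Franklin does not define `i18n`, `I18n` or `resolve`.
struct I18n{T<:NamedTuple}
    translations::T
end

# Keyword arguments become the per-language entries.
i18n(; kwargs...) = I18n((; kwargs...))

# Pick the entry for `active`, else the one for `default`.
function resolve(v::I18n, active::Symbol, default::Symbol)
    t = v.translations
    return haskey(t, active) ? t[active] : t[default]
end

# Plain (untranslated) values are returned as they are.
resolve(v, active::Symbol, default::Symbol) = v

title = i18n(en = "Title", fr = "Le Titre")
```

With this, resolve(title, :fr, :en) picks the French entry, while a language without a translation falls back to the default.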
This could go as follows (assignments are just examples):
* i18n_enable = true is set
* i18n_languages = ["en", "fr", "de"]
* i18n_default = "en"
* for each page, let's call it page_active, and for each language in i18n_languages, let's call it i18n_active, do
  * for each variable, take the value for i18n_active, falling back to i18n_default, or if no i18n(en=...) is given but a plain value, take the plain value
  * create a new page pages/{{i18n_active}}/{{page_active}} with the respective variables
  * point links to other pages at pages/{{i18n_active}}/other_page
* i18n_default gets visited by default
The only thing which would be needed in addition is the little helper snippets about how to switch between the languages.
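The loop described above can be sketched in plain Julia as follows. This is only an illustration under assumptions: LANGS, DEFAULT, pick and expand are hypothetical names, translatable values are modelled as named tuples, and nothing here touches Franklin's actual processing.

```julia
# Hypothetical sketch; none of these names exist in Franklin.
const LANGS = [:en, :fr, :de]   # plays the role of i18n_languages
const DEFAULT = :en             # plays the role of i18n_default

# A translatable value is modelled as a NamedTuple; anything else is plain.
pick(v::NamedTuple, lang) = haskey(v, lang) ? v[lang] : v[DEFAULT]
pick(v, lang) = v

# For every page and language, compute the target path and the resolved variables.
function expand(pages::Dict{String,<:Dict})
    out = Dict{String,Dict{Symbol,Any}}()
    for (page_active, vars) in pages, i18n_active in LANGS
        resolved = Dict{Symbol,Any}(k => pick(v, i18n_active) for (k, v) in vars)
        out["pages/$i18n_active/$page_active"] = resolved
    end
    return out
end

pages = Dict("page1" => Dict{Symbol,Any}(
    :title  => (en = "The Title", fr = "Le Titre"),
    :author => "The Oracle"))
site = expand(pages)
```

Here site has one entry per language for page1; the German entry falls back to the English title because no German translation was given.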
One particularly nice thing about this approach is that you can easily extend existing Franklin sites by just changing a couple of parameters to i18n(en = ...) parameters, plus setting the three new configs i18n_enable, i18n_languages, i18n_default. All the rest should then work out of the box.
As of now, I would actually prefer adding such functionality to Franklin.jl instead of writing a generate function. It sounds like about the same amount of work, but the proposed solution is more intuitive and should easily interact with other extensions as well.
What do you think?
sorry if I missed something in your answer but where do the translations live? do you have separate files with translations?
yes, I thought about that:
create a new page pages/{{i18n_active}}/{{page_active}} with the respective subset of variables
so, like you suggested for generate, the preparser would just create all the separate translation files.
Of course, alternatively you don't have to create the intermediate translation markdown files, but could directly create the translation html files under __site/.
That would just be an alternative implementation strategy with the same effect, but might be simpler to implement.
could you write what a "base markdown page" would look like in your original idea? specifically:
just so I understand where you put the text in language_A, language_B, ...
sure.
given i18n_enable = true and i18n_languages = ["en", "de"], a base markdown page, say pages/simplepage.md
+++
Title = i18n(en="Title", de="Titel")
Body = i18n(en="My Text", de="Mein Text")
Constant = 42
+++
{{ include default_body }}
would get translated to pages/en/simplepage.md
+++
Title = "Title"
Body = "My Text"
Constant = 42
+++
{{ include default_body }}
and pages/de/simplepage.md
+++
Title = "Titel"
Body = "Mein Text"
Constant = 42
+++
{{ include default_body }}
As an additional feature you could combine this approach with the possibility of providing translation markdown pages directly, like pages/en/simplepage2.md.
The automatic splitting would then only apply to pages/* where * does not start with one of the languages listed in i18n_languages.
Of course, this can stay as a task for later. Just wanted to point out it would be integrable.
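The check on whether a page already sits in a language folder could look roughly like this. It is a sketch only: is_manual_translation is a hypothetical name, and it assumes all pages live under a pages/ top folder.

```julia
# Hypothetical check, not a Franklin function.
# Returns true when `rpath` looks like pages/<language>/..., i.e. a translation
# that was written by hand and should not be split again.
function is_manual_translation(rpath::AbstractString, languages)
    parts = split(rpath, '/'; keepempty = false)
    # expect "pages/<first>/..." and compare <first> with the configured languages
    return length(parts) >= 2 && parts[1] == "pages" && parts[2] in languages
end
```

For example, pages/en/simplepage2.md would be treated as a hand-written translation, while pages/simplepage.md would go through the automatic splitting.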
In total the functionality would be impressively similar to polyglot, even a bit more intuitive thanks to i18n(en=...), and the amount of work is not huge.
Easy to understand and flexible, such i18n support would fit quite well with Franklin, it seems to me.
Thanks for your input, I think this all seems reasonable, some notes that we don't necessarily need to solve right now:
+++
txt = """
foo
const x = 5
bar
"""
+++
should properly highlight the code in e.g. Atom or VSCode.
All in all I think what you suggested is very close to the generate story, except the generate would be on Franklin rather than on the user, plus some good ideas around the config stuff.
I'll need to think a bit about this, can't promise delivery for it as I'm working on other stuff for Franklin which have higher priority (this is also why I tried to give you paths for how you could do stuff right now so that you're not blocked) but otherwise I think there are good ideas here and it would be a nice addition.
Just understood that Franklin does not necessarily put everything under a pages/ top folder. Hence using a suffix like ..._en.md, as you originally suggested, seems more appropriate.
Could you share a snippet about how to switch the language?
Adding a flag button in your layout that takes the current URL and links to another page with the relevant /fr/ injected is easy as well.
Let's say you have a topnav and that it's described in _layout/head.html; somewhere appropriate you'd put something like:
{{flags}}
with the understanding that this will inject something like
<a href="/en/page/"><img src="flag_en.png" /></a>
<a href="/de/page/"><img src="flag_de.png" /></a>
if there are several translations of that page available. Here's a sketch of a function that would do some of this:
function hfun_flags()
    rpath = locvar(:fd_rpath) # something like /path/to/page
    # some logic here to check whether there's, at an appropriate location, some translated version
    # corresponding to rpath (this will be different based on what you do)
    # ...
    has_translation = true
    if !has_translation
        return ""
    end
    # assumption: the translations are served under /en/... and /de/...
    page = strip(first(splitext(rpath)), '/')
    en_url = "/en/$page/"
    de_url = "/de/$page/"
    s = """
        <a href="$en_url"><img src="flag_en.png" /></a>
        <a href="$de_url"><img src="flag_de.png" /></a>
        """
    return s
end
I hope that gives you the idea
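For the part that injects the language into the current URL, a small helper along these lines might do. It is a sketch in plain Julia: switch_language is a hypothetical name, and it assumes the site is served from the root, so a base prefix such as /project/ would need extra handling.

```julia
# Hypothetical helper: swap or inject the language prefix of a URL path.
function switch_language(path::AbstractString, lang::AbstractString, languages)
    parts = split(path, '/'; keepempty = false)
    # drop an existing language prefix such as "fr" in /fr/page1/
    if !isempty(parts) && first(parts) in languages
        parts = parts[2:end]
    end
    rest = join(parts, "/")
    return isempty(rest) ? "/$lang/" : "/$lang/$rest/"
end
```

The URLs used for the flag links could then be computed as switch_language(current_path, "en", languages) and switch_language(current_path, "de", languages), where current_path is whatever path the page is served under.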
yes, that helped, thank you very much
I was motivated to go through the source code and look for the easiest way to add this. Please point out if I got something wrong.
I assume that config.md is parsed first, so that as soon as a normal page gets processed, we already know the global config.
1. At the central method process_file_err, where the final out path is computed, we would need to add the following:
   1. loop over i18n_languages, using a fallback value, say i18n_languages = [nothing], in case i18n_languages is not defined
   2. set i18n_active, which specifies the current element of i18n_languages
   3. if i18n_active !== nothing, change the output path outp by prepending i18n_active
2. At the method set_vars!, add the following:
   1. if the value is an i18n, use value = value[something(i18n_active, i18n_default)] to grab the currently active language, or if that value does not exist, grab the default language (the default language could be configured or be the first language used in the i18n)
3. At the method link_fixer, add the following:
   1. if i18n_active is not the fallback, add it as a prefix to all the paths in the html
That is it. Very concise, easy to implement, and with full compatibility with everything else so far as I can see. @tlienart Can you check whether you spot any missing pieces or mistakes?
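To make the output-path and link steps concrete, here is a sketch in plain Julia. It does not touch Franklin's process_file_err or link_fixer; localized_outpath and prefix_links are hypothetical stand-ins, and the regex only covers root-relative href and src attributes.

```julia
# Hypothetical stand-ins for the changes described above; not Franklin code.

# Prepend the active language to the output path, unless there is none.
localized_outpath(outp, i18n_active) =
    i18n_active === nothing ? outp : "$(i18n_active)/$(outp)"

# Prefix root-relative href/src attributes with the active language.
# Protocol-relative URLs (starting with //) are left alone.
function prefix_links(html::AbstractString, i18n_active)
    i18n_active === nothing && return html
    return replace(html, r"(?:href|src)=\"/(?!/)" => m -> string(m, i18n_active, "/"))
end
```

With i18n_active = :fr, an output path A/index.html becomes fr/A/index.html, and a link to /page/ becomes /fr/page/.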
Slight update:
I think it would make even more sense if the loop happened before config.md is loaded, so that also in the global config something like myvariable = i18n(en="english", fr="française") would behave just as if you had written myvariable = "english" for the en case and myvariable = "française" for the fr case.
So the loop would be here, right before the global vars get defined.
The information i18n_active could be stored as a global variable of the specific loop run.
This would imply that fd_loop also somehow needs to loop through the i18n_languages and construct an i18n-specific global dict in case only specific watched files changed.
Thanks for the suggestions; I won't work on this for 0.10.* but might be interested in picking this up for the next release. One question that is still unclear to me is how you organise files, i.e. where do the translations actually live. Say you have a page A.md with some English text on it, do you keep all translations inside it? next to it? what if sometimes there's a translation and sometimes there isn't?
edit: it should be said that I do not like the way polyglot does it because it seems hard to maintain.
edit: it should be said that I do not like the way polyglot does it because it seems hard to maintain.
I hope I could show you that the changes look indeed very simple to maintain. The idea of being able to use something like myvariable = i18n(en=..., fr=...) was also your favourite way to go forward, because then the page structure is clearly shared among translations.
where do the translations actually live.
The translations would normally live in the __site folder. The key step for this is Part 1, step 3 above:
if i18n_active !== nothing, change the output path outp by prepending i18n_active
So really just the prefix.
Say you have a page A.md with some English text on it, do you keep all translations inside it? next to it? what if sometimes there's a translation and sometimes there isn't?
The English version would go to en/A/index.html, the French version to fr/A/index.html.
I won't work on this for 0.10.*
I am quite interested in this feature and would be motivated to try implementing it myself. In another issue you wrote that you are currently refactoring quite a lot of Franklin. Of course I wouldn't like to implement a feature only for it to become stale because of orthogonal refactoring.
Could you give me a ping if you think it is kind of safe to implement such i18n?
Another future direction would be to build a plugin system which is powerful enough so that such i18n could be implemented within it. But I guess this would be way more challenging and hence better something for later.
I hope I could show you that the changes look indeed very simple to maintain.
I'm not talking about the code changes in Franklin; I'm talking about how easy it is for a user to maintain translations (i.e. how cumbersome it is to use). This is related with my question of where do the translations live:
I still don't understand where the original files live (let's not worry about paths etc, this is all trivial). If you have a post with I am Winnie the Pooh in A.md and you want to also have that post in German, where do you put Ich bin Winnie Puuh? in A.md? in A_de.md?
feel free to try a PR around this of course; I will gladly help review it; & I don't think the refactoring will be orthogonal to what I believe you're suggesting.
if you have a post with I am Winnie the Pooh in A.md and you want to also have that post in German, where do you put Ich bin Winnie Puuh? in A.md? in A_de.md?
The way suggested here would be to have only one A.md which has as its content
+++
variable = i18n(en="I am Winnie the Pooh", de="Ich bin Winnie Puuh")
+++
...
You should also be able to turn off the i18n processing somehow and write the two files yourself, like {{franklin project root}}/de/A.md and {{franklin project root}}/en/A.md or something similar, but this is not the main target of this pull request, because you can do that already now.
Hi all,
I am facing difficulties applying internationalization / localization within Franklin.jl. In Jekyll there is a lovely plugin called polyglot. Would it be possible to support something similar for Franklin.jl?
thanks a lot