Internationalization - Githubissues

UltraPhil commented 10 years ago

For people living in countries with multiple official languages (e.g. Canada has English and French), what is the recommendation?

It would be nice to be able to repeat sections with an identifier, e.g. FR_CA, EN_CA, etc., so that I could generate PDFs in multiple languages.

jaranta commented 9 years ago

The schema should support entering data in several localized variants. The parser would then need to be told which one to use. The templates also need to support this, since they need to be localized as well.

It would probably be easiest to support different languages separately in the schema and in the templates, which could have different language versions.

Maybe something along the lines of:

"summary": {
    "en": "string",
    "de": "string"
}

(or something like that, JSON is not my strong suit)

UltraPhil commented 9 years ago

Resume objects could be put inside a language (specifying language would be mandatory), like this:

{
    "EN_CA": {
        "bio": {},
        "work": [],
        ...
    },
    "FR_CA": {
        "bio": {},
        "work": [],
        ...
    }
}

Or, we could also simply repeat information in blocks. If no language is specified, EN_US is presumed. This might be easier to translate and maintain, as all the languages sit together in the same sections. So if your resume gets longer, you don't have to scroll a lot to see what is still missing a translation. But it would be harder to parse.

{
    "bio": {
        "EN_CA": {}, 
        "FR_CA": {}
    },
    "work": {
        "EN_CA": [], 
        "FR_CA": []
    },
    ...
}
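
To make the "harder to parse" trade-off concrete, here is a minimal sketch of how a tool might pull one locale out of the per-section layout. The function name extractLocale and the rule that an untagged section is passed through unchanged are assumptions for illustration, not part of any proposal in this thread.

```javascript
// Hypothetical helper: extract a single-language resume from the
// per-section layout. DEFAULT_LOCALE mirrors the "EN_US is presumed" rule.
const DEFAULT_LOCALE = "EN_US";

function extractLocale(resume, locale) {
  const result = {};
  for (const [section, value] of Object.entries(resume)) {
    const isObject =
      value !== null && typeof value === "object" && !Array.isArray(value);
    if (isObject && locale in value) {
      result[section] = value[locale]; // exact locale available
    } else if (isObject && DEFAULT_LOCALE in value) {
      result[section] = value[DEFAULT_LOCALE]; // fall back to the default locale
    } else {
      result[section] = value; // section is not locale-tagged: keep as is
    }
  }
  return result;
}
```

The ambiguity is visible in the sketch: the tool has to guess whether an object is a locale map or ordinary section data, which is the special casing the single-wrapper layout avoids.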

DonDebonair commented 9 years ago

I'm in favour of assuming a default locale and not requiring locale designations if there's only one locale.

bollwyvl commented 9 years ago

One of the technologies mentioned in #42, W3C JSON-LD, has a lovely, built-in capability for internationalization that maps to a strong underlying model:

http://www.w3.org/TR/json-ld/#string-internationalization

which uses the IETF tags

http://tools.ietf.org/html/bcp47

It already has open source implementations in many languages (machine, not natural!) with a large conformance suite.

The syntax provides a few patterns from which to choose, and looks like most of the means of handling that have been mentioned above...

- A document, or object within a document, can implicitly or explicitly carry a default language
- Any named string can be given an explicit language (or null, in the case of proper nouns)
- Special terms can be created that carry specific languages
- Any value can be turned into a "language map"

Here's an example of the first few:

{
  "@context": {
    ...
    "ex": "http://example.com/vocab/",
    "@language": "ja",
    "name": { "@id": "ex:name", "@language": null },
    "occupation": { "@id": "ex:occupation" },
    "occupation_en": { "@id": "ex:occupation", "@language": "en" },
    "occupation_cs": { "@id": "ex:occupation", "@language": "cs" }
  },
  "name": "Yagyū Muneyoshi",
  "occupation": "忍者",
  "occupation_en": "Ninja",
  "occupation_cs": "Nindža",
  ...
}

Here's a language map:

{
  "@context":
  {
    ...
    "occupation": { "@id": "ex:occupation", "@container": "@language" }
  },
  "name": "Yagyū Muneyoshi",
  "occupation":
  {
    "ja": "忍者",
    "en": "Ninja",
    "cs": "Nindža"
  }
  ...
}

lfilho commented 9 years ago

+1 for internationalization...

And I think something like

{
    "bio": {
        "EN_CA": {}, 
        "FR_CA": {}
    },
    "work": {
        "EN_CA": [], 
        "FR_CA": []
    },
    ...
}

is easier to write and maintain (from a CV writer perspective, not a jsonresume programmer's)

mar10 commented 9 years ago

+1 One possible approach could be to optionally accept objects for strings.

We could use http://tools.ietf.org/html/rfc5646 language tags. This would allow delivering resumes based on the preferred-language list of the user's browser.

- 'en-US' could be the default
- Support subtags with fallback:
  - when 'summary for en-GB' (British English) is requested, this translation is used
  - when 'summary for en-CA' (Canadian English) is requested, 'en' is returned
  - when 'summary for de' (German) is requested, 'de' is returned
  - when 'summary for de-AT' (Austrian German) is requested, 'de' is returned
  - when 'summary for fr' (French) is requested, 'en' is returned as the default
- '' could be an alias for 'en'

{
  "basics": {
    "name": "Martin Wendt",
    "label": {"en": "Programmer", "de": "Software Entwickler"},
    "phone": "(912) 555-4321",
    "summary": {
      "en": "(default text)",
      "en-GB": "(british english)",
      "de": "(german translation)"
    }
  }
}
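
The fallback rules listed above can be sketched roughly as follows. The function name pickTranslation and its defaultLang parameter are made up for this illustration, and the tag comparison is case-sensitive, which is a simplification (language tags are case-insensitive).

```javascript
// Hypothetical lookup: accept either a plain string or a {tag: text} object,
// and strip subtags from the right until a translation is found.
function pickTranslation(value, requested, defaultLang = "en") {
  if (typeof value === "string") return value; // untranslated strings stay valid
  const has = (tag) => Object.prototype.hasOwnProperty.call(value, tag);
  let tag = requested;
  while (tag) {
    if (has(tag)) return value[tag];
    const cut = tag.lastIndexOf("-");
    tag = cut === -1 ? "" : tag.slice(0, cut); // "de-AT" -> "de" -> ""
  }
  return value[defaultLang]; // nothing matched: use the default language
}

const summary = {
  en: "(default text)",
  "en-GB": "(british english)",
  de: "(german translation)",
};
console.log(pickTranslation(summary, "de-AT")); // "(german translation)"
console.log(pickTranslation(summary, "fr")); // "(default text)"
```

Accepting plain strings keeps every existing single-language resume valid, which matches the "optionally accept objects for strings" idea.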

bollwyvl commented 9 years ago

Right, that's precisely what JSON-LD provides, though without as much magic.

So for your example, you would either specify in the root @context that summary is a @container of type @language, or add a @context to the summary itself. Both are a bit more explicit than novel application-level assumptions, which require special casing all the time.

{
  "@context": {
    "label": {"@container": "@language"},
    "summary": {"@container": "@language"}
  },
  "basics": {
    "name": "Martin Wendt",
    "label": {"en": "Programmer", "de": "Software Entwickler"},
    "phone": "(912) 555-4321",
    "summary": {
      "en": "(default text)",
      "en-GB": "(british english)",
      "de": "(german translation)"
    }
  }
}

As for client-facing translation: actually, more reasonable than most, as you'd have so much more context about what you are translating than just a big document.

As a bonus, the data-at-rest could also be made international, as the context can also specify mappings of anticipated terms to the canonical namespace:

{
  "@context": {
    "jro": "http://jsonresume.org/context/",
    "zusammenfassung": "jro:summary"
  },
  ...
  "zusammenfassung": ..
}

DonDebonair commented 9 years ago

I have a feeling that this feature will only be needed in specific countries and specific industries. I live in The Netherlands and work in IT, and here pretty much everyone agreed on using English as language for resumes. We're living in an increasingly globalised world, where it makes sense to have one standard language in which to pass your resume around.

The way I see it, this feature is not needed: if you are aiming to apply for jobs abroad/internationally oriented jobs, which require an English resume, you make your resume English. If you only want to work in Germany, you make it German. If you want to apply for jobs both in Germany and abroad (international jobs), and you really feel it is needed to have a resume in both English and German, you can just create 2 resume specs instead of one. Or you could ask yourself if you really want to work in a place where an English resume is not accepted ;)

Either way, I don't see the strict need for this!

osg commented 9 years ago

+1 for internationalization (for ease of maintenance from the perspective of the CV/resume writer) +1 for assuming one locale if there is only one

ghost commented 8 years ago

^ditto

aloisdg commented 8 years ago

Great idea!

stp-ip commented 8 years ago

The question is: do we make localization (in the form of multiple languages etc.) a first-class citizen within the schema, and therefore use a solution such as JSON-LD as suggested by @bollwyvl, or could we use something like #203 to provide this?

JSON-LD:

- would be done on the schema side
- complicates the schema
- only helps with the language differences, but not with legal/cultural differences

Tagging + duplication:

- data duplication
- cleaner schema, more complex resume data
- provides a way to not only export the right localized section, but also take into account legal differences such as personal information details

To make the standard usable worldwide, we need tagging + exporting based on tags anyway. This would also make it possible to localize specific sections via duplication and tagging without making the schema itself more complex.

Would love some other arguments for or against the solutions.

aloisdg commented 8 years ago

I agree with both, but I think tagging + duplication is far simpler.

Also, a tool can be used to export it from JSON-LD, if someone wants to use it.

I am for simplicity here, but the exporter/generator could live in the org as an external project.

stp-ip commented 8 years ago

OK, one issue that would not be solved with #203 is sections that can only be added once, such as basics. We have to think about a solution to this before deciding, I reckon.

ghost commented 8 years ago

On second thought, I don't ditto internationalisation as part of the schema. I hadn't realised that even the most basic information like name would need to be an object if the user needs to transliterate alphabets. It'll nest the file to infinity, making it hard to work with. The tags solution mentioned by @stp-ip is much better.

thomasdavis commented 8 years ago

So for i18n I was imagining that we spin up a new repo to eventually be a greater part of theme utils.

The schema itself would never know of i18n, other than vendors such as our registry injecting into the metadata that the resume is in a particular language.

It would look as such:

// ./theme-utils-i18n
/locales
  en.json
  fr.json
  de.json

Each of those files would look something like this:

// ./locales/fr.json
{
  "basics": "bases",
  "basics.name": "prénom",
  "basics.label": "étiquette",
  "basics.location.city": "ville",
  ...
}

Such that each path to a schema property e.g. basics.name would have a translation.

This won't catch all edge cases and I would expect these files to be extended further.

But I think this approach will be easy for us to organize, easy for theme developers to include and offer great flexibility when rendering.

For example, on the registry server we could easily add a query parameter ?lang=fr, which would then automatically tell themes that include the package to replace the labels.

Theme developers would use it as such:

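import fs from 'fs';
// Proposed package; it does not exist yet.
import {label} from 'jsonresume-theme-utils-i18n';

// `req` is the incoming HTTP request on the registry server.
function render(req) {
  // Read and parse the file; JSON.parse('resume.json') would try to parse the file name itself.
  var resume = JSON.parse(fs.readFileSync('resume.json', 'utf8'));
  var lang = req.query.lang;
  return '<p>' + label(lang, 'basics.location.city') + ': ' + resume.basics.location.city + '</p>';
}
// e.g. <p>ville: Paris</p>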
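
For readers wondering what the label helper itself might do, here is one possible shape. It is only a sketch: the package has not been written, the in-memory locales table stands in for the locales/*.json files, and falling back to English and then to the raw path is an assumption.

```javascript
// Hypothetical stand-in for the contents of locales/en.json and locales/fr.json.
const locales = {
  en: { "basics.location.city": "city" },
  fr: { "basics.location.city": "ville" },
};

// Look up the label for a schema path, falling back to another language
// and finally to the path itself so that a theme never renders "undefined".
function label(lang, path, fallbackLang = "en") {
  const table = locales[lang] || {};
  if (path in table) return table[path];
  const fallback = locales[fallbackLang] || {};
  return path in fallback ? fallback[path] : path;
}

console.log(label("fr", "basics.location.city")); // "ville"
```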

thomasdavis commented 8 years ago

My suggestion is more related to labels, writing up my thoughts on data now.

thomasdavis commented 8 years ago

So if a user wanted to have multiple translations of their resume that they had written themselves I would rather us just rely on services to handle the conversion instead of relying on a schema based approach.

e.g. I could edit the registry server right now to support it.

The publish command at the moment takes a payload as such {resume: {}}, once received it is saved, and then posted to theme server upon request to generate the html.

The publish endpoint could be edited such that it also accepts an array of resumes such as {resumes: [{}, {}, {}]}.

The user could write it in one big file such that the first resume in the list is the master and it fails safe such that if a property is missing in the third index, it would fall back to the second and finally to the first.

[
  {
    meta: {
      lang: 'en' // This is a service-specific attribute, so I am putting it in here
    },
    basics: {
      name: 'Josef',
      label: 'Fireman',
      website: 'http://goofstuff.com',
    },
  },
  {
    meta: {
      lang: 'fr'
    },
    basics: {
      label: 'Pompier',
    },
  },
]

So on the registry server, once I receive such a payload, I would merge it all together based on which language I am targeting.

This is just one example of how you can implement it at the service level.
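
A rough sketch of that merge, assuming the cascading fallback described above (each resume overrides the ones before it, and the first is the master). The names deepMerge and resolveResume are invented for the example, and replacing arrays wholesale is a design assumption rather than something the registry does today.

```javascript
function isPlainObject(value) {
  return value !== null && typeof value === "object" && !Array.isArray(value);
}

// Recursively overlay `override` on `base`; arrays and scalars are replaced.
function deepMerge(base, override) {
  const out = { ...base };
  for (const [key, value] of Object.entries(override)) {
    out[key] =
      isPlainObject(value) && isPlainObject(base[key])
        ? deepMerge(base[key], value)
        : value;
  }
  return out;
}

// Merge every resume from the master up to the one tagged with `lang`.
function resolveResume(resumes, lang) {
  const index = resumes.findIndex((r) => r.meta && r.meta.lang === lang);
  const end = index === -1 ? 0 : index; // unknown language: master only
  return resumes.slice(0, end + 1).reduce(deepMerge, {});
}
```

With the payload above, resolveResume(resumes, 'fr') would keep name and website from the English master and take label from the French entry.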

thomasdavis commented 8 years ago

So I am at the moment

+1 For service only implementation (Only potentially adding an official lang attribute to the new proposed meta attribute)

aloisdg commented 8 years ago

+1 for lang attribute (and for the use of tooling). Also, I can handle the french translation.

stp-ip commented 8 years ago

Thanks @thomasdavis for the additional input. So many ideas now in my head. I think with the proposed meta section we could even sorta support it within the schema. I added my thoughts with some examples within the meta section https://github.com/jsonresume/resume-schema/issues/204#issuecomment-187196874.

I think this way it could be cleaner as we have one basic data source and everything else including localization is overwrite only. Additionally having it within the schema simplifies working with localization, filtering or other meta data for theme devs and tooling providers.

stp-ip commented 8 years ago

That being said, I think that having a lang field within the meta section, to tell themes what the default language is, would be nice.

stp-ip commented 8 years ago

We should revisit this issue, after we agreed on the basic structure of the meta section. When we use a standardized meta subsection, we could add a language method to it. Let's leave this for now.

stp-ip commented 7 years ago

Current proposition:

- add or use meta section
- add subsection localization
- strings for a specific language are looked up in the language-specific meta section and fall back to the base resume

...
meta.localization = [
  {
    "de": [
      {"work['gardner AG'].description": "Deutsche Übersetzung der Beschreibung."},
      {"work['gardner AG'].name": "Gärtner AG"}
    ]
  }
]
...
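
One way tooling could consume this structure, sketched under the assumption that the path strings are treated as opaque lookup keys and that the caller already holds the base value. The helper names buildOverrides and localized are hypothetical.

```javascript
// Flatten the per-language list of {path: text} entries into one lookup table.
function buildOverrides(localization, lang) {
  const table = {};
  for (const block of localization || []) {
    for (const entry of block[lang] || []) {
      Object.assign(table, entry);
    }
  }
  return table;
}

// Return the translated string for `path`, or the base resume value if none exists.
function localized(overrides, path, baseValue) {
  return Object.prototype.hasOwnProperty.call(overrides, path)
    ? overrides[path]
    : baseValue;
}

const localization = [
  { de: [{ "work['gardner AG'].name": "Gärtner AG" }] },
];
const de = buildOverrides(localization, "de");
console.log(localized(de, "work['gardner AG'].name", "gardner AG")); // "Gärtner AG"
console.log(localized(de, "basics.name", "Josef")); // "Josef"
```

Treating paths as plain keys sidesteps parsing the path syntax, at the cost of requiring tools to generate exactly the same path strings.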

aloisdg commented 7 years ago

I find it slightly complicated for the neophyte but doable.

stp-ip commented 7 years ago

Yeah, but it makes for a clean separation between base content and localization. Also, as it resides in meta, it will most likely be translated within a tool, so the assignment will be done by the tool, hiding most of the complex structure.

aloisdg commented 7 years ago

@stp-ip Indeed.

dmkuznetsov commented 7 years ago

For internationalization we can use the approach currently used for developing Google Chrome extensions. Most people don't need internationalization, so they can use the schema "as is". Others may use separate files with translations, as was proposed by @thomasdavis.

Here is an example:

{
  "basics": {
    "name": "__MSG_basics_name",
    "label": "__MSG_basics_label",
    "phone": "(912) 555-4321",
    "summary": "__MSG_basics_summary"
  }
}

and file with translations:

// ./locale/en.json
{
  "__MSG_basics_name": "Martin Wendt",
  "__MSG_basics_label": "Programmer",
  "__MSG_basics_summary": "...."
}

I think this way is simpler to develop with and clearer to understand.
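
A minimal sketch of how such placeholders might be resolved, assuming the placeholder string is also the key in the locale file, as in the example above. The function name resolveMessages is hypothetical, and unknown placeholders are left untouched so that a missing translation stays visible.

```javascript
// Walk the resume and replace every "__MSG_..." string that has a translation.
function resolveMessages(node, messages) {
  if (typeof node === "string") {
    const known =
      node.startsWith("__MSG_") &&
      Object.prototype.hasOwnProperty.call(messages, node);
    return known ? messages[node] : node;
  }
  if (Array.isArray(node)) {
    return node.map((item) => resolveMessages(item, messages));
  }
  if (node !== null && typeof node === "object") {
    return Object.fromEntries(
      Object.entries(node).map(([key, value]) => [
        key,
        resolveMessages(value, messages),
      ])
    );
  }
  return node; // numbers, booleans, null
}
```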

stp-ip commented 7 years ago

Due to the finite nature of resume data size, I think we can use one file as the source of truth and then use tooling to export the various language/job specific resumes. The idea using separate files is good too, I just currently think the meta data proposal makes more sense.

One additional proposal could be to use the meta data section for translations etc., but provide examples on how to split it up into multiple files using the import methods of json schema. That way we have predefined a single file starting point, which can be extended/split up into multiple files.

stp-ip commented 7 years ago

Thinking about it. The idea to make one source of truth for ease of use with the ability to reference sections from other files makes sense not only for translations, but as a general solution.

Experts could separate their projects and work items into separate items etc.

doArcanjo commented 7 years ago

Any help needed? @stp-ip

Eyap53 commented 6 years ago

Hi, how is this project going?

stp-ip commented 6 years ago

We are working hard to revive and get progress out of the door. I would say until the end of the year there will be quite a few milestones. Thanks for your continued interest.

kenberkeley commented 6 years ago

how's it going?

stp-ip commented 6 years ago

Being worked on. Slow but steady progress.

MarcoIeni commented 5 years ago

I wrote this script for myself, and I hope it helps someone while the team is working on this.

wmelon84 commented 5 years ago

Any update on this guys? Thx!

thomasdavis commented 3 years ago

Internationalization should be handled by another package and/or just by copying your resume.json as en.resume.json|es.resume.json. There is no requirement for the schema to support it.

Eyap53 commented 3 years ago

I find it rude to unilaterally close this request after this many people showed interest. But anyway, here is why I disagree with you :

another package [...] as en.resume.json|es.resume.json.

This is exactly the problem. If someone gives me a file named resume.json, I won't know what language it is written in. I'll have to take a look inside and make a guess, and that's only if I am familiar with the alphabet and can recognize some words! If I designed a package supporting i18n, I would need this metadata to render the appropriate template for any given resume.json.

That is why I really like adding this extra-simple attribute :

   meta: {
     lang: 'en'
   },
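
As a rough illustration of how a theme could use such an attribute (meta.lang is the proposal under discussion, not a confirmed schema field; the function name and the supported list are assumptions):

```javascript
// Pick the theme language from the resume itself, with a safe default
// when the attribute is missing or the theme has no such translation.
function themeLanguage(resume, supported, fallback = "en") {
  const declared = resume.meta ? resume.meta.lang : undefined;
  return supported.includes(declared) ? declared : fallback;
}

console.log(themeLanguage({ meta: { lang: "fr" } }, ["en", "fr"])); // "fr"
console.log(themeLanguage({ basics: {} }, ["en", "fr"])); // "en"
```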

thomasdavis commented 3 years ago

I acknowledge my rudeness as I rush through these issues and I also apologize because my explanations are so short/brief.

I will try to give a reasonable explanation of my thinking:

- Why is somebody giving you a resume.json? Are you more than an individual? And if so, can you not just request a language when a user/resume.json is submitted to you? (edit: I get that theme developers have to deal with it, but why can't their theme consumers just send the i18n versions of their resumes?)
- If you create a template that is i18n, how will you do it? An API? Simply specifying the language does not solve the problem of translation. Until there is a good proposal for good translation, I think those who wish to have multiple translations can deal with multiple resume.json files.

Edit: I'm personally just pushing for a v1 schema, I think any frustration is acceptable, and also welcome.

Eyap53 commented 3 years ago

Thanks for your explanation, it's greatly appreciated as it encourages discussions and participation.

- It is true that I could require the user to pass the language to render as an argument. However, I feel that this piece of data belongs inside the resume. Would a user ever want a Spanish resume (say es.resume.json) rendered with the French translation of the theme? That doesn't make sense; they will always want the Spanish version of the theme, and the same holds for every theme they try. Instead of submitting the language each time, it makes more sense to embed it in the resume, as it is immutable.
- Sorry, I'm not sure what you mean. If you mean auto-translation of the content by the template renderer, that's not what I meant. The template renderer should include the content of the resume file as is, and only select the appropriate translation for the theme itself (basically the names of the sections, like "Skills", ...).

w-v commented 1 year ago

made a very hacky solution https://github.com/w-v/jsonresume-multilang

levino commented 1 year ago

I think the cli should at least accept a language parameter in order to tell the theme which language to compile for. The reasoning is that it is easy to have multiple .json files (resume.de.json and resume.en.json), but the theme needs to adjust the headings based on the language. So I think one should pass not only the resume to the JavaScript function generating the HTML, but also some metadata as a second argument, which optionally contains a language string (given to the cli as a param). What do you think?
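
A sketch of what that could look like on the theme side. The second options argument and its language field are the suggestion made here, not something the cli is known to pass today; the HEADINGS table is illustrative, and HTML escaping is omitted for brevity.

```javascript
// Hypothetical per-language section headings shipped with a theme.
const HEADINGS = {
  en: { skills: "Skills" },
  de: { skills: "Kenntnisse" },
};

// Theme entry point taking the resume plus optional metadata.
function render(resume, options = {}) {
  const headings = HEADINGS[options.language] || HEADINGS.en;
  const items = (resume.skills || [])
    .map((skill) => `<li>${skill.name}</li>`)
    .join("");
  return `<h2>${headings.skills}</h2><ul>${items}</ul>`;
}

console.log(render({ skills: [{ name: "Go" }] }, { language: "de" }));
// <h2>Kenntnisse</h2><ul><li>Go</li></ul>
```

Themes that ignore the second argument would keep working unchanged, which is the main appeal of an optional parameter.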

levino commented 1 year ago

So I created a new theme that works for me. It exports a German and an English version. Here is a demo on how to use it. It would be perfect if the cli supported a way to specify the language desired so instead of:

yarn resume export -r resume.de.json -t ./node_modules/jsonresume-theme-stackoverflow-react/dist/de CV.pdf

one could simply do:

yarn resume export -r resume.de.json -t stackoverflow-react -l de CV.pdf

So maybe one should open an issue for resume-cli?

mcarbonneaux commented 2 months ago

Why is somebody giving you a resume.json? Are you more than an individual? And if so, can you not just request a language when a user/resume.json is submitted to you? (edit: I get that theme developers have to deal with it, but why can't their theme consumers just send the i18n versions of their resumes?)

The first usage is to send a registry link such as https://registry.jsonresume.org/mcarbonneaux, never the JSON itself, or a PDF version of the resume (rendered with a CI/CD pipeline).

When you send the registry URL, you need a way to select the language of the resume in order to handle multiple languages. For the moment there is no way to do that with https://registry.jsonresume.org/xxxxx.

It would be useful to add ?lang=fr or https://registry.jsonresume.org/fr.mcarbonneaux to select the fr version of resume.json (by selecting the fr version in the schema in JavaScript or in the rendering engine), or to select by file name, like fr.resume.json or resume.fr.json.
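
Both URL forms could be handled by the same small parser. This is only a sketch of the idea: neither form is supported by the registry today, the function name is invented, and a username that itself starts with two letters and a dot would be misread as a language prefix.

```javascript
// Extract the username and requested language from either
// /username?lang=fr or /fr.username (the query parameter wins).
function parseRegistryUrl(href) {
  const url = new URL(href);
  const slug = url.pathname.replace(/^\/+/, "");
  const match = /^([a-z]{2})\.(.+)$/.exec(slug);
  return {
    username: match ? match[2] : slug,
    lang: url.searchParams.get("lang") || (match ? match[1] : null),
  };
}

console.log(parseRegistryUrl("https://registry.jsonresume.org/fr.mcarbonneaux"));
// { username: 'mcarbonneaux', lang: 'fr' }
```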

The second usage is to add a link to the registry in your personal profile on LinkedIn (LinkedIn is now multilingual), GitHub, or your personal page. In that case you cannot know in advance which language the reader prefers when reading your resume.

So you need a visual selector in your theme to display the resume according to the selected language. That requires either having all the languages in the JSON and rendering in JavaScript in the browser, rather than in the backend as the current registry URL does, or redirecting to another registry URL with a language selector, like https://registry.jsonresume.org/fr.mcarbonneaux?lang=fr, to display the selected language.

If you create a template that is i18n, how will you do it? an api? simply specifying the language does not solve the problem of translation. Until there is a good proposition for a good translation, I think for those who wish to have multiple translations can deal with multiple resume.json's

Automatic translation through an API generally does not produce correct translations, so manual translation must always be available. That means all the translations have to exist in jsonresume format. I think it is simpler to maintain a resume if you have a single jsonresume file containing all languages (it is easier to keep the resume coherent across languages), like https://github.com/jsonresume/resume-schema/issues/35#issuecomment-66907403 :

{
  "basics": {
    "name": "Martin Wendt",
    "label": {"en": "Programmer", "de": "Software Entwickler"},
    "phone": "(912) 555-4321",
    "summary": {
      "en": "(default text)",
      "en-GB": "(british english)",
      "de": "(german translation)"
    }
  }
}

or using a standard like JSON-LD: https://github.com/jsonresume/resume-schema/issues/35#issuecomment-48475661

jsonresume / resume-schema

Internationalization #35