Open UltraPhil opened 10 years ago
The schema should support inputting data in differently localized ways. The parser would then need to be told which one to use. This also needs to be supported by the templates, since those need to be localized too.
It would probably be easiest to support different languages separately in the schema and in the templates, which could have different language versions.
Maybe something along the lines of:
summary:
{en: string, }
{de: string, }
(or something like that, JSON is not my strong suit)
Resume objects could be put inside a language (specifying language would be mandatory), like this:
{
"EN_CA": {
"bio": {},
"work": [],
...
},
"FR_CA": {
"bio": {},
"work": [],
...
}
}
Or, we could also simply repeat information in blocks. If no language is specified, EN_US is presumed. This might be easier to translate and maintain in different languages as all the languages are in the same sections. So if your resume gets longer, you don't have to scroll a lot to see what's missing translation. But it would be harder to parse.
{
"bio": {
"EN_CA": {},
"FR_CA": {}
},
"work": {
"EN_CA": [],
"FR_CA": []
},
...
}
I'm in favour of assuming a default locale and not requiring locale designations if there's only one locale.
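A minimal sketch of how the per-section layout could be read with a default locale, assuming a section is either plain data (single-locale resume) or an object whose keys are all locale tags. The helper names isLocaleMap and pickLocale, the tag pattern, and the sample field names are assumptions for illustration, not part of the schema or of any jsonresume package.

```javascript
// Hypothetical helpers: not part of any jsonresume package.
const LOCALE_KEY = /^[A-Z]{2}_[A-Z]{2}$/; // matches tags such as EN_CA, FR_CA

// A section counts as "localized" only if every key looks like a locale tag.
function isLocaleMap(value) {
  if (value === null || typeof value !== 'object' || Array.isArray(value)) return false;
  const keys = Object.keys(value);
  return keys.length > 0 && keys.every((key) => LOCALE_KEY.test(key));
}

// Returns a single-locale resume. Sections without locale keys are kept as-is,
// so a resume written in only one language needs no locale designations.
function pickLocale(resume, locale, fallback = 'EN_US') {
  const result = {};
  for (const [section, value] of Object.entries(resume)) {
    if (!isLocaleMap(value)) {
      result[section] = value;
    } else if (locale in value) {
      result[section] = value[locale];
    } else if (fallback in value) {
      result[section] = value[fallback];
    }
    // A section translated into neither locale is omitted.
  }
  return result;
}

const resume = {
  bio: { EN_CA: { title: 'Developer' }, FR_CA: { title: 'Développeur' } },
  work: { EN_CA: [{ company: 'Acme' }] }, // not translated to FR_CA yet
  phone: '(912) 555-4321',
};

console.log(pickLocale(resume, 'FR_CA', 'EN_CA'));
```

Here the untranslated work section falls back to EN_CA, which also makes it easy to spot what is still missing a translation.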
One of the technologies mentioned in #42, W3C JSON-LD, has a lovely, built-in capability for internationalization that maps to a strong underlying model, which uses the IETF language tags.
It already has open source implementations in many languages (machine, not natural!) with a large conformance suite.
The syntax provides a few patterns from which to choose, and looks like it covers most of the means of handling that have been mentioned above (including a null language, in the case of proper nouns).
Here's an example of the first few:
{
"@context": {
...
"ex": "http://example.com/vocab/",
"@language": "ja",
"name": { "@id": "ex:name", "@language": null },
"occupation": { "@id": "ex:occupation" },
"occupation_en": { "@id": "ex:occupation", "@language": "en" },
"occupation_cs": { "@id": "ex:occupation", "@language": "cs" }
},
"name": "Yagyū Muneyoshi",
"occupation": "忍者",
"occupation_en": "Ninja",
"occupation_cs": "Nindža",
...
}
Here's a language map:
{
"@context":
{
...
"occupation": { "@id": "ex:occupation", "@container": "@language" }
},
"name": "Yagyū Muneyoshi",
"occupation":
{
"ja": "忍者",
"en": "Ninja",
"cs": "Nindža"
}
...
}
+1 for internationalization...
And I think something like
{
"bio": {
"EN_CA": {},
"FR_CA": {}
},
"work": {
"EN_CA": [],
"FR_CA": []
},
...
}
is easier to write and maintain (from a CV writer perspective, not a jsonresume programmer's)
+1 One possible approach could be to optionally accept objects for strings.
{
  "basics": {
    "name": "Martin Wendt",
    "label": {"en": "Programmer", "de": "Software Entwickler"},
    "phone": "(912) 555-4321",
    "summary": {
      "en": "(default text)",
      "en-GB": "(british english)",
      "de": "(german translation)"
    }
  }
}
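A minimal sketch of how a theme or tool might read such a field, assuming a value is either a plain string or an object keyed by language tag. The function name localize and the fallback order (exact tag, then primary subtag, then a default language) are assumptions for illustration, not something the schema defines.

```javascript
// Hypothetical helper: resolves a field that is either a plain string or
// an object keyed by language tag (e.g. "en", "en-GB", "de").
function localize(value, lang, defaultLang = 'en') {
  if (typeof value === 'string') return value; // untranslated field
  // Try the exact tag, then its primary subtag ("en-GB" -> "en"), then the default.
  const candidates = [lang, lang.split('-')[0], defaultLang];
  for (const tag of candidates) {
    if (typeof value[tag] === 'string') return value[tag];
  }
  return undefined; // no usable translation
}

const basics = {
  name: 'Martin Wendt',
  label: { en: 'Programmer', de: 'Software Entwickler' },
  summary: {
    en: '(default text)',
    'en-GB': '(british english)',
    de: '(german translation)',
  },
};

console.log(localize(basics.name, 'de'));       // plain string, returned unchanged
console.log(localize(basics.label, 'en-GB'));   // falls back from "en-GB" to "en"
console.log(localize(basics.summary, 'en-GB')); // exact match
```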
Right, that's precisely what JSON-LD provides, though without as much magic.
So for your example, you would either specify that summary is a @container of type @language in the root @context, or add a @context to the summary... both are a bit more explicit than novel application-level assumptions which require special casing all the time.
{
  "@context": {
    "label": {"@container": "@language"},
    "summary": {"@container": "@language"}
  },
  "basics": {
    "name": "Martin Wendt",
    "label": {"en": "Programmer", "de": "Software Entwickler"},
    "phone": "(912) 555-4321",
    "summary": {
      "en": "(default text)",
      "en-GB": "(british english)",
      "de": "(german translation)"
    }
  }
}
As for client-facing translation: this is actually more reasonable than most approaches, as you'd have so much more context about what you are translating than with just a big document.
As a bonus, the data-at-rest could also be made international, as the context can also specify mappings of anticipated terms to the canonical namespace:
{
"@context": {
"jro": "http://jsonresume.org/context/",
"zusammenfassung": "jro:summary"
},
...
"zusammenfassung": ..
}
I have a feeling that this feature will only be needed in specific countries and specific industries. I live in The Netherlands and work in IT, and here pretty much everyone has agreed on using English as the language for resumes. We're living in an increasingly globalised world, where it makes sense to have one standard language in which to pass your resume around.
The way I see it, this feature is not needed: if you are aiming to apply for jobs abroad/internationally oriented jobs, which require an English resume, you make your resume English. If you only want to work in Germany, you make it German. If you want to apply for jobs both in Germany and abroad (international jobs), and you really feel it is needed to have a resume in both English and German, you can just create 2 resume specs instead of one. Or you could ask yourself if you really want to work in a place where an English resume is not accepted ;)
Either way, I don't see the strict need for this!
+1 for internationalization (for ease of maintenance from the perspective of the CV/resume writer) +1 for assuming one locale if there is only one
^ditto
Great idea!
The question is whether we make localization (having multiple languages etc.) a first-class citizen within the schema, and therefore use a solution such as JSON-LD as suggested by @bollwyvl, or whether we could use something like #203 to provide this.
JSON-LD:
Tagging + duplication:
To make the standard usable worldwide we need tagging + exporting based on tags anyway. This would also make it possible to localize specific sections via duplication and tagging without making the schema itself more complex.
Would love some other arguments for or against the solutions.
I agree with both, but I think tagging + duplication is far simpler.
Also, a tool could be used to export it from JSON-LD, if someone wants to use it.
I am for simplicity here, but the exporter/generator could live in the org as an external project.
OK, one issue that would not be solved with #203 is sections that can only be added once, such as basics. We have to think about a solution to this before deciding, I reckon.
On second thought, I don't ditto internationalisation as part of the schema. I hadn't realised that even the most basic information like name would need to be an object if the user needs to transliterate alphabets. It'll nest the file to infinity, making it hard to work with. The tags solution mentioned by @stp-ip is much better.
So for i18n I was imagining that we spin up a new repo that would eventually become part of theme utils.
The schema itself would never know of i18n, other than vendors such as our registry injecting into the metadata that it was a particular language.
It would look as such:
// ./theme-utils-i18n
/locales
en.json
fr.json
ge.json
Each of those files would look something like this,
// ./locale/fr.json
{
"basics": "bases",
"basics.name": "prénom",
"basics.label": "étiquette",
"basics.location.city": "ville",
...
Such that each path to a schema property, e.g. basics.name, would have a translation.
This won't catch all edge cases and I would expect these files to be extended further.
But I think this approach will be easy for us to organize, easy for theme developers to include and offer great flexibility when rendering.
For example, on the registry server we could easily add a query parameter ?lang=fr, which would then automatically tell themes that include the package to replace the labels.
Theme developers would use it as such:
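import fs from 'node:fs';
import {label} from 'jsonresume-theme-utils-i18n'; // the proposed package, not published yet

// Render the city line with its label translated into the requested language.
function renderCity(req) {
  var resume = JSON.parse(fs.readFileSync('resume.json', 'utf8')); // parse the file contents, not the file name
  var lang = req.query.lang; // e.g. 'fr' from ?lang=fr
  return '<p>' + label(lang, 'basics.location.city') + ': ' + resume.basics.location.city + '</p>';
}
// <p>Ville: Paris</p>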
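As a rough sketch of what such a label function could do internally: the locale tables below are inlined stand-ins for the proposed en.json and fr.json files, and the fallback to English and then to the raw path is an assumption, not something the proposal specifies.

```javascript
// Assumed contents of the locale files, inlined so the sketch is self-contained.
const locales = {
  en: { 'basics.name': 'name', 'basics.location.city': 'city' },
  fr: { 'basics.name': 'prénom', 'basics.location.city': 'ville' },
};

// Hypothetical label(): looks up the translation for a schema path,
// falling back to English and then to the path itself.
function label(lang, path) {
  const table = locales[lang] || {};
  return table[path] || locales.en[path] || path;
}

console.log(label('fr', 'basics.location.city')); // ville
console.log(label('de', 'basics.name'));          // name (no German table, falls back to English)
```

Falling back to the path itself keeps a theme rendering something readable when a locale file has not been extended to cover a property yet.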
My suggestion is more related to labels, writing up my thoughts on data now.
So if a user wanted to have multiple translations of their resume that they had written themselves I would rather us just rely on services to handle the conversion instead of relying on a schema based approach.
e.g. I could edit the registry server right now to support it.
The publish command at the moment takes a payload as such {resume: {}}, once received it is saved, and then posted to theme server upon request to generate the html.
The publish endpoint could be edited such that it also accepts an array of resumes such as {resumes: [{}, {}, {}]}.
The user could write it in one big file such that the first resume in the list is the master and it fails safe such that if a property is missing in the third index, it would fall back to the second and finally to the first.
[
{
meta: {
lang: 'en' // This is a service specific attribute so I am putting it in here
},
basics: {
name: 'Josef',
label: 'Fireman',
website: 'http://goofstuff.com',
},
},
{
meta: {
lang: 'fr'
},
basics: {
label: 'Pompier',
},
},
]
So on the registry server, once I receive such a payload, I would merge it altogether based on which language I am targeting.
This is just one example of how you can implement it at the service level.
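A minimal sketch of that merge, assuming the fallback works by layering every resume up to and including the target language on top of the master. The function names deepMerge and resumeForLang are hypothetical, and replacing arrays wholesale instead of merging them item by item is a simplification.

```javascript
function isPlainObject(value) {
  return value !== null && typeof value === 'object' && !Array.isArray(value);
}

// Recursively overlays `override` on `base`; arrays and scalars are replaced whole.
function deepMerge(base, override) {
  const out = { ...base };
  for (const [key, value] of Object.entries(override)) {
    out[key] = isPlainObject(value) && isPlainObject(base[key])
      ? deepMerge(base[key], value)
      : value;
  }
  return out;
}

// Hypothetical service-side helper: the first resume is the master, and each
// later resume falls back to the ones before it for any missing property.
function resumeForLang(resumes, lang) {
  const index = resumes.findIndex((r) => r.meta && r.meta.lang === lang);
  // Unknown language: serve the master only.
  const layers = index === -1 ? resumes.slice(0, 1) : resumes.slice(0, index + 1);
  return layers.reduce((merged, layer) => deepMerge(merged, layer), {});
}

const resumes = [
  {
    meta: { lang: 'en' },
    basics: { name: 'Josef', label: 'Fireman', website: 'http://goofstuff.com' },
  },
  {
    meta: { lang: 'fr' },
    basics: { label: 'Pompier' },
  },
];

console.log(resumeForLang(resumes, 'fr'));
```

For the fr target, name and website come from the master while label and meta.lang come from the French entry.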
So I am at the moment +1 for a service-only implementation (only potentially adding an official lang attribute to the newly proposed meta attribute).
+1 for the lang attribute (and for the use of tooling). Also, I can handle the French translation.
Thanks @thomasdavis for the additional input. So many ideas now in my head. I think with the proposed meta section we could even sorta support it within the schema. I added my thoughts with some examples within the meta section https://github.com/jsonresume/resume-schema/issues/204#issuecomment-187196874.
I think this way it could be cleaner as we have one basic data source and everything else including localization is overwrite only. Additionally having it within the schema simplifies working with localization, filtering or other meta data for theme devs and tooling providers.
That being said, I think that having a lang field within the meta section, to tell themes what the default language is, would be nice.
We should revisit this issue after we have agreed on the basic structure of the meta section. When we use a standardized meta subsection, we could add a language method to it. Let's leave this for now.
Current proposition:
...
meta.localization = [{
"de": [
{"work['gardner AG'].description" : "Deutsche Übersetzung der Beschreibung."},
{"work['gardner AG'].name" : "Gärtner AG"}
]
}
]
...
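A rough sketch of how a tool could apply such overrides, under several assumptions: entries are addressed by their name property, only the path form section['entry name'].field is supported, and the function name applyLocalization and the sample description text are made up for illustration.

```javascript
// Matches the path form used in the proposal: section['entry name'].field
const PATH = /^(\w+)\['(.+)'\]\.(\w+)$/;

// Hypothetical helper: returns a localized copy, leaving the base resume untouched.
function applyLocalization(resume, lang) {
  const copy = JSON.parse(JSON.stringify(resume)); // deep copy of plain JSON data
  const blocks = (resume.meta && resume.meta.localization) || [];
  for (const block of blocks) {
    for (const override of block[lang] || []) {
      for (const [path, text] of Object.entries(override)) {
        const match = PATH.exec(path);
        if (!match) continue; // skip paths this sketch does not understand
        const [, section, entryName, field] = match;
        // Look the entry up in the base resume, so that overriding "name"
        // does not break later overrides that address the same entry.
        const index = (resume[section] || []).findIndex((entry) => entry.name === entryName);
        if (index !== -1) copy[section][index][field] = text;
      }
    }
  }
  return copy;
}

const resume = {
  work: [{ name: 'gardner AG', description: 'English description.' }],
  meta: {
    localization: [{
      de: [
        { "work['gardner AG'].description": 'Deutsche Übersetzung der Beschreibung.' },
        { "work['gardner AG'].name": 'Gärtner AG' },
      ],
    }],
  },
};

console.log(applyLocalization(resume, 'de').work[0]);
```

Because the overrides only ever write to the copy, the base content stays the single source of truth and each language is an overwrite-only layer.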
I find it slightly complicated for the neophyte but doable.
Yeah, but it makes for a clean separation between base content and localization. Also, as it resides in meta, it will most likely be translated within a tool, so the assignment will be done by the tool, hiding most of the complex structure.
@stp-ip Indeed.
For internationalization we can use the approach currently used for developing Google Chrome extensions. Most of us don't need internationalization, so we can use the schema as is. Others may use separate files with translations, as proposed by @thomasdavis.
Here is an example:
{
  "basics": {
    "name": "__MSG_basics_name",
    "label": "__MSG_basics_label",
    "phone": "(912) 555-4321",
    "summary": "__MSG_basics_summary"
  }
}
and a file with translations:
// ./locale/en.json
{
  "__MSG_basics_name": "Martin Wendt",
  "__MSG_basics_label": "Programmer",
  "__MSG_basics_summary": "...."
}
I think this way is simpler for development and clearer to understand.
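A minimal sketch of the substitution step, assuming a tool walks the resume and swaps every string that starts with __MSG_ for its entry in the chosen translation file. The function name resolveMessages is hypothetical, and leaving unknown placeholders untouched is one possible choice among several.

```javascript
// Hypothetical helper: recursively replaces "__MSG_*" strings using a
// translation table; everything else is copied through unchanged.
function resolveMessages(node, messages) {
  if (typeof node === 'string') {
    return node.startsWith('__MSG_') && node in messages ? messages[node] : node;
  }
  if (Array.isArray(node)) {
    return node.map((item) => resolveMessages(item, messages));
  }
  if (node !== null && typeof node === 'object') {
    return Object.fromEntries(
      Object.entries(node).map(([key, value]) => [key, resolveMessages(value, messages)])
    );
  }
  return node; // numbers, booleans, null
}

const resume = {
  basics: {
    name: '__MSG_basics_name',
    label: '__MSG_basics_label',
    phone: '(912) 555-4321',
  },
};

// Stand-in for the contents of ./locale/en.json
const en = {
  __MSG_basics_name: 'Martin Wendt',
  __MSG_basics_label: 'Programmer',
};

console.log(resolveMessages(resume, en));
```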
Due to the finite nature of resume data size, I think we can use one file as the source of truth and then use tooling to export the various language/job specific resumes. The idea using separate files is good too, I just currently think the meta data proposal makes more sense.
One additional proposal could be to use the meta data section for translations etc., but provide examples on how to split it up into multiple files using the import methods of json schema. That way we have predefined a single file starting point, which can be extended/split up into multiple files.
Thinking about it. The idea to make one source of truth for ease of use with the ability to reference sections from other files makes sense not only for translations, but as a general solution.
Experts could separate their projects and work items into separate items etc.
Any help needed? @stp-ip
Hi, how is this project going?
We are working hard to revive and get progress out of the door. I would say until the end of the year there will be quite a few milestones. Thanks for your continued interest.
how's it going?
Being worked on. Slow but steady progress.
I wrote this script for myself, and I hope it helps someone while the team is working on this.
Any update on this guys? Thx!
Internationalization should be handled by another package and/or just by copying your resume.json as en.resume.json|es.resume.json. There is no requirement for the schema to support it.
I find it rude to unilaterally close this request after this many people showed interest. But anyway, here is why I disagree with you :
another package [...] as en.resume.json|es.resume.json.
This is exactly the problem. If someone gives me a file named resume.json, I won't know what language it is written in. I'll have to take a look inside and make a guess... and that's only if I am familiar with the alphabet and can recognize some words! If I designed a package supporting i18n, I would need this metadata to render the appropriate template for any given resume.json.
That is why I really like adding this extra-simple attribute :
meta: {
lang: 'en'
},
I acknowledge my rudeness as I rush through these issues and I also apologize because my explanations are so short/brief.
I will try to give a reasonable explanation of my thinking:
Edit: I'm personally just pushing for a v1 schema; I think any frustration is acceptable, and also welcome.
Thanks for your explanation, it's greatly appreciated as it encourages discussions and participation.
It is true that I could require the user to submit the language to render as an argument. However, I feel that this piece of data should be somewhere inside the resume. Is it possible that a user would want a Spanish resume (let's say es.resume.json) to be rendered using the French translation of the theme? That doesn't make sense; they will always want to use the Spanish version of the theme. And this will be the same for every possible theme a user would want to try. Instead of submitting the language each time, it makes more sense for it to be embedded inside the resume, as it is immutable.
Sorry, I'm not sure what you mean. If you mean auto-translation of the content by the template renderer, that's not what I meant. The template renderer should only include the content of the resume file as is, while selecting the appropriate translation for the theme itself (basically the names of the sections, like "Skills", ...).
made a very hacky solution https://github.com/w-v/jsonresume-multilang
I think at least the CLI should accept a language parameter in order to tell the theme which language to compile for. The reasoning is that it is easy to have multiple .json files (resume.de.json and resume.en.json), but the theme needs to adjust the headings based on the language. So I think one should not only provide resume to the JavaScript function generating the HTML, but also some metadata as a second argument, which optionally contains a language string (given to the CLI as a param). What do you think?
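A minimal sketch of what such a two-argument render function could look like on the theme side. The options.lang parameter is the proposal above, not an existing resume-cli feature; the headings table, the fallback to meta.lang and then to English, and the German heading text are assumptions for illustration.

```javascript
// Assumed per-language headings bundled with the theme.
const headings = {
  en: { skills: 'Skills' },
  de: { skills: 'Kenntnisse' },
};

// Hypothetical render(resume, options): the language comes from the caller,
// then from meta.lang inside the resume, then defaults to English.
function render(resume, options = {}) {
  const lang = options.lang || (resume.meta && resume.meta.lang) || 'en';
  const text = headings[lang] || headings.en;
  // Resume content is inserted as-is (no translation, no HTML escaping in this sketch).
  const items = (resume.skills || []).map((skill) => '<li>' + skill.name + '</li>').join('');
  return '<h2>' + text.skills + '</h2><ul>' + items + '</ul>';
}

const resume = { skills: [{ name: 'JavaScript' }] };

console.log(render(resume, { lang: 'de' }));
console.log(render(resume));
```

Only the theme's own strings change with the language; the resume content is passed through untouched.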
So I created a new theme that works for me. It exports a German and an English version. Here is a demo on how to use it. It would be perfect if the cli supported a way to specify the language desired so instead of:
yarn resume export -r resume.de.json -t ./node_modules/jsonresume-theme-stackoverflow-react/dist/de CV.pdf
one could simply do:
yarn resume export -r resume.de.json -t stackoverflow-react -l de CV.pdf
So maybe one should open an issue for resume-cli?
- Why is somebody giving you a resume.json? Are you more than an individual? And if so, can you not just request a language when a user/resume.json is submitted to you? (edit: I get that theme developers have to deal with it, but why can't their theme consumers just send the i18n versions of their resumes?)
The first usage is to send a registry link such as https://registry.jsonresume.org/mcarbonneaux, but never the JSON itself... or a PDF version of the resume (rendered with a CI/CD pipeline)...
When you send the registry URL, managing multiple languages requires a way to select the language of the resume... for the moment there is no way to do that with https://registry.jsonresume.org/xxxxx. It would be useful to add ?lang=fr or https://registry.jsonresume.org/fr.mcarbonneaux to select the fr version of resume.json (by selecting the fr version in the schema, in JavaScript or in the rendering engine), or by file name, like fr.resume.json or resume.fr.json for example...
The second usage is to add a link to the registry in your personal profile on LinkedIn (LinkedIn is now multilingual), GitHub, or on your personal page... but that way you cannot know in advance which language the reader prefers to use when reading your resume...
In that case you need a visual selector in your theme to display the resume according to the selected language. Doing it that way, you need to have all the languages in the JSON and render in JavaScript in the browser, not in the backend like the current registry URL does... or redirect to another registry URL with a language selector, like https://registry.jsonresume.org/fr.mcarbonneaux?lang=fr, to display the selected language...
- If you create a template that is i18n, how will you do it? an api? simply specifying the language does not solve the problem of translation. Until there is a good proposition for a good translation, I think for those who wish to have multiple translations can deal with multiple resume.json's
Automatic translation with an API generally does not produce a correct translation; manual translation must always be available. That means you must have all the translations in jsonresume format. I think it is simpler to maintain a resume if you have only one jsonresume file with all the languages in it (it is easier to keep the resume coherent across languages), like https://github.com/jsonresume/resume-schema/issues/35#issuecomment-66907403 :
{
  "basics": {
    "name": "Martin Wendt",
    "label": {"en": "Programmer", "de": "Software Entwickler"},
    "phone": "(912) 555-4321",
    "summary": {
      "en": "(default text)",
      "en-GB": "(british english)",
      "de": "(german translation)"
    }
  }
}
or using standard like json-ld : https://github.com/jsonresume/resume-schema/issues/35#issuecomment-48475661
For people living in countries with multiple official languages (e.g. Canada has English and French), what is the recommendation?
It would be nice to be able to repeat sections with an identifier, e.g. FR_CA, EN_CA, etc., so I would be able to generate PDFs in multiple languages.