Closed schneidermic0 closed 6 months ago
Instead of ISO 639-1, we could use the following options to represent the SAP language:
*) Remark: I saw many examples where the locale is rendered with a hyphen instead of an underscore (e.g., "en-US", "en-GB", ...). SAP's APIs serialize it with an underscore.
A similar topic was also discussed in https://github.com/SAP/abap-file-formats/issues/34
Besides the options I mentioned above, we could also use BCP47 language tags.
See also:
RFC 5646, section 4.1, states the following:
- Use as precise a tag as possible, but no more specific than is justified. Avoid using subtags that are not important for distinguishing content in an application.
As far as I understand this section, we could stay with our existing language tags (e.g., "en" for SAP language "EN", representing English (US)), but add further information as soon as it is needed. I.e., if the language should be English (Great Britain), we could use the language tag "en-GB". The same would be valid for any other region/script of the English language.
I have also checked how SAP's I18N converter class (cl_i18n_languages) works for BCP47:
If you convert "en" or "en-US" to a SAP1-language, it returns the same SAP1-language in both cases. If you do the same for "en-GB", it returns a different language.
If you convert from a SAP1-language to a BCP47 language tag, it always returns the full tag (e.g., "E" is converted to "en-US"). However, here we could (not sure yet whether we should) shorten the tag to "en".
I tested the behavior described above for English with several other languages, such as German and Chinese. It was the same.
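The conversion behavior described above can be sketched roughly as follows. This is only an illustration in Python, not the interface of cl_i18n_languages: the lookup tables, the function names, and the `shorten` flag are assumptions made for this example, and the tables contain only the two-character SAP language codes "EN" and "6N" that are named in this issue.

```python
# Illustrative lookup tables (assumption): only the codes named in this issue.
# This does NOT reproduce the real interface of cl_i18n_languages.
BCP47_TO_SAP = {
    "en": "EN",      # short tag resolves to the same SAP language as "en-US"
    "en-US": "EN",
    "en-GB": "6N",
}

SAP_TO_BCP47 = {
    "EN": "en-US",   # the converter always returns the full tag
    "6N": "en-GB",
}


def bcp47_to_sap(tag: str) -> str:
    """Map a BCP47 language tag to a SAP language code."""
    return BCP47_TO_SAP[tag]


def sap_to_bcp47(sap_code: str, shorten: bool = True) -> str:
    """Map a SAP language code to a BCP47 tag, optionally shortened."""
    full_tag = SAP_TO_BCP47[sap_code]
    if not shorten:
        return full_tag
    primary = full_tag.split("-")[0]
    # Shorten only if the bare primary language maps back to the same SAP code,
    # so that the round trip stays lossless.
    if BCP47_TO_SAP.get(primary) == sap_code:
        return primary
    return full_tag
```

With these tables, `sap_to_bcp47("EN")` yields "en", while `sap_to_bcp47("6N")` keeps the full tag "en-GB" because shortening it would no longer identify the British variant.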
Necessary steps to address this issue:
Decision: We plan to follow the approach of BCP47 language tags (see above). Whenever possible, we stick to short language tags that contain the main language only.
Theoretically, we could replace the existing pattern ("^[a-z]+$") in the schema with the value "^[a-z]{2,3}(?:-[A-Z][a-z]{3})?(?:-[A-Z]{2})?$" to address all languages supported by SAP (which are a subset of the BCP47 language tags).
We think this would be somewhat over-engineered. We don't have patterns for other fields so far. Any objections?
This means the schema will only have the addition "minLength": 2.
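For anyone who wants to see what the stricter pattern quoted above would accept, here is a small Python sketch. The sample tags and the helper name `matches_proposed_pattern` are my own choices for illustration and are not part of the schema or of any SAP API.

```python
import re

# The stricter pattern discussed above, copied verbatim from this comment.
PROPOSED_PATTERN = re.compile(r"^[a-z]{2,3}(?:-[A-Z][a-z]{3})?(?:-[A-Z]{2})?$")


def matches_proposed_pattern(tag: str) -> bool:
    """Return True if the tag has the shape language[-Script][-REGION]."""
    return PROPOSED_PATTERN.search(tag) is not None


# Sample tags chosen for illustration only.
for sample in ["en", "en-GB", "zh-Hans", "zh-Hant-TW", "EN", "en_US", "e"]:
    print(sample, matches_proposed_pattern(sample))
```

The first four samples match. "EN" (upper case), "en_US" (underscore instead of hyphen) and "e" (too short) do not.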
Old code for original language:
    "originalLanguage": {
      "title": "Original Language",
      "description": "Original language of the ABAP object",
      "type": "string",
      "minLength": 2,
      "maxLength": 2,
      "pattern": "^[a-z]+$"
    },
New code for original language:
    "originalLanguage": {
      "title": "Original Language",
      "description": "Original language of the ABAP object",
      "type": "string",
      "minLength": 2
    },
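To make the effect of the change concrete, the following Python sketch checks a few values against the constraints of the two schema fragments above. It is deliberately minimal: `is_valid` is a hypothetical helper that evaluates only the keywords "type", "minLength", "maxLength" and "pattern"; it is not a full JSON Schema validator.

```python
import re

# Constraints copied from the old and new "originalLanguage" definitions.
OLD_SCHEMA = {"type": "string", "minLength": 2, "maxLength": 2, "pattern": "^[a-z]+$"}
NEW_SCHEMA = {"type": "string", "minLength": 2}


def is_valid(value, schema) -> bool:
    """Check a value against the few keywords used in this issue (sketch only)."""
    if not isinstance(value, str):
        return False
    if "minLength" in schema and len(value) < schema["minLength"]:
        return False
    if "maxLength" in schema and len(value) > schema["maxLength"]:
        return False
    # JSON Schema patterns are not implicitly anchored, hence re.search.
    if "pattern" in schema and re.search(schema["pattern"], value) is None:
        return False
    return True


for value in ["en", "en-GB", "e"]:
    print(value, is_valid(value, OLD_SCHEMA), is_valid(value, NEW_SCHEMA))
```

"en" passes both versions, "en-GB" passes only the new one, and "e" passes neither. Note that the new version also accepts values such as "EN" or "en_US", which is the trade-off of not adding the stricter pattern.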
Maybe it would be more helpful to list all supported languages, based on SAP Note https://launchpad.support.sap.com/#/notes/73606, in our documentation.
I think all necessary steps for the repository are done. @Markus1812 Thanks for your contributions.
I close this issue :)
Currently, it is specified that language fields follow ISO 639-1.
See:
The SAP language code, which covers all possible languages, differentiates not only by language but also by country. ISO 639-1 does not carry any country information in the language code.
For example, SAP systems differentiate between several countries for the English language. SAP language code "EN" represents English (United States), whereas SAP language code "6N" represents English (United Kingdom). There are further country-specific SAP language codes for English, but all of them are represented by the ISO 639-1 language code "en".
The same issue also exists for other languages (Arabic, Chinese, Dutch, French, German, or Spanish).
See SAP Note https://launchpad.support.sap.com/#/notes/73606.
During serialisation, when the SAP language is transformed into the ISO 639-1 code, the country information is lost (or the wrong language code might be stored in the system).
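The loss of information can be illustrated with a short Python sketch. The mapping table and function names are assumptions for this example; the table contains only the two SAP language codes mentioned above and does not reflect the actual serializer.

```python
# Illustrative mapping (assumption): both English variants named above
# collapse to the same ISO 639-1 code.
SAP_TO_ISO_639_1 = {
    "EN": "en",  # English (United States)
    "6N": "en",  # English (United Kingdom)
}


def serialize(sap_code: str) -> str:
    """Transform a SAP language code into its ISO 639-1 code."""
    return SAP_TO_ISO_639_1[sap_code]


def candidates_on_deserialize(iso_code: str) -> list[str]:
    """Return every SAP language code that could have produced the ISO code."""
    return sorted(sap for sap, iso in SAP_TO_ISO_639_1.items() if iso == iso_code)


print(serialize("EN"), serialize("6N"))
print(candidates_on_deserialize("en"))
```

Since more than one SAP language code is a candidate for "en", deserialisation cannot tell which one was originally meant.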