Open tofi86 opened 7 years ago
P.S.: I just wrote this down off the top of my head after a long Prague weekend, so I hope I haven't forgotten anything. Octavian, Nico, Patrik, Vanessa, please add to the discussion if I missed something!
Yes, Schematron needs to select the correct language for diagnostics. If there is a bug, I will fix it.
As for combined diagnostic files, I suppose there would also be the approach of making the URL for the include/href (more likely to be the new extends/@href) dynamic: Allow {} like {concat ('file://xxxx/diagnostics_', $lang, '.sch')}
Regards Rick
On 13 Feb 2017 10:13, "Tobias Fischer" notifications@github.com wrote:
P.S.: I just wrote this down from the top of my head after a long Prague weekend, so I hope I haven't forgotten something. Octavian, Nico, Patrik, Vanessa, please add to the discussion If I missed something!
Yes, Schematron needs to select the correct language for diagnostics. If there is a bug, I will fix it.
Yeah, at the moment, the default skeleton is not picking up the xml:lang attribute.
Probably Octavian from oXygen XML (@octavianN) is willing to contribute their fixed version?
I think the hardest part would probably be to get the fixed version into third party tools like Jing...
Allow {} like {concat ('file://xxxx/diagnostics_', $lang, '.sch')}
That's also a nice idea of dynamically referencing the external language files.
On 17/02/2017 16:01, Tobias Fischer wrote:
Yes, Schematron needs to select the correct language for diagnostics. If there is a bug, I will fix it.
Yeah, at the moment, the default skeleton is not picking up the xml:lang attribute.
Probably Octavian from oXygen XML (@octavianN https://github.com/octavianN) is willing to contribute their fixed version?
When I spoke to Octavian at XML Prague, he said that oXygen did it by filtering messages, not by not emitting the message in the first place.
Just to clarify, Jing internally uses an approach similar to the Skeleton, but it is a different implementation. Also, it supports only pre-ISO Schematron; support for ISO Schematron is very limited. I enabled it simply by accepting the new namespace, but nothing is implemented in terms of ISO-specific functionality.
The oXygen implementation is available under oXygen/frameworks/schematron/impl/ with the same license as the skeleton - it is a fork we made many years ago, so you can surely get whatever update we made back into the skeleton implementation.
I added a pull request with the multilingual support that we have in oXygen, based on diagnostics. The messages are generated automatically in the language specified by the "langCode" parameter. If there are no messages in the language specified by the "langCode" parameter, all the messages will be generated, each prefixed by its language.
PR #63
Awesome, thanks Octavian! 👍 Looking forward to seeing this merged!
Use one diagnostic per message and wrap localizations in a foreign element with @xml:lang.
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron" xml:lang="en">
  <sch:title>Example of Multi-Lingual Schema</sch:title>
  <sch:pattern>
    <sch:rule context="dog">
      <sch:assert test="bone" diagnostics="d1">(Optional) Fallback message.</sch:assert>
    </sch:rule>
  </sch:pattern>
  <sch:diagnostics>
    <sch:diagnostic id="d1">
      <p xmlns="http://www.w3.org/1999/xhtml">English message.</p>
      <p xmlns="http://www.w3.org/1999/xhtml" xml:lang="de">German message.</p>
    </sch:diagnostic>
  </sch:diagnostics>
</sch:schema>
Hey,
after attending the first ever Schematron Users Meetup at XML Prague this year, I'm thrilled to see that schematron is coming back to life — thanks @rjelliffe, @AndrewSales and @tgraham-antenna for your work!
As a contributor to the EpubCheck project (EPUB validation) and the SQF Schematron QuickFix project, I'd like to open up this issue and start a discussion about improvements to the Schematron localization concepts — or at least for the Skeleton implementation.
The EpubCheck project uses Java properties files for localization, but also has several Schematron checks which cannot be localized at the moment because the official Skeleton implementation used by Jing validator does not support this. There has been discussion about this since October 2014 at issue https://github.com/IDPF/epubcheck/issues/474
And more recently, the SQF project struggled with this as well in https://github.com/schematron-quickfix/sqf/issues/1.
Annex G of the ISO Schematron specification defines the use of multilingual Schematron as follows:
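A minimal sketch of that pattern (not quoted from the standard; the ids and message texts are illustrative assumptions): each language gets its own diagnostic element with a unique id and its own xml:lang, and the assertion lists all of them.

```xml
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron" xml:lang="en">
  <sch:title>Multilingual diagnostics sketch</sch:title>
  <sch:pattern>
    <sch:rule context="dog">
      <!-- Every language variant has to be listed explicitly by its id -->
      <sch:assert test="bone" diagnostics="d1 d2">A dog should have a bone.</sch:assert>
    </sch:rule>
  </sch:pattern>
  <sch:diagnostics>
    <!-- One diagnostic per language, each with a unique id (illustrative) -->
    <sch:diagnostic id="d1" xml:lang="en">A dog should have a bone.</sch:diagnostic>
    <sch:diagnostic id="d2" xml:lang="de">Ein Hund sollte einen Knochen haben.</sch:diagnostic>
  </sch:diagnostics>
</sch:schema>
```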
However, this never worked in the original Skeleton implementation, as it would display both messages and not only the one from the current locale.
oXygen XML has implemented a workaround for this issue by tweaking the original Skeleton implementation so that it only shows the message for the current locale. Possibly they can contribute this change as a pull request.
However, there's another shortcoming of the diagnostic-based localization concept: the developer has to actively reference every language with a separate ID in the diagnostics attribute, which makes it hard to add new localizations.
At XML Prague, Octavian from oXygen XML (@octavianN), Nico from the SQF project (@nkutsche), Patrik (@PStellmann) & Vanessa (@vanessakastmann) from the DITA-SEMIA project and I sat together to discuss the SQF issue https://github.com/schematron-quickfix/sqf/issues/1 but quickly came to the conclusion that improvements need to be made to the localization support in the Schematron standard or the Skeleton implementation in order to properly resolve issues like the EpubCheck or SQF ones.
We discussed the following solutions, which I want to outline here as a basis for discussion. You should also know that we discussed this with the use case of externalizing the messages to separate files (e.g. for Translation Memory Systems) in mind.
Solution 1: Fix the Skeleton
The Skeleton should be fixed to at least support the Annex G example properly: Only output the message in the current locale and not ALL diagnostic elements.
Solution 2: Remove ID/IDREF constraint from Schematron schema
This is more of a long-term solution, as the standardized schema would need to be changed.
What we would like to achieve is something like this:
[...] ID anymore) once and let the Skeleton or any other implementation choose the proper diagnostic element.
[...] xml:lang attribute with different values when two or more diagnostic elements with the same id are present.
Current status: This does not validate because of the ID/IDREF datatypes.
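To make the goal of Solution 2 concrete, here is a hypothetical sketch (the id and the message texts are assumptions). Both diagnostic elements share one id and differ only in xml:lang, so the assertion references that id once. This is exactly what gets rejected today, because the id attribute is declared as an ID and therefore has to be unique.

```xml
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron" xml:lang="en">
  <sch:pattern>
    <sch:rule context="dog">
      <!-- The id is referenced once, independent of the number of languages -->
      <sch:assert test="bone" diagnostics="bone-missing">A dog should have a bone.</sch:assert>
    </sch:rule>
  </sch:pattern>
  <sch:diagnostics>
    <!-- Same id twice: not allowed as long as diagnostic/@id is of type ID -->
    <sch:diagnostic id="bone-missing" xml:lang="en">A dog should have a bone.</sch:diagnostic>
    <sch:diagnostic id="bone-missing" xml:lang="de">Ein Hund sollte einen Knochen haben.</sch:diagnostic>
  </sch:diagnostics>
</sch:schema>
```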
Solution 3a: Do it the Java way (hacky)
In Java you just reference a messages.properties file and the PropertyReader implementation takes care of resolving the current locale. In a German environment, for example, Java would automatically look for messages_de.properties, although this file isn't referenced in the Java class.
Schematron could do this as follows:
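As a rough sketch of the idea, the main schema might look like the following. The file names come from the text; the file contents, the placement of the include element and the lookup comment are assumptions. messages.sch is assumed to contain a diagnostics root element holding the default-language diagnostic d1, and messages_de.sch the German counterpart.

```xml
<!-- dog.sch (hypothetical contents) -->
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron" xml:lang="en">
  <!-- Only the default message file is referenced. Under this proposal, an
       implementation running in a German locale would look for
       messages_de.sch when it resolves this include. -->
  <sch:include href="messages.sch"/>
  <sch:pattern>
    <sch:rule context="dog">
      <!-- d1 is defined in the included message file -->
      <sch:assert test="bone" diagnostics="d1">A dog should have a bone.</sch:assert>
    </sch:rule>
  </sch:pattern>
</sch:schema>
```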
dog.sch:
messages.sch:
messages_de.sch:
[...] {include}_{locale}.sch every time it resolves an include.
Current status: dog.sch would validate without errors, but some of our group had reservations because of the misuse of the include element and also because the German message file messages_de.sch isn't referenced anywhere within the SCH. Personally(!) I could live well with the last one, as it's Java style...
Solution 3b: Do it the Java way (properly)
To address the issue about misusing the include element from solution 3a, I'd like to introduce either a new element for message file references:
which would require a diagnostics root element
… or at least an additional attribute on the include element:
which would advise Skeleton and any other implementation to look for localized files as well (in the Java form of {include}_{locale}.sch).
Solution 4: Work with business rules for the referenced ids
In my personal opinion this can't be more than a temporary hack, but it was heavily discussed in the group:
[...] {id}_{locale} diagnostic element if the current locale does not match xml:lang on the root element.
Current status: The Schematron would validate well.
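A hypothetical sketch of the naming convention from Solution 4 (the ids and message texts are assumptions): the assertion references only the default id, and an implementation that follows the convention would pick the diagnostic named {id}_{locale} when the current locale differs from xml:lang on the root element. All ids stay unique, so the ID/IDREF datatypes are satisfied.

```xml
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron" xml:lang="en">
  <sch:pattern>
    <sch:rule context="dog">
      <!-- Only the default-language id is referenced -->
      <sch:assert test="bone" diagnostics="d1">A dog should have a bone.</sch:assert>
    </sch:rule>
  </sch:pattern>
  <sch:diagnostics>
    <!-- Default message, in the language of the root element (en) -->
    <sch:diagnostic id="d1">A dog should have a bone.</sch:diagnostic>
    <!-- Found purely by the {id}_{locale} convention when the locale is "de" -->
    <sch:diagnostic id="d1_de" xml:lang="de">Ein Hund sollte einen Knochen haben.</sch:diagnostic>
  </sch:diagnostics>
</sch:schema>
```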
I have laid out the different solutions we discussed at our SQF meeting, and the more I think about it, the more I wish we had discussed this two days earlier at the Schematron Users Meetup... Anyway...
This is only meant as a basis for further discussion, and I hope I have made the case for why we need improvements to either the standard or the Skeleton.
Kind regards, Tobias
on behalf of Octavian, Nico, Patrik and Vanessa