Open SmartLayer opened 4 years ago
Hi Weiwu,
I have created a Xerces-based utility to validate XML against the XSD file; the details are below.
Command:
$ java -classpath "xercesImpl.jar;xercesSamples.jar;xml-apis.jar;xpath2-1.2.0.jar;XMLValidator.jar" XMLValidator H:/alphawallet/TokenScript/schema/tokenscript.xsd H:/alphawallet/tokenscripts/COFI.xml
Note: You need to replace ; with : (for unix) while adding JAR files in classpath.
All the required JAR files are attached here. xerces-2_12_1-xml-schema-1.1.zip
Arguments:
Now you can validate XML against the XSD 1.1 using this package.
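For readers without the attachment, here is a minimal sketch of what such a validator might look like. It is not the attached XMLValidator source. It assumes the Xerces XML Schema 1.1 build registers itself with JAXP under the schema-language URI `http://www.w3.org/XML/XMLSchema/v1.1`; the class name `Xsd11Check` and the fallback to XSD 1.0 (so the sketch still runs on a stock JDK without the Xerces JARs) are illustrative assumptions.

```java
import java.io.File;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXParseException;

// Hypothetical class name; a sketch, not the attached XMLValidator.
final class Xsd11Check {
    // Schema-language URI assumed to be registered by the Xerces XSD 1.1 build.
    static final String XSD11 = "http://www.w3.org/XML/XMLSchema/v1.1";

    // Prefer an XSD 1.1 factory; fall back to XSD 1.0 when none is on the classpath.
    static SchemaFactory factory() {
        try {
            return SchemaFactory.newInstance(XSD11);
        } catch (IllegalArgumentException noXsd11Support) {
            return SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        }
    }

    // Returns the validation errors; an empty list means the document is valid.
    static List<String> validate(StreamSource xsd, StreamSource xml) {
        List<String> errors = new ArrayList<>();
        try {
            Validator validator = factory().newSchema(xsd).newValidator();
            validator.setErrorHandler(new ErrorHandler() {
                @Override public void warning(SAXParseException e) { /* ignored in this sketch */ }
                @Override public void error(SAXParseException e) { errors.add(e.getMessage()); }
                @Override public void fatalError(SAXParseException e) { errors.add(e.getMessage()); }
            });
            validator.validate(xml);
        } catch (Exception e) {
            errors.add(e.toString());
        }
        return errors;
    }

    // Convenience overload for in-memory schema and document text.
    static List<String> validateStrings(String xsd, String xml) {
        return validate(new StreamSource(new StringReader(xsd)),
                        new StreamSource(new StringReader(xml)));
    }

    // Usage: java Xsd11Check schema.xsd file.xml
    public static void main(String[] args) {
        List<String> errors = validate(new StreamSource(new File(args[0])),
                                       new StreamSource(new File(args[1])));
        System.out.println(errors.isEmpty()
                ? args[1] + " is valid."
                : args[1] + " is not valid because\n" + String.join("\n", errors));
    }
}
```

With the fallback in place, XSD 1.1-only constructs in the schema would be reported as schema errors when the Xerces JARs are missing, which is one way to notice that the wrong factory was picked up.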
Tracking of requirement details from mail conversation:
It seems that XML Schema 1.1 is only supported by Xerces 2.12 (the version with XML Schema 1.1 support) or by Saxon. Saxon's open-source version, at a glance, only supports XSLT and XQuery, since there is no mention of validation in the manual.
Once you have the validator, we will need a Pull-Request that not only restores the XML Schema 1.1 rules that I commented out (2 lines), but also changes the schema's root element according to this article:
https://www.oxygenxml.com/doc/versions/22.1/ug-editor/topics/set-xml-schema-version.html
Otherwise, some tools will still process it with schema 1.0.
Let me know how you progressed on this! Thanks.
https://github.com/AlphaWallet/TokenScript/issues/395#issuecomment-716395559
What I have here is an initial version; we can eventually convert it to the approach you suggested in the requirement document.
I can improve the Java code to take the default XSD path from GitHub, so that we only need to pass the XML file name to the command. We can also give the option to refer to a local XSD schema for validation.
My idea is to write a shell script where we can pass the action (validate, sign, c14n, verify) as a command-line argument, and then the appropriate Java class will be invoked based on the action.
If you do so, you will have to produce 2 versions (.sh and .bat), and they may behave a bit differently depending on MacOS/Ubuntu. There is no harm if the content is extremely simple; you just need to keep it minimal and test it on all OSes. In this case, though, it is expected to be complicated: the fact that you can concatenate sub-commands means it is not going to be simple at all, and whatever shell script you write will have to manage a lot of intermediary files. See the example of "Multi-command processing" below:
Let's say xmlsec.jar for now, has 4 sub commands.
$ java -jar xmlsec.jar val tokenscript.xml
$ java -jar xmlsec.jar sign [-o tokenscript-signed.xml | -d output.dir/] tokenscript.xml
$ java -jar xmlsec.jar c14n [-o tokenscript-signed.xml | -d output.dir/] tokenscript.xml
$ java -jar xmlsec.jar verify tokenscript.xml
The first and third commands also have a long form (validate and canonic, respectively). The second and third commands produce an output. If unspecified, it will simply be tokenscript-signed.xml (that is, take the input file name, remove the extension and add -signed.xml, following the convention set by Android apk files).
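The default output name described above could be derived roughly as follows. This is a sketch only; the class and method names are hypothetical, and the handling of names without an extension is an assumption not covered in the thread.

```java
// Hypothetical helper; derives the default output name from an input path.
final class SignedName {
    static String signedName(String inputPath) {
        // Position of the last path separator (Unix or Windows style), or -1.
        int slash = Math.max(inputPath.lastIndexOf('/'), inputPath.lastIndexOf('\\'));
        int dot = inputPath.lastIndexOf('.');
        // Only treat the dot as an extension separator if it is inside the
        // file name and is not its first character (e.g. ".hidden").
        String stem = dot > slash + 1 ? inputPath.substring(0, dot) : inputPath;
        return stem + "-signed.xml";
    }

    public static void main(String[] args) {
        System.out.println(signedName("tokenscript.xml"));
    }
}
```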
For example, sign has --key
It should be possible to process multiple files in all of the commands. For example:
$ java -jar xmlsec.jar val */*.xml
This validates every XML file in every subdirectory of the current directory.
For the commands that have an output, either -o or -d should be used. But if there are multiple input files, then only -d is allowed. -d causes each output to be written under the specified directory, with the same file name as its input.
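A possible way to encode the -o / -d rule is sketched below. The names are hypothetical, and two behaviours are assumptions rather than something stated in the thread: giving both -o and -d is rejected, and when neither is given each input falls back to the -signed.xml default name.

```java
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; maps each input file to its output file.
final class OutputResolver {
    // outFile corresponds to -o, outDir to -d; either may be null.
    static List<Path> resolve(List<Path> inputs, Path outFile, Path outDir) {
        if (outFile != null && outDir != null) {
            throw new IllegalArgumentException("use either -o or -d, not both");
        }
        if (outFile != null && inputs.size() > 1) {
            throw new IllegalArgumentException("-o allows only one input file; use -d");
        }
        List<Path> outputs = new ArrayList<>();
        for (Path input : inputs) {
            if (outFile != null) {
                outputs.add(outFile);
            } else if (outDir != null) {
                // Same file name as the input, placed under the -d directory.
                outputs.add(outDir.resolve(input.getFileName()));
            } else {
                // Assumed default: strip the extension and append -signed.xml.
                String stem = input.getFileName().toString().replaceFirst("\\.[^.]*$", "");
                outputs.add(input.resolveSibling(stem + "-signed.xml"));
            }
        }
        return outputs;
    }
}
```

Note that with -d, two inputs that share a file name in different directories would map to the same output; the sketch does not try to detect that.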
It should be possible to concatenate commands. The most typical use-cases are:
$ java -jar xmlsec.jar val c14n sign verify tokenscript1.xml tokenscript2.xml
This causes the tokenscript files to be validated, canonicalized, signed and verified, and outputs tokenscript1-signed.xml and tokenscript2-signed.xml (the verify subcommand is smart enough to know that the output file, not the original input file, should be verified). If one of the sub-commands fails for an input file, the remaining sub-commands are not executed for that file, but the next file in the queue is still processed.
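The failure rule above can be sketched as a small driver loop. Everything here is illustrative: the class name, the alias table (built from the short and long forms mentioned earlier) and the `BiPredicate` callback that stands in for the real sub-command implementations are assumptions. The sketch also does not model passing the signed output file on to verify.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiPredicate;

// Hypothetical driver for concatenated sub-commands.
final class SubCommandPipeline {
    // Accepted spellings mapped to a canonical sub-command name.
    static final Map<String, String> NAMES = Map.of(
            "val", "val", "validate", "val",
            "c14n", "c14n", "canonic", "c14n",
            "sign", "sign", "verify", "verify");

    // action.test(subCommand, file) returns true on success.
    // The result maps each file to "ok" or "failed at <sub-command>".
    static Map<String, String> run(String[] args, BiPredicate<String, String> action) {
        // Leading arguments that name a sub-command are steps; the rest are files.
        int split = 0;
        while (split < args.length && NAMES.containsKey(args[split])) {
            split++;
        }
        List<String> steps = Arrays.asList(args).subList(0, split);
        List<String> files = Arrays.asList(args).subList(split, args.length);

        Map<String, String> status = new LinkedHashMap<>();
        for (String file : files) {
            String outcome = "ok";
            for (String step : steps) {
                String name = NAMES.get(step);
                if (!action.test(name, file)) {
                    // Skip the remaining steps for this file only.
                    outcome = "failed at " + name;
                    break;
                }
            }
            status.put(file, outcome);
        }
        return status;
    }
}
```

One consequence of splitting the arguments this way is that an input file literally named `sign` or `val` would be read as a sub-command; a real tool would need some way to disambiguate.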
If you simply don't like the java -jar syntax, then it's a different matter.
This error seems to be in the schema. Can you make a PR and link back to this issue?
$ LANG=en_US java -classpath XMLValidator.jar:xpath2-1.2.0.jar:xercesImpl.jar:xercesSamples.jar:xml-apis.jar XMLValidator schema/tokenscript.xsd ../token-api-poc/tokenscripts/COFI.xml
COFI.xml is not valid because
cvc-identity-constraint.4.3: Key 'typeRef' with value 'Transfer' not found for identity constraint of element 'token'.
This was my next finding. I am not getting the error that you reported earlier, either with Xerces or with the Oxygen editor, but I am getting the error that you just reported.
The schema is expecting the XML block below in the XML file. Do you mean that I should fix the schema and make the type attribute optional?
Then the validator is working correctly except the error reported isn't human readable!
$ LANG=en_US java -classpath XMLValidator.jar:xpath2-1.2.0.jar:xercesImpl.jar:xercesSamples.jar:xml-apis.jar XMLValidator schema/tokenscript.xsd ../token-api-poc/tokenscripts/COFI.xml
COFI.xml is valid.
Keeping this open until there is a tool, so that the XSD 1.1 rules can be uncommented once the documentation on how to validate is updated.
The approach I would take is:
git clone https://git.shibboleth.net/git/xmlsectool
--sign and --verify to just sign and verify
† 3.0.0 is an in-development version expected to come out in 2021, but 2.0.0, the current stable version, has very old libraries and has bugs with some of our processes. As a result of this approach, the code should be written with Java 11, as it is the default platform of xmlsectool. Please try to use the latest Java API, as backward compatibility is not desired.
‡ The current xmlsectool supports validation already, but it is not using Xerces with Schema 1.1 support (verified). Xerces seems to be the only one that can validate files that have entity references, which we need.
It is desirable to keep the possibility to sync up with future releases of xmlsectool, so you might choose to add instead of replace (e.g. add a subcommand for validation with Xerces instead of replacing what was there), and use sub-classing instead of changing much of the source code.
Further communication Updates from Telegram:
Weiwu: Stay connected, you need to prioritise making the command-line tool that supports only validate (using the schema location in the XML header only; I have a reason for that) and canonicalisation, and that supports multi-file processing and multi-command processing. You should not prioritise XML signing and verification, as I can get by with sectool for the next few weeks.
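Validating against only the schema location given in the XML header can be sketched with plain JAXP: a `Schema` created by the no-argument `SchemaFactory.newSchema()` locates its schema from the location hints in the instance document. The sketch below is an illustration under assumptions: the class name is hypothetical, it requests the XSD 1.0 factory so that it runs on a stock JDK (with the Xerces XSD 1.1 JARs one would presumably request the 1.1 schema language instead), and the temp-directory helper exists only to make the example self-contained.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXParseException;

// Hypothetical class; validates using only the schema named in the document itself.
final class HeaderSchemaValidation {
    // Returns the validation errors; an empty list means the document is valid.
    static List<String> validate(Path xml) {
        List<String> errors = new ArrayList<>();
        try {
            // newSchema() without arguments: the schema is found via
            // xsi:schemaLocation / xsi:noNamespaceSchemaLocation hints.
            Validator validator = SchemaFactory
                    .newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                    .newSchema()
                    .newValidator();
            validator.setErrorHandler(new ErrorHandler() {
                @Override public void warning(SAXParseException e) { /* ignored in this sketch */ }
                @Override public void error(SAXParseException e) { errors.add(e.getMessage()); }
                @Override public void fatalError(SAXParseException e) { errors.add(e.getMessage()); }
            });
            // Passing a File sets the system ID, so relative hints resolve
            // against the document's own location.
            validator.validate(new StreamSource(xml.toFile()));
        } catch (Exception e) {
            errors.add(e.toString());
        }
        return errors;
    }

    // Demo helper: writes schema.xsd and doc.xml side by side, then validates doc.xml.
    static List<String> validateInTempDir(String xsd, String xml) {
        try {
            Path dir = Files.createTempDirectory("header-schema");
            Files.writeString(dir.resolve("schema.xsd"), xsd);
            Path doc = Files.writeString(dir.resolve("doc.xml"), xml);
            return validate(doc);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```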
Why do we need canonicalization? I just want to know a little more about canonicalization in our existing stuff.
We actually don't need that, just entity dereferencing. So anything that can correctly read an XML file with entity references in it and is able to serialise it into a single XML file will do the job for now.
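Entity dereferencing of this kind can be sketched with the JDK's DOM parser and an identity transform: the parser expands entity references while building the tree, and the serialiser writes the expanded tree back out as one document. The class name is hypothetical, and the example uses an internal entity so that it is self-contained; external entities (files referenced with SYSTEM) would additionally depend on the input's system ID and on the parser's external-access settings, which this sketch does not configure.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper; reads XML with entity references and writes a single expanded document.
final class EntityExpander {
    static String expand(InputSource in) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        // Replace entity references with their content while parsing (the JAXP default).
        factory.setExpandEntityReferences(true);
        Document doc = factory.newDocumentBuilder().parse(in);

        // An identity transform serialises the already expanded DOM tree.
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        StringWriter out = new StringWriter();
        identity.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    // Convenience wrapper for in-memory XML text.
    static String expandString(String xml) {
        try {
            return expand(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(expandString(
                "<!DOCTYPE t [<!ENTITY greeting \"hello\">]><t>&greeting;</t>"));
    }
}
```

This is not canonicalization; it only produces a single file in which the entity references have been replaced by their content.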
xmlsectool vs Core Java xerces based validator:
It looks like the main focus of xmlsectool is signing XML documents. I also do not find the xmlsectool documentation clear; there is very little information available. If you have found more detailed official documentation than the pages mentioned below, please direct me there.
https://wiki.shibboleth.net/confluence/display/XSTJ2/xmlsectool+V2+Home
https://wiki.shibboleth.net/confluence/display/CONCEPT/MetadataCorrectness#MetadataCorrectness-SchemaValidation.5
I do not see any special advantage in using xmlsectool for schema validation and entity de-referencing, so I am in favour of writing our own simple tool using the Xerces JAR.
Hi Weiwu,
I have completed the multi-file validation and attached is the Java code. Can you create a separate repository where I can commit the code? I have created my own private repository but cannot add collaborators, as I do not have an enterprise Git subscription.
I will start on entity de-referencing. Where will we store the de-referenced file? Or do we need to overwrite the same XML? For now I can create a new XML file to save the result of the de-referencing action.
Hi Weiwu,
I am committing my changes in my forked repository - https://github.com/darakhbharat/TokenScript.git. I created a new directory named xml-validation-against-xsd-1.1 to commit the changes.
Overview:
Here is the command:
$ java -classpath "xercesImpl.jar;xercesSamples.jar;xml-apis.jar;xpath2-1.2.0.jar;XMLValidator.jar" XMLValidator -val -deref H:/alphawallet/TokenScript/schema/tokenscript.xsd H:/alphawallet/tokenscripts/COFI.xml
Things needed to be improved:
You will not be able to reproduce this because I commented out the offending line in tokenscript.xsd
To reproduce this problem, uncomment the two lines mentioned in #388 and edit the test xml file (in this case COFI but any tokenscript file will do) to use the edited tokenscript.xsd then you can see this problem.
Note that I am already using the version of xerces that supports xml-schema 1.1