xsf / xeps

Hosts the markup for all XMPP Protocol Extensions.
https://xmpp.org/extensions/
Other
125 stars 119 forks source link

fixes #1235: Script that validates XEP-0001 conformance #1242

Closed guusdk closed 1 year ago

guusdk commented 1 year ago

This adds a simple bash script that checks if a provided file (filename to be provided as first argument) conforms to XEP-0001.

Expected to be executed from the directory that holds all xep XML files. Requires one argument: the file name of the xep to be validated, eg:

$ tools/validate-xep0001-conformance.sh xep-0010.xml

exit status will be non-zero upon validation failure.

requires: bash, xmllint

guusdk commented 1 year ago

I'm not much of a scripter. I went with the simplest tooling that I could think of: a basic Bash script with minimal dependencies, that should be easy to maintain.

This PR is in draft state, as not all validation checks as requested in the issue have been implemented. Before investing much more time in this, I'd like feedback on the approach.

I have performed cursory, but not full tests.

Example output (of a file modified to be invalid):

$ tools/validate-xep0001-conformance.sh xep-0002.xml
[PASS] DTD conformance against xep.dtd
[INFO] The filename ('xep-0002.xml') matches 'xep-[0-9]{4}.xml'.
[FAIL] The number in the filename ('xep-0002.xml') does not equals XPATH value /xep/header/number/text() ('0004') (but should).
[FAIL] XPATH value /xep/header/status/text() ('Activesdfs') does not equals a defined status (but should).
[INFO] implementation of validation of XPATH value /xep/header/type/text() ('Procedural') is pending!
[FAIL] XPATH value /xep/header/approver/text() ('Boardss') does not equals 'Board' or 'Council' (but should).
[INFO] implementation of validation version numbers in the revision blocks is pending!
[INFO] implementation of xep.xsl XML stylesheet usage is pending!
[INFO] implementation of inclusion of correct legal notice is pending!

Issues found!

Output for a valid file:

$ tools/validate-xep0001-conformance.sh xep-0010.xml
[PASS] DTD conformance against xep.dtd
[INFO] The filename ('xep-0010.xml') matches 'xep-[0-9]{4}.xml'.
[PASS] The number in the filename ('xep-0010.xml') equals XPATH value /xep/header/number/text() ('0010').
[PASS] XPATH value /xep/header/status/text() ('Obsolete') equals a defined status.
[INFO] implementation of validation of XPATH value /xep/header/type/text() ('SIG Formation') is pending!
[PASS] XPATH value /xep/header/approver/text() ('Board') equals either 'Board' or 'Council'.
[INFO] implementation of validation version numbers in the revision blocks is pending!
[PASS] XPATH value /xep/header/approver/text() ('Board') is 'Board' and XPATH value /xep/header/type/text() is not 'Standards Track'.
[INFO] implementation of xep.xsl XML stylesheet usage is pending!
[INFO] implementation of inclusion of correct legal notice is pending!

No issues found (but not all checks are implemented).
guusdk commented 1 year ago

~Added all but the validation for 'legal notice'. I both need to understand exactly what are valid definitions, and have more time than what I have right now to finish that.~ I've now added this validation.

I'm not particularly proud of the duplication in the implementation of the 'header type' validation, and am open to suggestions for improving things.

guusdk commented 1 year ago

All validation checks (see footnote) have now been implemented.

For testing, I have tried executing it against all XEP XML files in the root of this repository. I found that this test approach is a bit problematic, as the validation should be applied to changes to XEPs. When applied to preexisting XEPs, not all validation issues that are found can be fixed, I suppose.

One example of this is the validation that checks for the existence of an approver. Many XEPs already in the repository do not define one. Note that the stylesheet that is applied will use a default approver, when the XML does not define one: the rendered output will still have an approver.

With the approver-validation disabled, these files fail the validation script:

I do not think that these failures necessarily need to be blocking issues for this PR to be accepted, which is why I have dropped the 'draft' status of this PR.

footnote: For the validation of the legalese text in the XEP, I have opted to only check for the entity reference to be present, not for the entire text to be. Interesting challenges with markup pop up if you do that. In practise, existing XEPs all seem to use the entity reference anyway.

guusdk commented 1 year ago

I wonder if the validation that checks for the Approver to be either Board or Council covers the case of a XEP being Retracted by its author. The act of retracting does not seem to require approval of either Board or Council.