phax / ph-schematron

Java Schematron library that supports XSLT and native application
Apache License 2.0

XSLT vs. PURE - different number of failed-asserts #169

Closed by robertmarkolwitzatnortal 3 months ago

robertmarkolwitzatnortal commented 4 months ago

Hi Philip,

In the project I am currently working on, we have been using Pure validation until now, but we are evaluating XSLT validation because the number of rules and asserts is quite large and XSLT performs much better on complex structures.

I have pre-generated the XSLT via and am loading the XSLT via SchematronResourceXSLT.fromClassPath.

While running our existing test suite for both Pure and XSLT validation and comparing the results, I found that the resulting schematron-output differs in the number of <fired-rule/> elements as well as the number of <failed-assert/> elements.

Now, I understand that XSLT requires a phase, but the Schematron files contain neither <phase/> elements nor a defaultPhase attribute, so I generated the XSLT without specifying an explicit phase.

I am attaching a zip archive with some files for further analysis (since they are quite large):

It would be fantastic if you could take some time to investigate this matter and provide a hint on where I might have taken a wrong turn.

From my understanding, outputs of XSLT-Validation and PURE-Validation can actually differ. But if no <phase/> is given in the schematron files and can hence not be specified for XSLT pre-generation, how can I enforce the same number of rules fired or at least the identical set of failed-asserts across both validation modes?

Thank you very much in advance for your help! :)

Cheers, Robert

Attachment: schematron_eu-1_0_0.zip

robertmarkolwitzatnortal commented 4 months ago

Hi @phax,

do you see any chance to provide feedback on my case over the next week?

Cheers, Robert

phax commented 4 months ago

Hi Robert, I hope so. Vacation and a busy work schedule make it a bit difficult, but I will try my best.

phax commented 4 months ago

Here is the list of additional rules fired by Pure but not by XSLT:

/*/cac:ProcurementProjectLot[cbc:ID/@schemeName='Lot']/cac:TenderingProcess/cac:FrameworkAgreement/cac:SubsequentProcessTenderRequirement[cbc:Name/text()='buyer-categories']
/*/cac:ProcurementProjectLot[cbc:ID/@schemeName='Lot']/cac:TenderingProcess/cac:FrameworkAgreement/cac:SubsequentProcessTenderRequirement[cbc:Name/text()='buyer-categories']
/*/cac:ProcurementProjectLot[cbc:ID/@schemeName='Lot']/cac:TenderingProcess/cac:ProcessJustification[cbc:ProcessReasonCode/@listName='no-esubmission-justification']
/*/cac:ProcurementProjectLot[cbc:ID/@schemeName='Lot']/cac:TenderingProcess/cac:ProcessJustification[cbc:ProcessReasonCode/@listName='no-esubmission-justification'][$noticeSubType = '16']
/*/cac:ProcurementProjectLot[cbc:ID/@schemeName='Lot']/cac:TenderingTerms/cac:CallForTendersDocumentReference[cbc:DocumentType/text()='restricted-document']
/*/cac:ProcurementProjectLot[cbc:ID/@schemeName='Lot']/cac:TenderingTerms/cac:CallForTendersDocumentReference[cbc:DocumentType/text()='restricted-document']
/*/cac:ProcurementProjectLot[cbc:ID/@schemeName='Lot']/cac:TenderingTerms/cac:CallForTendersDocumentReference[cbc:DocumentType/text()='restricted-document'][$noticeSubType = '16']
/*/cac:ProcurementProjectLot[cbc:ID/@schemeName='Lot']/cac:TenderingTerms/cac:CallForTendersDocumentReference[cbc:DocumentType/text()='restricted-document'][$noticeSubType = '16']
/*/cac:ProcurementProjectLot[cbc:ID/@schemeName='Lot']/cac:TenderingTerms/cac:CallForTendersDocumentReference[not(cbc:DocumentType/text()='restricted-document')]
/*/cac:ProcurementProjectLot[cbc:ID/@schemeName='Lot']/cac:TenderingTerms/cac:CallForTendersDocumentReference[not(cbc:DocumentType/text()='restricted-document')]
/*/cac:ProcurementProjectLot[cbc:ID/@schemeName='Lot']/cac:TenderingTerms/cac:CallForTendersDocumentReference[not(cbc:DocumentType/text()='restricted-document')]
/*/cac:ProcurementProjectLot[cbc:ID/@schemeName='Lot']/cac:TenderingTerms/cac:CallForTendersDocumentReference[not(cbc:DocumentType/text()='restricted-document')]
/*/cac:ProcurementProjectLot[cbc:ID/@schemeName='Lot']/cac:TenderingTerms/cac:CallForTendersDocumentReference[not(cbc:DocumentType/text()='restricted-document')]
/*/cac:ProcurementProjectLot[cbc:ID/@schemeName='Lot']/cac:TenderingTerms/cac:CallForTendersDocumentReference[not(cbc:DocumentType/text()='restricted-document')]
/*/cac:ProcurementProjectLot[cbc:ID/@schemeName='Lot']/cac:TenderingTerms/cac:CallForTendersDocumentReference[not(cbc:DocumentType/text()='restricted-document')][$noticeSubType = '16']
/*/cac:ProcurementProjectLot[cbc:ID/@schemeName='Lot']/cac:TenderingTerms/cac:CallForTendersDocumentReference[not(cbc:DocumentType/text()='restricted-document')][$noticeSubType = '16']
/*/cac:ProcurementProjectLot[cbc:ID/@schemeName='Lot']/cac:TenderingTerms/cac:CallForTendersDocumentReference[not(cbc:DocumentType/text()='restricted-document')][$noticeSubType = '16']
/*/cac:ProcurementProjectLot[cbc:ID/@schemeName='Lot']/cac:TenderingTerms/cac:CallForTendersDocumentReference[not(cbc:DocumentType/text()='restricted-document')][$noticeSubType = '16']
/*/cac:ProcurementProjectLot[cbc:ID/@schemeName='Lot']/cac:TenderingTerms/cac:CallForTendersDocumentReference[not(cbc:DocumentType/text()='restricted-document')][$noticeSubType = '16']
/*/cac:ProcurementProjectLot[cbc:ID/@schemeName='Lot']/cac:TenderingTerms/cac:CallForTendersDocumentReference[not(cbc:DocumentType/text()='restricted-document')][$noticeSubType = '16']
/*/cac:TenderingProcess/cac:ProcessJustification[cbc:ProcessReasonCode/@listName='accelerated-procedure']
/*/cac:TenderingProcess/cac:ProcessJustification[cbc:ProcessReasonCode/@listName='accelerated-procedure'][$noticeSubType = '16']
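
A list like the one above can be derived by comparing the <svrl:fired-rule/> elements of the two SVRL outputs. The following is a minimal sketch using only the JDK. The class and method names (SvrlFiredRuleDiff, countFiredRules, extraFiredRules) are made up for illustration and are not part of ph-schematron, and the sketch assumes both results are already available as SVRL XML strings.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of ph-schematron.
final class SvrlFiredRuleDiff {
    static final String SVRL_NS = "http://purl.oclc.org/dsdl/svrl";

    // Counts how often each rule context occurs in the <svrl:fired-rule> elements.
    static Map<String, Integer> countFiredRules(String svrlXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            NodeList rules = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(svrlXml)))
                    .getElementsByTagNameNS(SVRL_NS, "fired-rule");
            Map<String, Integer> counts = new TreeMap<>();
            for (int i = 0; i < rules.getLength(); i++) {
                String context = ((Element) rules.item(i)).getAttribute("context");
                counts.merge(context, 1, Integer::sum);
            }
            return counts;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse SVRL", e);
        }
    }

    // Returns the contexts that fired more often in 'left' than in 'right',
    // one entry per surplus firing, sorted by context.
    static List<String> extraFiredRules(String leftSvrl, String rightSvrl) {
        Map<String, Integer> right = countFiredRules(rightSvrl);
        List<String> extra = new ArrayList<>();
        countFiredRules(leftSvrl).forEach((context, count) -> {
            int surplus = count - right.getOrDefault(context, 0);
            for (int i = 0; i < surplus; i++) {
                extra.add(context);
            }
        });
        return extra;
    }
}
```

Calling extraFiredRules(pureSvrl, xsltSvrl) yields one entry per surplus firing, which is why the same context can show up several times in such a list.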

So basically I assume it boils down to the specifics of how XSLT works. E.g. if this rule is matched:

/*/cac:ProcurementProjectLot[cbc:ID/@schemeName='Lot']/cac:TenderingProcess/cac:FrameworkAgreement/cac:SubsequentProcessTenderRequirement[cbc:Name/text()='buyer-categories'][$noticeSubType = '24']

then the following rule cannot be matched:

/*/cac:ProcurementProjectLot[cbc:ID/@schemeName='Lot']/cac:TenderingProcess/cac:FrameworkAgreement/cac:SubsequentProcessTenderRequirement[cbc:Name/text()='buyer-categories']

because the element

/*/cac:ProcurementProjectLot[cbc:ID/@schemeName='Lot']/cac:TenderingProcess/cac:FrameworkAgreement/cac:SubsequentProcessTenderRequirement

was already handled.

In Pure, the rules are executed in the order they are provided, whereas XSLT does it slightly differently.
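
The difference can be illustrated with a toy model. This is not ph-schematron code: the Rule class and both methods are hypothetical, and the model reduces a rule context to a test on the notice subtype. fireFirst mimics the behaviour described above for XSLT (once an element was handled by a rule, later rules no longer match it), while fireAll mimics what the longer Pure list in this thread suggests (every matching rule fires).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Toy model only; names and structure are assumptions for illustration.
final class RuleMatchingDemo {
    // Hypothetical stand-in for a Schematron rule: a label plus a context test.
    static final class Rule {
        final String context;
        final Predicate<String> matches;

        Rule(String context, Predicate<String> matches) {
            this.context = context;
            this.matches = matches;
        }
    }

    // Every rule whose context test matches the element fires.
    static List<String> fireAll(List<Rule> rules, String noticeSubType) {
        List<String> fired = new ArrayList<>();
        for (Rule rule : rules) {
            if (rule.matches.test(noticeSubType)) {
                fired.add(rule.context);
            }
        }
        return fired;
    }

    // Only the first matching rule fires; the element then counts as handled.
    static List<String> fireFirst(List<Rule> rules, String noticeSubType) {
        List<String> fired = new ArrayList<>();
        for (Rule rule : rules) {
            if (rule.matches.test(noticeSubType)) {
                fired.add(rule.context);
                break;
            }
        }
        return fired;
    }
}
```

With a specific rule (subtype equals '24') listed before a general rule (always matches), fireFirst reports only the specific rule for subtype '24', whereas fireAll reports both.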

hth

robertmarkolwitzatnortal commented 3 months ago

Hi @phax,

thank you for taking the time to dig into it and for providing comprehensive feedback. Closing the issue :)