ConnectingEurope / eInvoicing-EN16931

Validation artefacts for the European eInvoicing standard EN 16931

Infinite loop when using schematron from Python lxml #72

Closed alexis-via closed 6 years ago

alexis-via commented 6 years ago

I'm trying to use the CII or UBL Schematron from Python's lxml library, but I always get an infinite loop.

Here is a scenario to reproduce the bug:

cd /tmp
git clone https://github.com/CenPC434/validation.git

Then I run the following python script:

#! /usr/bin/python2.7

from lxml import etree
from lxml import isoschematron

with open('/tmp/validation/cii/schematron/EN16931-CII-validation.sch', 'rb') as f:
    sct_root = etree.parse(f)
print("sct_root=%s" % sct_root)
# print(etree.tostring(sct_root, pretty_print=True))
schematron = isoschematron.Schematron(sct_root)
print("schematron=%s" % schematron)

I get an infinite loop when reaching isoschematron.Schematron(). I also have an infinite loop with the UBL schematron.

I don't know if the bug is caused by the schematron or if it's in python's lxml lib. So I also reported a bug to lxml: https://bugs.launchpad.net/lxml/+bug/1783890

phax commented 6 years ago

Hi. Are you sure it's an infinite loop, and not simply taking a very long time? It sometimes takes up to 30 minutes to generate the XSLT from the SCH files!

alexis-via commented 6 years ago

30 minutes!!! No, I didn't wait that long. I'm running it again now to see whether it works if I wait long enough.

alexis-via commented 6 years ago

You are right, I had to wait longer! After 20 minutes, it fails:

Traceback (most recent call last):
  File "./fx-schematron.py", line 10, in <module>
    schematron = isoschematron.Schematron(sct_root, store_report=True)
  File "/usr/local/lib/python2.7/dist-packages/lxml/isoschematron/__init__.py", line 285, in __init__
    validator_xslt = self._compile(schematron, **compile_params)
  File "src/lxml/xslt.pxi", line 600, in lxml.etree.XSLT.__call__
lxml.etree.XSLTApplyError: Fail: This implementation of ISO Schematron does not work with
        schemas using the "xslt2" query language.

phax commented 6 years ago

Does your library have a way to use precompiled XSLTs? That would simplify your life - right?

alexis-via commented 6 years ago

According to this post https://stackoverflow.com/questions/46767903/schematronparseerror-invalid-schematron-schema-for-isosts-schema there is no implementation of Schematron in Python that supports queryBinding="xslt2". So I guess we can close this bug report, as it seems there is no solution to this problem in Python :-(
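Since the failure only shows up after the long compile step, it may be worth reading the schema's queryBinding attribute up front. The sketch below is an illustration, not part of lxml: the helper names query_binding and lxml_can_compile are hypothetical, it uses only the standard library, and it makes the conservative assumption that "xslt" is the only binding lxml's isoschematron will accept.

```python
import io
import xml.etree.ElementTree as ET

# Namespace of ISO Schematron schemas
SCH_NS = "http://purl.oclc.org/dsdl/schematron"


def query_binding(sch_source):
    """Return the queryBinding declared on the Schematron root, lower-cased.

    sch_source is a file path or a binary file object. ISO Schematron
    treats a missing attribute as the default binding, "xslt".
    """
    root = ET.parse(sch_source).getroot()
    if root.tag != "{%s}schema" % SCH_NS:
        raise ValueError("not an ISO Schematron schema: %s" % root.tag)
    return root.get("queryBinding", "xslt").lower()


def lxml_can_compile(sch_source):
    """Hypothetical pre-check (assumption: only "xslt" is safe for lxml)."""
    return query_binding(sch_source) == "xslt"


# Tiny in-memory schema standing in for EN16931-CII-validation.sch
sample = b'<schema xmlns="http://purl.oclc.org/dsdl/schematron" queryBinding="xslt2"/>'
print(query_binding(io.BytesIO(sample)))
print(lxml_can_compile(io.BytesIO(sample)))
```

Running such a check before calling isoschematron.Schematron() would report the unsupported binding in well under a second instead of after a 20-minute compile.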

phax commented 6 years ago

Sorry :(

alexis-via commented 6 years ago

@phax Do you mean that this XSLT file here https://github.com/CenPC434/validation/blob/master/cii/xslt/EN16931-CII-validation.xslt is a pre-compiled version of the CII Schematron, so that I could use this XSLT file directly?
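Whether a pre-compiled stylesheet can be used directly depends on the XSLT version it declares, because lxml is built on libxslt, which implements XSLT 1.0 only. The following sketch reads the version attribute with the standard library. The helper names xslt_version and needs_xslt2_processor are hypothetical, and the check assumes the stylesheet declares its version honestly on the root element.

```python
import io
import xml.etree.ElementTree as ET

# Namespace of XSLT stylesheets
XSL_NS = "http://www.w3.org/1999/XSL/Transform"


def xslt_version(xslt_source):
    """Return the version attribute of an xsl:stylesheet or xsl:transform root.

    xslt_source is a file path or a binary file object.
    """
    root = ET.parse(xslt_source).getroot()
    allowed = ("{%s}stylesheet" % XSL_NS, "{%s}transform" % XSL_NS)
    if root.tag not in allowed:
        raise ValueError("not an XSLT stylesheet: %s" % root.tag)
    return root.get("version")


def needs_xslt2_processor(xslt_source):
    """True if the declared version is 2.0 or later (beyond what libxslt implements)."""
    version = xslt_version(xslt_source)
    return version is not None and float(version) >= 2.0


# Tiny in-memory stylesheet standing in for EN16931-CII-validation.xslt
sample = b'<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0"/>'
print(needs_xslt2_processor(io.BytesIO(sample)))
```

A stylesheet for which this returns True is better handed to an XSLT 2.0 processor than to etree.XSLT().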

alexis-via commented 6 years ago

When I try to use the XSLT file provided in this git repo, "cii/xslt/EN16931-CII-validation.xslt", it fails on line 2129:

Traceback (most recent call last):
  File "./fx-xslt.py", line 14, in <module>
    transform = etree.XSLT(xslt_root)
  File "src/lxml/xslt.pxi", line 410, in lxml.etree.XSLT.__init__
lxml.etree.XSLTParseError: xsl:when : could not compile test expression '/rsm:CrossIndustryInvoice/rsm:SupplyChainTradeTransaction/ram:ApplicableHeaderTradeAgreement/ram:SellerTradeParty/ram:SpecifiedTaxRegistration/ram:ID[@schemeID = ('VAT', 'FC')] or /rsm:CrossIndustryInvoice/rsm:SupplyChainTradeTransaction/ram:ApplicableHeaderTradeAgreement/ram:SellerTaxRepresentativeTradeParty/ram:SpecifiedTaxRegistration/ram:ID[@schemeID = 'VAT']'

The small Python script I used for that is the following:

#! /usr/bin/python

from lxml import etree

# Parse the pre-compiled XSLT straight from its path
xslt_root = etree.parse('/home/alexis/new_boite/dev/schematron-fx/cii/xslt/EN16931-CII-validation.xslt')
print("xslt_root=%s" % xslt_root)

# Parse the invoice to validate
fx_xml_root = etree.parse('/home/alexis/tmp/factur-x-demov7.xml')
print("fx_xml_root=%s" % fx_xml_root)

# Compiling the stylesheet is the step that raises XSLTParseError
transform = etree.XSLT(xslt_root)
print("transform=%s" % transform)
newdom = transform(fx_xml_root)
print("newdom=%s" % newdom)
print(etree.tostring(newdom, pretty_print=True, encoding='unicode'))

kosek commented 6 years ago

Python doesn't support XSLT 2.0 by default: lxml is built on libxslt, which implements only XSLT 1.0. However, you can invoke an external XSLT 2.0 engine such as Saxon. Or perhaps someone has already created a Python wrapper around Saxon/C. See http://www.saxonica.com/saxon-c/index.xml
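One way to invoke an external engine from Python is to shell out to the Saxon command line, as in this minimal sketch. It assumes Java and a Saxon-HE jar are installed locally. The jar location and the helper names saxon_command and run_saxon are hypothetical. The -s:, -xsl: and -o: options are Saxon's standard source, stylesheet and output arguments.

```python
import subprocess


def saxon_command(jar_path, source_xml, stylesheet, output_path):
    """Build the Saxon command line for a single transformation."""
    return [
        "java", "-jar", jar_path,
        "-s:" + source_xml,    # document to validate
        "-xsl:" + stylesheet,  # pre-compiled Schematron XSLT
        "-o:" + output_path,   # where the validation report is written
    ]


def run_saxon(jar_path, source_xml, stylesheet, output_path):
    """Run Saxon and return the path of the report it wrote.

    check_call raises CalledProcessError if Saxon exits with a
    non-zero status, for example when the stylesheet fails to compile.
    """
    subprocess.check_call(
        saxon_command(jar_path, source_xml, stylesheet, output_path))
    return output_path
```

For instance, run_saxon("/opt/saxon/saxon-he.jar", "/home/alexis/tmp/factur-x-demov7.xml", "cii/xslt/EN16931-CII-validation.xslt", "/tmp/report.xml") would write the validation report to /tmp/report.xml, which can then be parsed with lxml as ordinary XML. The jar path here is only a placeholder.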

phax commented 6 years ago

@alexis-via Yes, that was the idea. The XSLT you mention is a pre-compiled version of the Schematron. Because what usually happens is the following:

corymosiman12 commented 5 years ago

Any updates on a means to validate xslt2, either through lxml or other?