sissaschool / xmlschema

XML Schema validator and data conversion library for Python
MIT License

Behavior of processContents="lax" changed #298

Closed donmendelson closed 2 years ago

donmendelson commented 2 years ago

Behavior changed unexpectedly in xmlschema with regard to <xs:any processContents="lax"/>. In version 1.08, it worked as expected if the "foreign" schema was not found--no errors were produced. However, after updating to version 1.11, it throws an exception.

File "...\xmlschema\converters\jsonml.py", line 81, in element_encode
    raise XMLSchemaTypeError(msg.format(type(obj)))
xmlschema.exceptions.XMLSchemaTypeError: The first argument must be a sequence, <class 'str'> provided

In debugging, it was determined that the str in question is the label of an element in an included schema.

Was this an intended change? Is there a work-around?

donmendelson commented 2 years ago

I downgraded to version 1.10.0 and behavior was as expected, so only 1.11.0 is anomalous.

brunato commented 2 years ago

Hi, it's related to JsonMLConverter, which has an argument check that changed between v1.10.0 and v1.11.0:

In v1.10.0 this check was:

if not isinstance(obj, MutableSequence) or not obj:
    raise XMLSchemaValueError("Wrong data format, a not empty list required: %r." % obj)

now it is:

if not isinstance(obj, MutableSequence):
    msg = "The first argument must be a sequence, {} provided"
    raise XMLSchemaTypeError(msg.format(type(obj)))
elif not obj:
    raise XMLSchemaValueError("The first argument is an empty sequence")

So a string shouldn't have been accepted in v1.10.0 either.
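To make the comparison concrete, here is a minimal sketch of the two checks side by side. It uses the built-in ValueError and TypeError as stand-ins for XMLSchemaValueError and XMLSchemaTypeError (an assumption, so that the snippet runs without xmlschema installed), and the names check_v1_10, check_v1_11 and outcome are hypothetical.

```python
from collections.abc import MutableSequence


def check_v1_10(obj):
    """Stand-in for the v1.10.0 check: one ValueError covers both bad cases."""
    if not isinstance(obj, MutableSequence) or not obj:
        raise ValueError("Wrong data format, a not empty list required: %r." % obj)


def check_v1_11(obj):
    """Stand-in for the v1.11.0 check: TypeError for non-sequences, ValueError for empty ones."""
    if not isinstance(obj, MutableSequence):
        msg = "The first argument must be a sequence, {} provided"
        raise TypeError(msg.format(type(obj)))
    elif not obj:
        raise ValueError("The first argument is an empty sequence")


def outcome(check, obj):
    """Return the name of the exception raised by check(obj), or 'ok'."""
    try:
        check(obj)
    except (TypeError, ValueError) as err:
        return type(err).__name__
    return "ok"


# Compare both checks on a bare string, an empty list and a valid list.
for sample in ("zz:ForeignSchema", [], ["zz:ForeignSchema"]):
    print(repr(sample), outcome(check_v1_10, sample), outcome(check_v1_11, sample))
```

Both versions reject a string; what differs is the type of the exception raised for it.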

Something probably goes wrong in another part that changed. Is the version of elementpath the same in both cases?

Can you provide a sample (at least an object to be encoded to XML that produces the error)?

donmendelson commented 2 years ago

A minimalist test is not reproducing the problem. I will have to explore what else is causing a difference between v1.10.0 and v1.11.0.

brunato commented 2 years ago

The differences between v1.10.0 and v1.11.0 are mainly in the error strings, which were revised for translation. As for encoding, the only commit that could be involved is the one that fixed the encoding of +/-inf and nan values (PR #295). Try checking out the commit just before it and then that commit itself, to verify whether it is the cause of the problem.

donmendelson commented 2 years ago

I now have sample code that runs without exception with v1.10.0 but fails with v1.11.0. This was run with Python 3.9.

pyxmlschematest.zip

==================================================== FAILURES =======================================================
_______________________________________________ test_write_with_foreign _______________________________________________

    def test_write_with_foreign():
        instance = ['xt:Root', ['Container', ['Freeform', ['zz:ForeignSchema']]]]
        writer = XmlWriter()
        output_path = os.path.join(XML_FILE_DIR, 'xmlschematest-out2.xml')
        with open(output_path, 'wb') as f:
>           errors = writer.write_xml(instance, f)

XmlschemaTest\test_xmlschema.py:29:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
XmlschemaTest\xmlschema_validator.py:25: in write_xml
    data, errors = self.xsd.encode(instance, validation='lax', use_defaults=False,
..\AppData\Local\Programs\Python\Python39\lib\site-packages\xmlschema\validators\schemas.py:2131: in encode
    for result in self.iter_encode(obj, path, validation, *args, **kwargs):
..\AppData\Local\Programs\Python\Python39\lib\site-packages\xmlschema\validators\schemas.py:2116: in iter_encode
    yield from xsd_element.iter_encode(obj, validation, use_defaults=use_defaults,
..\AppData\Local\Programs\Python\Python39\lib\site-packages\xmlschema\validators\elements.py:1025: in iter_encode
    for result in xsd_type.content.iter_encode(element_data, validation, **kwargs):
..\AppData\Local\Programs\Python\Python39\lib\site-packages\xmlschema\validators\groups.py:1214: in iter_encode
    for result in xsd_element.iter_encode(value, validation, **kwargs):
..\AppData\Local\Programs\Python\Python39\lib\site-packages\xmlschema\validators\elements.py:1025: in iter_encode
    for result in xsd_type.content.iter_encode(element_data, validation, **kwargs):
..\AppData\Local\Programs\Python\Python39\lib\site-packages\xmlschema\validators\groups.py:1214: in iter_encode
    for result in xsd_element.iter_encode(value, validation, **kwargs):
..\AppData\Local\Programs\Python\Python39\lib\site-packages\xmlschema\validators\elements.py:1025: in iter_encode
    for result in xsd_type.content.iter_encode(element_data, validation, **kwargs):
..\AppData\Local\Programs\Python\Python39\lib\site-packages\xmlschema\validators\groups.py:1214: in iter_encode
    for result in xsd_element.iter_encode(value, validation, **kwargs):
..\AppData\Local\Programs\Python\Python39\lib\site-packages\xmlschema\validators\wildcards.py:562: in iter_encode
    yield from self.any_type.iter_encode(obj, validation, **kwargs)
..\AppData\Local\Programs\Python\Python39\lib\site-packages\xmlschema\validators\complex_types.py:775: in iter_encode
    results = [x for item in value for x in xsd_element.iter_encode(
..\AppData\Local\Programs\Python\Python39\lib\site-packages\xmlschema\validators\complex_types.py:775: in <listcomp>
    results = [x for item in value for x in xsd_element.iter_encode(
..\AppData\Local\Programs\Python\Python39\lib\site-packages\xmlschema\validators\elements.py:937: in iter_encode
    element_data = converter.element_encode(obj, self, level)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <xmlschema.converters.jsonml.JsonMLConverter object at 0x0000022653739240>, obj = 'zz:ForeignSchema'
xsd_element = XsdElement(name='{http://fixprotocol.io/xmlschematest2}ForeignSchema', occurs=[1, 1]), level = 3

    def element_encode(self, obj: Any, xsd_element: 'XsdElement', level: int = 0) -> ElementData:
        attributes: Dict[str, Any] = {}

        if not isinstance(obj, MutableSequence):
            msg = "The first argument must be a sequence, {} provided"
>           raise XMLSchemaTypeError(msg.format(type(obj)))
E           xmlschema.exceptions.XMLSchemaTypeError: The first argument must be a sequence, <class 'str'> provided

..\AppData\Local\Programs\Python\Python39\lib\site-packages\xmlschema\converters\jsonml.py:81: XMLSchemaTypeError
=============================================== short test summary info ===============================================
FAILED XmlschemaTest/test_xmlschema.py::test_write_with_foreign - xmlschema.exceptions.XMLSchemaTypeError: The first ...
============================================= 1 failed, 2 passed in 0.84s =============================================

brunato commented 2 years ago

It is worse than I expected, because it also involves the meta-schema build ... It is related to the changes in commit 3c82a2922ef1064580430aaef1bfefaf306ec669.

It's the TypeError added to JsonMLConverter that is not intercepted and silenced the way a ValueError is. Simply silencing these errors would be wrong: I have to consider the validation argument and the particle that calls the converter.
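The effect can be sketched as follows. This is an illustration only, not xmlschema's actual code: encode_child is a hypothetical caller that intercepts ValueError (standing in for XMLSchemaValueError) and ignores it, and the two raise_* helpers are hypothetical stand-ins for the old and new converter checks.

```python
def encode_child(converter_check, obj):
    """Hypothetical caller: treats ValueError as 'skip this item', nothing else."""
    try:
        converter_check(obj)
    except ValueError:
        return None  # silenced: the item is dropped without reporting an error
    return obj


def raise_value_error(obj):
    """Stand-in for the old check, which raised a ValueError for a string."""
    raise ValueError("Wrong data format, a not empty list required: %r." % obj)


def raise_type_error(obj):
    """Stand-in for the new check, which raises a TypeError for a string."""
    raise TypeError("The first argument must be a sequence, {} provided".format(type(obj)))


# Old behaviour: the ValueError is swallowed by the caller.
print(encode_child(raise_value_error, "zz:ForeignSchema"))

# New behaviour: the TypeError is not intercepted and propagates.
try:
    encode_child(raise_type_error, "zz:ForeignSchema")
except TypeError as err:
    print("not caught by encode_child:", err)
```

Under these assumptions, a change of exception type in the converter is enough to turn a quietly skipped item into an exception that reaches the application.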

I will produce a fix ASAP.

Thank you!

brunato commented 2 years ago

There is also an error in XsdComplexType.iter_encode() that removes the list envelope from ['zz:ForeignSchema'], so one code error was being silenced by another. Going to fix them ...
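A small sketch of what losing the list envelope means, using plain Python rather than xmlschema internals (the helper is_jsonml_element is hypothetical). In JsonML an element is a list whose first item is the tag name, so ['zz:ForeignSchema'] is a single element; iterating over that list instead of passing it whole hands the converter the bare tag string.

```python
from collections.abc import MutableSequence

element = ['zz:ForeignSchema']  # one JsonML element: [tag, ...children]


def is_jsonml_element(obj):
    """A JsonML element must be a non-empty mutable sequence."""
    return isinstance(obj, MutableSequence) and bool(obj)


# Passing the element whole keeps the envelope.
print(is_jsonml_element(element))

# Iterating over it strips the envelope: each item is the bare tag string.
print([is_jsonml_element(item) for item in element])
```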

brunato commented 2 years ago

It should be fixed in v1.11.1, which also improves encoding with xs:anyType.

Best regards

donmendelson commented 2 years ago

My application ran successfully. Thanks for the fix!