spacetelescope / vo-models

https://vo-models.readthedocs.io/
MIT License

UWS JobSummary jobInfo element does not support unbounded list of complex content #18

Open jwfraustro opened 7 months ago

The UWS 1.1 specification of a jobSummary describes the jobInfo element as:

...and in addition there is a jobInfo element which can be used by implementations to include any extra information within the job description.

See: https://www.ivoa.net/documents/UWS/20161024/REC-UWS-1.1-20161024.html#jobobj

It is shown in the example XML as:

<uws:jobInfo>
    <any>
        <xml>
            <thatyouwant/>
        </xml>
    </any>
</uws:jobInfo>

More explicitly, the UWS XML schema describes the element as:

<xs:element name="jobInfo" maxOccurs="1" minOccurs="0">
    <xs:annotation>
        <xs:documentation> This is arbitrary information that can be added to the job description by
            the UWS implementation. </xs:documentation>
    </xs:annotation>
    <xs:complexType>
        <xs:sequence>
            <xs:any namespace="##any" processContents="lax" minOccurs="0" maxOccurs="unbounded" />
        </xs:sequence>
    </xs:complexType>
</xs:element>

That is, a job contains at most one <uws:jobInfo> element, which may hold any number of child elements from any namespace: essentially, "any XML you want".
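To make the schema's shape concrete, here is a minimal sketch using only the Python standard library. The job document, the `ext:queue` child, and the `urn:example:ext` namespace are invented for illustration; only the `jobInfo` wrapper and the `any/xml/thatyouwant` fragment come from the specification's example.

```python
import xml.etree.ElementTree as ET

UWS_NS = "http://www.ivoa.net/xml/UWS/v1.0"

# Hypothetical job document: the ext:queue element and its namespace
# are made up to show children from more than one namespace.
JOB_XML = f"""
<uws:job xmlns:uws="{UWS_NS}" xmlns:ext="urn:example:ext">
  <uws:jobId>job-1</uws:jobId>
  <uws:jobInfo>
    <any><xml><thatyouwant/></xml></any>
    <ext:queue priority="5">batch</ext:queue>
  </uws:jobInfo>
</uws:job>
"""

root = ET.fromstring(JOB_XML)

# maxOccurs="1": a job carries at most one jobInfo wrapper.
job_infos = root.findall(f"{{{UWS_NS}}}jobInfo")

# xs:any with maxOccurs="unbounded": the wrapper holds arbitrary child elements.
children = list(job_infos[0])
child_tags = [child.tag for child in children]

print(len(job_infos), child_tags)
```

The point of the sketch is that the children are full elements with their own tags, attributes, namespaces, and nesting, not plain strings.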

This poses a problem for this package, which attempts to strictly type-annotate the children of jobInfo in the JobSummary model.

Currently, the job_info element is defined as list[str], which is incorrect: it produces repeated jobInfo elements, each with simple (text-only) content.
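The difference between the two shapes can be sketched by building both by hand with `xml.etree.ElementTree`. This does not call pydantic-xml; the two helper functions are hypothetical and only mimic the output shape described above, with namespaces omitted for brevity.

```python
import xml.etree.ElementTree as ET


def repeated_simple_elements(values):
    """Shape implied by list[str]: one <jobInfo> per string (illustrative)."""
    job = ET.Element("job")
    for value in values:
        ET.SubElement(job, "jobInfo").text = value
    return job


def single_wrapper(children):
    """Shape the schema describes: one <jobInfo> wrapping arbitrary elements."""
    job = ET.Element("job")
    job_info = ET.SubElement(job, "jobInfo")
    job_info.extend(children)
    return job


wrong = repeated_simple_elements(["a", "b"])
right = single_wrapper([ET.Element("any"), ET.Element("other")])

print(ET.tostring(wrong, encoding="unicode"))
print(ET.tostring(right, encoding="unicode"))
```

The first document repeats jobInfo, which the schema's maxOccurs="1" does not allow; the second has a single jobInfo whose content is a list of elements.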

Simply defining job_info as job_info: str = element(...) to accept any string is a non-starter: pydantic-xml seems to attempt to parse the element's content when creating an instance of JobSummary, and fails when it finds nested XML there.

Subclassing JobSummary and adding the specific anticipated values as sub-elements does work, but it is rather clunky and doesn't handle the case at issue here, where ANY valid XML should be allowed.
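For comparison, one way to allow any XML is to treat the jobInfo children as opaque elements and keep them outside the typed model. The sketch below is an assumption about how that could look, not something this package or pydantic-xml provides: `split_job_info` and `attach_job_info` are hypothetical helpers built on the standard library.

```python
import xml.etree.ElementTree as ET

UWS_NS = "http://www.ivoa.net/xml/UWS/v1.0"
JOB_INFO_TAG = f"{{{UWS_NS}}}jobInfo"


def split_job_info(job_xml):
    """Detach the jobInfo children so the rest can go to a strict model.

    Returns (job element without jobInfo, list of detached child elements).
    """
    job = ET.fromstring(job_xml)
    job_info = job.find(JOB_INFO_TAG)
    if job_info is None:
        return job, []
    extras = list(job_info)
    job.remove(job_info)
    return job, extras


def attach_job_info(job, extras):
    """Append a single jobInfo element holding the opaque children."""
    job_info = ET.SubElement(job, JOB_INFO_TAG)
    job_info.extend(extras)
    return job


# Hypothetical input using the specification's example fragment.
JOB_XML = (
    f'<uws:job xmlns:uws="{UWS_NS}">'
    "<uws:jobId>job-1</uws:jobId>"
    "<uws:jobInfo><any><xml><thatyouwant/></xml></any></uws:jobInfo>"
    "</uws:job>"
)

job, extras = split_job_info(JOB_XML)
strict_part_tags = [child.tag for child in job]  # what a typed model would see
rebuilt = attach_job_info(job, extras)
```

The trade-off is that the extra content is carried as untyped elements rather than validated fields, which matches the "any XML you want" wording but gives up type checking for that part of the document.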

This problem stems from the tension between rigorous, strongly typed models and the loose nature of the UWS specification (and IVOA specifications as a whole). Hopefully the work of the P3T working group will resolve this in the long run, but it should still be addressed here, for backwards compatibility with any 1.1+ changes that are made.

See https://github.com/dapper91/pydantic-xml/issues/100 for a similar issue.