tlambert03 / ome-types

native Python dataclasses for the OME data model
https://ome-types.readthedocs.io/en/latest/
MIT License

namespacing of Structured Annotation data? #68

Closed toloudis closed 1 year ago

toloudis commented 3 years ago

I'm having a little trouble understanding how StructuredAnnotations are intended to be validated. We have OME-TIFFs with large StructuredAnnotations blocks that essentially inline other "arbitrary" XML as XMLAnnotations. (In this case, Zeiss CZI metadata has been stashed in there... please do not judge just yet!)

I can call from_xml, and the XML is converted to ome-types objects without error. If I then call to_xml and from_xml again, I get an exception:

Traceback (most recent call last):
  File "C:\Users\danielt\AppData\Local\Continuum\anaconda3\envs\cellbrowser-tools\lib\site-packages\xmlschema\validators\global_maps.py", line 127, in lookup
    obj = global_map[qname]
KeyError: '{http://www.openmicroscopy.org/Schemas/OME/2016-06}ImageDocument'

where I believe the OME schema namespace was added by ome-types. Does that sound like what is going on here?
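
A minimal sketch of how a foreign element can end up with an OME-qualified name like the one in the KeyError above. It uses only the standard library, not ome-types. The XML fragment is hypothetical, and it assumes the original file reset the default namespace with xmlns="" on the embedded element and that the reset was lost on re-serialization. The thread does not confirm either assumption.

```python
import xml.etree.ElementTree as ET

OME_NS = "http://www.openmicroscopy.org/Schemas/OME/2016-06"

# Hypothetical fragment: foreign XML inside an XMLAnnotation, with the
# default namespace explicitly reset (xmlns="") on the embedded element.
original = (
    f'<XMLAnnotation xmlns="{OME_NS}" ID="Annotation:0">'
    '<Value><ImageDocument xmlns=""><Metadata/></ImageDocument></Value>'
    "</XMLAnnotation>"
)

# Assumption: the same fragment after a round trip that drops the reset,
# so the embedded element now inherits the OME default namespace.
rewritten = original.replace(' xmlns=""', "")


def embedded_tag(xml_text: str) -> str:
    """Return the fully qualified tag of the first child of <Value>."""
    root = ET.fromstring(xml_text)
    value = root.find(f"{{{OME_NS}}}Value")
    return value[0].tag


print(embedded_tag(original))   # ImageDocument
print(embedded_tag(rewritten))  # {http://www.openmicroscopy.org/Schemas/OME/2016-06}ImageDocument
```

A validator that looks up the second name in the OME schema would find no such global element, which is consistent with the lookup failure in the traceback.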

I'm honestly not sure whether the initial XML should fail to validate or not, but I believe the intent of the schema is to allow arbitrary (non-OME-schema) XML inside structured XMLAnnotations.

tlambert03 commented 3 years ago

might need some input from @jmuhlich on this one. i'm not sure but the issue might be right here where we use the OME namespace when encoding StructuredAnnotations?

> I believe the intent of the schema is to allow arbitrary (non-OME-schema) XML inside structured XMLAnnotations.

totally agree. we probably need to provide a hook for you to provide your custom namespaces when encoding back to xml... @jmuhlich?

toloudis commented 3 years ago

I guess this is related to #66 , just adding a reference here.

tlambert03 commented 3 years ago

@toloudis, can you maybe extract some XML for me to use to reproduce this issue?

toloudis commented 3 years ago

Just pinging to say I haven't forgotten and hope to revisit this week. I will try to repro and if I can, I'll put some xml here.

toloudis commented 3 years ago

This repo https://github.com/AllenCell/quilt-data-access-tutorials/tree/feature/test_validation (specifically the branch feature/test_validation) has some code in the Tutorial 3 notebook that shows a bit of trouble with StructuredAnnotations. In the first test file downloaded, I get a validation error on a structured annotation that has an Experiment node inside it.

My goal is to confirm whether there are any true validation problems with the files being downloaded in that notebook. In this case it seems that from_tiff is successful, but then validate(to_xml(ome)) fails.
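
To narrow down which step of a round trip like this fails, a small helper along the following lines could be used. This is only a sketch: find_failing_stage and the demo_* stand-ins are hypothetical names, not part of ome-types, and the stand-ins merely imitate a lenient parser followed by a strict validator.

```python
from typing import Any, Callable, Optional, Tuple


def find_failing_stage(
    source: Any,
    parse: Callable[[Any], Any],
    serialize: Callable[[Any], str],
    check: Callable[[str], Any],
) -> Tuple[Optional[str], Optional[Exception]]:
    """Run parse -> serialize -> check and report the first stage that raises.

    Returns (None, None) when every stage succeeds.
    """
    try:
        obj = parse(source)
    except Exception as exc:
        return "parse", exc
    try:
        xml_text = serialize(obj)
    except Exception as exc:
        return "serialize", exc
    try:
        check(xml_text)
    except Exception as exc:
        return "check", exc
    return None, None


# Hypothetical stand-ins so the sketch runs without any OME files.
def demo_parse(text: str) -> str:
    return text.strip()  # lenient: accepts anything


def demo_serialize(obj: str) -> str:
    return f"<OME>{obj}</OME>"


def demo_check(xml_text: str) -> None:
    if "ImageDocument" in xml_text:  # strict: rejects the foreign element
        raise KeyError("ImageDocument")


stage, error = find_failing_stage(
    " <ImageDocument/> ", demo_parse, demo_serialize, demo_check
)
print(stage, type(error).__name__)
```

With ome-types, the functions named in this thread (from_tiff or from_xml, to_xml, and validate) would presumably take the place of the stand-ins. Their exact import paths and signatures are not shown here and should be checked against the installed version.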

tlambert03 commented 3 years ago

Great thanks! Will be helpful to have for testing

tlambert03 commented 1 year ago

I think this can be closed. I just ran from_tiff on every file in the dataset listed above, and they all generate OME objects without error (even though many of them are ill-formed)