andreasg123 closed this issue 3 years ago
html.unescape might work:
from html import unescape
from json import dumps
from tifffile import xml2dict
meta = """<METADATA><Tags><AcquisitionTime>2021-03-03T12:51:11.0123443Z</AcquisitionTime><ImageScaling><ImageScaling>
<ImagePixelSize>6.5,6.5</ImagePixelSize>
</ImageScaling></ImageScaling><DetectorState><CameraState>
<ApplyCameraProfile>false</ApplyCameraProfile>
<ApplyImageOrientation>true</ApplyImageOrientation>
<ExposureTime>80005705.882353</ExposureTime>
<Frame>128,128,2048,2048</Frame>
<ImageOrientation>3</ImageOrientation>
</CameraState></DetectorState><StageXPosition>+000000209505.8000</StageXPosition><StageYPosition>+000000045936.2000</StageYPosition><FocusPosition>+000000021667.7820</FocusPosition><RoiCenterOffsetX>+000000000000.0000</RoiCenterOffsetX><RoiCenterOffsetY>+000000000000.0000</RoiCenterOffsetY></Tags><DataSchema><ValidBitsPerPixel>16</ValidBitsPerPixel></DataSchema><AttachmentSchema /></METADATA>"""
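meta = unescape(meta)  # decode HTML entities such as &lt; and &gt; so escaped XML becomes markup
print(meta)
meta = xml2dict(meta)  # parse the XML string into nested dicts
print(meta)
meta = dumps(meta)  # serialize the dict as a JSON string
print(meta)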
Thanks. That looks good. I'll check if that would cause any issues with our files.
A CZI file has this metadata for a subblock:
Because part of the XML contents is escaped, it remains text when the metadata is converted to JSON, just as expected. Do you have suggestions for dealing with such files?
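One possible approach, sketched here with only the standard library, is to parse the outer XML first and then re-parse any leaf text that itself looks like XML. This avoids unescaping the whole document at once, which could produce malformed XML if some value legitimately contains an escaped `<` or `&`. The sample document below is a hypothetical reconstruction that reuses tag names from this thread, not the actual CZI subblock metadata, and `expand_escaped_xml` and `to_dict` are illustrative helpers rather than part of tifffile.

```python
import json
import xml.etree.ElementTree as ET


def expand_escaped_xml(element):
    """Recursively replace leaf text that is itself XML with parsed children."""
    for child in list(element):
        expand_escaped_xml(child)
    text = (element.text or "").strip()
    if len(element) == 0 and text.startswith("<"):
        try:
            # Wrap in a dummy root so fragments with several top-level
            # elements still parse.
            fragment = ET.fromstring(f"<wrapper>{text}</wrapper>")
        except ET.ParseError:
            return  # not well-formed XML; keep it as plain text
        element.text = None
        for node in list(fragment):
            element.append(node)
            expand_escaped_xml(node)  # handle doubly escaped content


def to_dict(element):
    """Simplified conversion: ignores attributes and repeated sibling tags."""
    if len(element) == 0:
        return element.text
    return {child.tag: to_dict(child) for child in element}


# Hypothetical sample (assumption): one element holds escaped XML,
# another holds an ordinary value with an escaped "<".
doc = (
    "<METADATA><Tags>"
    "<ImageScaling>&lt;ImageScaling&gt;"
    "&lt;ImagePixelSize&gt;6.5,6.5&lt;/ImagePixelSize&gt;"
    "&lt;/ImageScaling&gt;</ImageScaling>"
    "<Note>a &lt; b</Note>"
    "</Tags></METADATA>"
)

root = ET.fromstring(doc)
expand_escaped_xml(root)
result = {root.tag: to_dict(root)}
print(json.dumps(result))
```

With this sample, the escaped `ImageScaling` content ends up as a nested object in the JSON, while the `Note` value stays a plain string. Whether this heuristic is safe depends on the files: a text value that happens to be well-formed XML would also be expanded.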