ome / bioformats

Bio-Formats is a Java library for reading and writing data in life sciences image file formats. It is developed by the Open Microscopy Environment. Bio-Formats is released under the GNU General Public License (GPL); commercial licenses are available from Glencoe Software.
https://www.openmicroscopy.org/bio-formats
GNU General Public License v2.0

LOF file error; [Fatal Error] :1:1: Content is not allowed in prolog. #3818

Open jil24 opened 2 years ago

jil24 commented 2 years ago

I hope this is the correct place to report issues with the LOF reader; I know it's an external contribution.

I have an image that will not read and results in this error:

[Fatal Error] :1:1: Content is not allowed in prolog.
Content is not allowed in prolog.
org.xml.sax.SAXParseException; lineNumber: 1; columnNumber: 1; Content is not allowed in prolog.
        at org.apache.xerces.parsers.DOMParser.parse(Unknown Source)
        at org.apache.xerces.jaxp.DocumentBuilderImpl.parse(Unknown Source)
        at loci.formats.in.LeicaMicrosystemsMetadata.LMSXmlDocument.initFromXmlString(LMSXmlDocument.java:109)
        at loci.formats.in.LeicaMicrosystemsMetadata.LMSXmlDocument.<init>(LMSXmlDocument.java:87)
        at loci.formats.in.LeicaMicrosystemsMetadata.LMSImageXmlDocument.<init>(LMSImageXmlDocument.java:37)
        at loci.formats.in.LeicaMicrosystemsMetadata.LofXmlDocument.<init>(LofXmlDocument.java:39)
        at loci.formats.in.LOFReader.isThisType(LOFReader.java:154)
        at loci.formats.FormatReader.isThisType(FormatReader.java:653)
        at loci.formats.ImageReader.isThisType(ImageReader.java:864)
        at loci.formats.ImageReader.getReader(ImageReader.java:193)
        at loci.formats.ImageReader.setId(ImageReader.java:844)
        at loci.formats.tools.ImageConverter.testConvert(ImageConverter.java:523)
        at loci.formats.tools.ImageConverter.main(ImageConverter.java:1141)
 *** One or more readers is misbehaving. See the debug output for more information. e.g.:
     loci.formats.in.LOFReader@52e6fdee -> java.lang.NullPointerException('null') ***
Exception in thread "main" loci.formats.UnknownFormatException: Unknown file format: C:\Users\lakej\Desktop\BMSEQIF-003-tonsil-high-c1-cd25positives.lof
        at loci.formats.ImageReader.getReader(ImageReader.java:202)
        at loci.formats.ImageReader.setId(ImageReader.java:844)
        at loci.formats.tools.ImageConverter.testConvert(ImageConverter.java:523)
        at loci.formats.tools.ImageConverter.main(ImageConverter.java:1141)

It's a 4-channel, 20-slice single position image.

I can share the image but it is quite large; I've 7-zipped it: https://www.dropbox.com/s/s9w3kfpzrhpbz86/DAPI-CD4-CD45-CD25.7z?dl=0

dgault commented 2 years ago

Hi @jil24, thank you for raising the issue and providing a sample file. I can reproduce the same error with the latest Bio-Formats 6.9.1.

The issue seems to occur when attempting to read the XML embedded at the end of the LOF file. Instead of parsing an XML document, the reader gets only the short string below, which I would have expected at the start of the file: LMS_Object_File

Looking at the file itself, the XML is present at the end of the file; however, the above string appears both at the start of the file and at the beginning of the XML. The start of the XML is shown below, and its first part looks incorrect: LMS_Object_FileĪ⨀*(ion=""><ImageDescription><Channels><ChannelDescription DataType="0" ChannelTag="0" Resolution="16" NameOfMeasuredQuantity="" Min="0.000000e+000" Max="6.553500e+004" Unit="" LUTName="Blue" IsLUTInverted="0" BytesInc="0" BitInc="0"></ChannelDescription>
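For anyone wanting to see why stray text in front of the XML produces this particular message, here is a minimal sketch using only the JDK's built-in parser. It is not the LOFReader code: the class `PrologErrorDemo`, the helper `parseError`, and the sample `Element` markup are made up for illustration. The assumption is that any non-whitespace characters before the root element, such as the `LMS_Object_File` string, make the parser fail at line 1, column 1; on a stock JDK with an English locale the message should be the "Content is not allowed in prolog." text from the report.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

class PrologErrorDemo {

    // Hypothetical helper (not part of Bio-Formats): returns null when the
    // text parses as XML, otherwise "line:column: message" for the first
    // fatal parse error.
    static String parseError(String text) throws Exception {
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        try {
            builder.parse(new InputSource(new StringReader(text)));
            return null;
        } catch (SAXParseException e) {
            return e.getLineNumber() + ":" + e.getColumnNumber()
                + ": " + e.getMessage();
        }
    }

    public static void main(String[] args) throws Exception {
        // Illustrative markup only; the real LOF XML is much larger.
        String wellFormed = "<Element Name=\"img\"></Element>";
        // Same document with the stray marker string in front of it.
        String withPrefix = "LMS_Object_File" + wellFormed;

        System.out.println("well-formed: " + parseError(wellFormed));
        System.out.println("with prefix: " + parseError(withPrefix));
    }
}
```

Running `main` prints `null` for the well-formed input and an error description for the prefixed one. With the default error handler, the JDK parser typically also writes a `[Fatal Error]` line to stderr, which would match the first line of the log in the report.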

@XLEFReaderForBioformats, would you have any idea what has gone wrong with this file?

jil24 commented 2 years ago

I was actually able to get the file to open by replacing the malformed part in a hex editor... LAS X does not seem to care about the error, though.

XLEFReaderForBioformats commented 2 years ago

Hi @jil24, thank you for notifying us about this issue! Could you please specify how the LOF file was created? Did you create it in LASX and then open it in FIJI right away, or was the LOF file also manipulated by other software in between? Could you also please tell us which LASX version you used?

jil24 commented 2 years ago

Software version is LAS X 3.7.4.23463. The file was saved using the auto save functionality in LAS X, then renamed within LAS X. Otherwise it was not modified.

XLEFReaderForBioformats commented 1 year ago

Hi @jil24, the fix for the LASX-side behavior will be available in LASX 3.8.0 (as well as other future LASX releases such as 4.6.0 and 6.2.0). On the Bio-Formats side, I only added more verbose error messaging. Unfortunately, when reading the LOF XML fails, the metadata needed to read the image data correctly is missing, and the images cannot be opened.

However, there are some workarounds possible:

jil24 commented 1 year ago

Glad to hear it. I am no longer working on this particular project, but I was able to complete it by changing the autosave format to TIF for the duration. I then extracted relevant metadata (stage coordinates) from the XLEF manually.