marksgraham / OCT-Converter

Tools for extracting the raw optical coherence tomography (OCT) and fundus data from proprietary file formats.
https://pypi.org/project/oct-converter/
MIT License

decode error when opening a .oct image: UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe4 in position 66: invalid continuation byte #77

Closed: kli30 closed this issue 1 year ago

kli30 commented 1 year ago

Hi marksgraham, I encountered an error when trying to open a .oct image. Can you give me any clues? Thank you!

Below are the code and the error message:

    from oct_converter.readers import BOCT
    fn = r'../raw-test/oct-raw/{8968AB72-5A10-46CD-99D4-50CF5F5F8974}.oct'
    oct = BOCT(fn)
    oct.read_oct_volume()


    UnicodeDecodeError                        Traceback (most recent call last)
    Cell In [21], line 1
    ----> 1 oct.read_oct_volume()

    File ~/miniconda3/envs/unet/lib/python3.9/site-packages/oct_converter/readers/boct.py:170, in BOCT.read_oct_volume(self, diskbuffered)
        167 self.patient_id = self.filepath.stem
        169 # Lazily parse the file without loading frame pixels
    --> 170 oct = self.file_structure.parse_file(self.filepath)
        171 header = oct.header
        172 self.frames = FrameGenerator(oct.data)

    File ~/miniconda3/envs/unet/lib/python3.9/site-packages/construct/core.py:309, in Construct.parse_file(self, filename, **contextkw)
        305 r"""
        306 Parse a closed binary file. See parse().
        307 """
        308 with open(filename, 'rb') as f:
    --> 309     return self.parse_stream(f, **contextkw)

    File ~/miniconda3/envs/unet/lib/python3.9/site-packages/construct/core.py:300, in Construct.parse_stream(self, stream, **contextkw)
        298 context._params = context
        299 try:
    --> 300     return self._parsereport(stream, context, "(parsing)")
        301 except CancelParsing:
        302     pass

    File ~/miniconda3/envs/unet/lib/python3.9/site-packages/construct/core.py:312, in Construct._parsereport(self, stream, context, path)
        311 def _parsereport(self, stream, context, path):
    --> 312     obj = self._parse(stream, context, path)
        313     if self.parsed is not None:
        314         self.parsed(obj, context)

    File ~/miniconda3/envs/unet/lib/python3.9/site-packages/construct/core.py:2120, in Struct._parse(self, stream, context, path)
       2118 for sc in self.subcons:
       2119     try:
    -> 2120         subobj = sc._parsereport(stream, context, path)
       2121         if sc.name:
       2122             obj[sc.name] = subobj

    File ~/miniconda3/envs/unet/lib/python3.9/site-packages/construct/core.py:312, in Construct._parsereport(self, stream, context, path)
        311 def _parsereport(self, stream, context, path):
    --> 312     obj = self._parse(stream, context, path)
        313     if self.parsed is not None:
        314         self.parsed(obj, context)

    File ~/miniconda3/envs/unet/lib/python3.9/site-packages/construct/core.py:2653, in Renamed._parse(self, stream, context, path)
       2651 def _parse(self, stream, context, path):
       2652     path += " -> %s" % (self.name,)
    -> 2653     return self.subcon._parsereport(stream, context, path)

    File ~/miniconda3/envs/unet/lib/python3.9/site-packages/construct/core.py:312, in Construct._parsereport(self, stream, context, path)
        311 def _parsereport(self, stream, context, path):
    --> 312     obj = self._parse(stream, context, path)
        313     if self.parsed is not None:
        314         self.parsed(obj, context)

    File ~/miniconda3/envs/unet/lib/python3.9/site-packages/construct/core.py:2120, in Struct._parse(self, stream, context, path)
       2118 for sc in self.subcons:
       2119     try:
    -> 2120         subobj = sc._parsereport(stream, context, path)
       2121         if sc.name:
       2122             obj[sc.name] = subobj

    File ~/miniconda3/envs/unet/lib/python3.9/site-packages/construct/core.py:312, in Construct._parsereport(self, stream, context, path)
        311 def _parsereport(self, stream, context, path):
    --> 312     obj = self._parse(stream, context, path)
        313     if self.parsed is not None:
        314         self.parsed(obj, context)

    File ~/miniconda3/envs/unet/lib/python3.9/site-packages/construct/core.py:2653, in Renamed._parse(self, stream, context, path)
       2651 def _parse(self, stream, context, path):
       2652     path += " -> %s" % (self.name,)
    -> 2653     return self.subcon._parsereport(stream, context, path)

    File ~/miniconda3/envs/unet/lib/python3.9/site-packages/construct/core.py:312, in Construct._parsereport(self, stream, context, path)
        311 def _parsereport(self, stream, context, path):
    --> 312     obj = self._parse(stream, context, path)
        313     if self.parsed is not None:
        314         self.parsed(obj, context)

    File ~/miniconda3/envs/unet/lib/python3.9/site-packages/construct/core.py:2120, in Struct._parse(self, stream, context, path)
       2118 for sc in self.subcons:
       2119     try:
    -> 2120         subobj = sc._parsereport(stream, context, path)
       2121         if sc.name:
       2122             obj[sc.name] = subobj

    File ~/miniconda3/envs/unet/lib/python3.9/site-packages/construct/core.py:312, in Construct._parsereport(self, stream, context, path)
        311 def _parsereport(self, stream, context, path):
    --> 312     obj = self._parse(stream, context, path)
        313     if self.parsed is not None:
        314         self.parsed(obj, context)

    File ~/miniconda3/envs/unet/lib/python3.9/site-packages/construct/core.py:2653, in Renamed._parse(self, stream, context, path)
       2651 def _parse(self, stream, context, path):
       2652     path += " -> %s" % (self.name,)
    -> 2653     return self.subcon._parsereport(stream, context, path)

    File ~/miniconda3/envs/unet/lib/python3.9/site-packages/construct/core.py:312, in Construct._parsereport(self, stream, context, path)
        311 def _parsereport(self, stream, context, path):
    --> 312     obj = self._parse(stream, context, path)
        313     if self.parsed is not None:
        314         self.parsed(obj, context)

    File ~/miniconda3/envs/unet/lib/python3.9/site-packages/construct/core.py:704, in Adapter._parse(self, stream, context, path)
        702 def _parse(self, stream, context, path):
        703     obj = self.subcon._parsereport(stream, context, path)
    --> 704     return self._decode(obj, context, path)

    File ~/miniconda3/envs/unet/lib/python3.9/site-packages/construct/core.py:1610, in StringEncoded._decode(self, obj, context, path)
       1609 def _decode(self, obj, context, path):
    -> 1610     return obj.decode(self.encoding)

    UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe4 in position 66: invalid continuation byte
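
For readers unfamiliar with this error: the last frame shows a string field being decoded as UTF-8, and the decoder hit a byte that cannot be part of valid UTF-8 at that point. The sketch below reproduces the same class of error on a made-up byte string (the bytes are illustrative, not taken from the file in this issue) and shows two lenient ways to inspect such bytes.

```python
# 0xE4 announces a three-byte UTF-8 sequence, so the next byte must be a
# continuation byte in the range 0x80-0xBF. Here it is followed by ASCII "A".
raw = b"ab\xe4A"  # illustrative bytes, not from the file in this issue

try:
    raw.decode("utf-8")
    error = None
except UnicodeDecodeError as exc:
    error = exc

# Lenient alternatives for looking at such bytes without raising.
as_latin1 = raw.decode("latin-1")                    # one code point per byte
as_replaced = raw.decode("utf-8", errors="replace")  # bad bytes become U+FFFD
```

When a binary parser raises this on a header field, it often means the bytes at that offset are not text at all, which is consistent with the file not matching the expected layout.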

marksgraham commented 1 year ago

Hi,

Firstly, are you sure it is a file from a Bioptigen scanner? I ask because Optovue scanners also use the .oct extension, and their files are read with the POCT reader instead.

If you are sure, are you able to share the file to help with debugging?

kli30 commented 1 year ago

Thanks for your feedback. I also tried POCT; this time it complained that it cannot find a matching txt file. So this is probably a BOCT file. I uploaded the image here: https://www.dropbox.com/s/jgd3pi4g64n63wr/%7B28C9AAC8-15D3-4343-91C9-7F450EB9519D%7D.oct?dl=0. Please let me know once you have downloaded it. Thank you!

marksgraham commented 1 year ago

I'll try to take a look soon. Also tagging @Dbrown411, who wrote the Bioptigen reader, in case he has any ideas.

kli30 commented 1 year ago

Thank you both, @Dbrown411 @marksgraham.

marksgraham commented 1 year ago

Hi @kli30, I haven't been able to make much progress on reading this as a BOCT file. It crashes early on, after reading the first two fields with values magicNumber=2771273, version=52964. The only other Bioptigen OCT file I have access to has values magicNumber=209561509, version=266, which doesn't really help me clarify whether this is actually a Bioptigen file.

Have you got any other info about the scan? Do you have the images exported by another means / access to the scanner?
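
The two header fields mentioned above can be read without the full parser. This is a minimal sketch that assumes the file starts with a little-endian unsigned 32-bit magic number followed by an unsigned 16-bit version; that layout is an assumption for illustration and is not confirmed in this thread. The helper names are hypothetical, and the reference value is the one quoted above from a single known Bioptigen file.

```python
import struct

# Magic number quoted in this thread for one known Bioptigen file.
KNOWN_BIOPTIGEN_MAGIC = 209561509


def read_magic_and_version(path):
    """Return (magic, version), ASSUMING uint32 + uint16, little-endian."""
    with open(path, "rb") as f:
        head = f.read(6)
    if len(head) < 6:
        raise ValueError("file is too short to contain the two fields")
    return struct.unpack("<IH", head)


def matches_known_bioptigen_magic(path):
    """True if the first field equals the single reference value above."""
    magic, _version = read_magic_and_version(path)
    return magic == KNOWN_BIOPTIGEN_MAGIC
```

A mismatch against one reference file is only a hint, since other scanner software versions might write different values.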

Dbrown411 commented 1 year ago

I should be able to check whether this is a Bioptigen file or not. Give me a bit.

Dillon

Dbrown411 commented 1 year ago

This does not appear to be a Bioptigen .OCT file. Two hints:

  1. At least in my experience, Bioptigen files have a capitalized ".OCT" extension.
  2. Opening the file in a hex editor, the headers are quite different:

Bioptigen .OCT: [image: hex editor view of a Bioptigen .OCT file header]

File in question: [image: hex editor view of the header of the file in question]
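
The same header comparison can be done without a hex editor. The following is a minimal sketch; `hexdump` and `dump_header` are hypothetical helper names, and the default of 64 bytes is an arbitrary choice for illustration.

```python
def hexdump(data, width=16):
    """Format bytes as offset, hex column, and printable-ASCII column."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        # Show printable ASCII as-is and everything else as a dot.
        text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:08x}  {hex_part:<{width * 3 - 1}}  {text_part}")
    return "\n".join(lines)


def dump_header(path, n_bytes=64):
    """Return a hex dump of the first n_bytes of the file at path."""
    with open(path, "rb") as f:
        return hexdump(f.read(n_bytes))
```

Printing `dump_header(...)` for a known-good file and for the unknown file side by side makes differing magic bytes or field layouts easy to spot.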

marksgraham commented 1 year ago

Thanks Dillon!

@kli30 I ran this through every reader we have with no luck, so I don't think we're going to be able to make progress here without more information about the scan type.
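
Trying a file against several readers can be scripted. This is a rough sketch rather than anything from the project: `try_readers` is a hypothetical helper, and it takes the readers as plain callables so that it makes no assumptions about any library's API.

```python
def try_readers(path, readers):
    """Call each reader on path and record the outcome per reader name.

    `readers` maps a label to a callable that takes the file path. Both this
    helper and the mapping are hypothetical, not part of oct_converter.
    """
    results = {}
    for name, read in readers.items():
        try:
            results[name] = ("ok", read(path))
        except Exception as exc:  # broad on purpose: parsers fail in many ways
            results[name] = ("failed", f"{type(exc).__name__}: {exc}")
    return results
```

With oct_converter installed, one entry could wrap the call shown at the top of this thread, for example `{"BOCT": lambda p: BOCT(p).read_oct_volume()}`. If every entry comes back as "failed", the file format is probably not one the supplied readers understand.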

kli30 commented 1 year ago

Thank you both for the feedback. Unfortunately, I do not have any more information available besides the file. Thanks again.