Closed tlambert03 closed 6 years ago
Could you please share a file that fails, or a full traceback?
sorry, should have done that the first time!
here's a file that fails: https://www.dropbox.com/s/c8g3hmaamlg4ego/tifffile_013_tagfail.tif?dl=0
and here's the full traceback:
In [2]: import tifffile as tf
In [3]: tf.__version__
Out[3]: '0.13.5'
In [4]: T = tf.TiffFile('tifffile_013_tagfail.tif')
---------------------------------------------------------------------------
UnicodeDecodeError Traceback (most recent call last)
~/anaconda/envs/tif/lib/python3.6/site-packages/tifffile/tifffile.py in bytes2str(b, encoding, errors)
9044 try:
-> 9045 return b.decode('utf-8', errors)
9046 except UnicodeDecodeError:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x90 in position 167: invalid start byte
During handling of the above exception, another exception occurred:
UnicodeDecodeError Traceback (most recent call last)
<ipython-input-4-b0a86dc325e1> in <module>()
----> 1 T = tf.TiffFile('tifffile_013_tagfail.tif')
~/anaconda/envs/tif/lib/python3.6/site-packages/tifffile/tifffile.py in __init__(self, arg, name, offset, size, multifile, movie, **kwargs)
1634
1635 # file handle is at offset to offset to first page
-> 1636 self.pages = TiffPages(self)
1637
1638 if self.is_lsm and (self.filehandle.size >= 2**32 or
~/anaconda/envs/tif/lib/python3.6/site-packages/tifffile/tifffile.py in __init__(self, parent)
2648 # always read and cache first page
2649 fh.seek(offset)
-> 2650 page = TiffPage(parent, index=0)
2651 self.pages.append(page)
2652 self._keyframe = page
~/anaconda/envs/tif/lib/python3.6/site-packages/tifffile/tifffile.py in __init__(self, parent, index, keyframe)
2941 index += tagsize
2942 try:
-> 2943 tag = TiffTag(self.parent, data[index:index+tagsize])
2944 except TiffTag.Error as e:
2945 warnings.warn(str(e))
~/anaconda/envs/tif/lib/python3.6/site-packages/tifffile/tifffile.py in __init__(self, parent, tagheader, **kwargs)
3909 # TIFF ASCII fields can contain multiple strings,
3910 # each terminated with a NUL
-> 3911 value = bytes2str(stripascii(value[0]).strip())
3912 else:
3913 if code in TIFF.TAG_ENUM:
~/anaconda/envs/tif/lib/python3.6/site-packages/tifffile/tifffile.py in bytes2str(b, encoding, errors)
9045 return b.decode('utf-8', errors)
9046 except UnicodeDecodeError:
-> 9047 return b.decode('cp1252', errors)
9048
9049
~/anaconda/envs/tif/lib/python3.6/encodings/cp1252.py in decode(self, input, errors)
13
14 def decode(self,input,errors='strict'):
---> 15 return codecs.charmap_decode(input,errors,decoding_table)
16
17 class IncrementalEncoder(codecs.IncrementalEncoder):
UnicodeDecodeError: 'charmap' codec can't decode byte 0x90 in position 167: character maps to <undefined>
thanks!
Thank you. The dtype of the ImageID tag does not match the value. It is probably better to coerce the tag dtype to bytes and issue a warning.
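As a rough illustration of the "coerce to bytes and warn" idea, here is a minimal sketch. It is not tifffile's actual implementation: the helper name decode_ascii_tag_value and its tag_name parameter are hypothetical, and it assumes the raw tag value is already available as a bytes object.

```python
import warnings


def decode_ascii_tag_value(raw, tag_name="ImageID"):
    """Decode a tag value declared as ASCII; keep it as bytes if it is not text.

    Hypothetical helper for illustration only, not part of tifffile.
    """
    try:
        # Value really is text: return it as str.
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Declared dtype does not match the value: warn and keep the raw bytes.
        warnings.warn(
            f"{tag_name}: value is not valid text; keeping it as bytes"
        )
        return raw
```

With this approach a well-formed tag still comes back as str, while a binary payload is returned unchanged, so opening the file no longer fails on that one tag.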
ah! thank you, makes sense.
Are you suggesting I submit a pull request that accomplishes that, or handle it outside of tifffile? I spent a little time trying to come up with a modification to TiffTag.__init__() to check that the declared dtype of a tag matches the value type, but I'm not sure how to verify the dtype of a tag value without a try/except, which seems a bit ugly and not worthy of a pull request.
Should be fixed at https://www.lfd.uci.edu/~gohlke/code/tifffile.py.html
I released 0.14 which has the updated source.
thank you both!
After updating to 0.13.5 from 0.12.1, I found that a number of my TIFF files raise an exception during instantiation of a TiffFile object, due to non-standard, non-Unicode TIFF tags in the headers. In 0.12.1 this was not an issue (the tags were mostly just ignored), but the expanded bytes2str function in tifffile 0.13+ cannot decode these tags and raises an exception. (These TIFF files were written with custom LabVIEW microscope acquisition software and contain a LabVIEW binary-format tag in the header.) For me, simply falling back to str(b) after trying the 'utf-8' and 'cp1252' encodings fixes everything and lets me use 0.13.5 without any additional modifications, but I'm not sure whether there is a better way to handle this. Consider this pull request more of an "issue with a possible fix", and feel free to reject it and suggest a better solution.
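The fallback described above could look roughly like the sketch below. It is reconstructed from the traceback, not taken from the tifffile source: the function name bytes2str_with_fallback is hypothetical, and the 'strict' default for errors is an assumption.

```python
def bytes2str_with_fallback(b, errors="strict"):
    """Decode bytes as UTF-8, then cp1252, then fall back to str(b).

    Illustrative sketch only; the name and the default for `errors`
    are assumptions, not tifffile's API.
    """
    for encoding in ("utf-8", "cp1252"):
        try:
            return b.decode(encoding, errors)
        except UnicodeDecodeError:
            continue
    # Neither codec can decode the value (e.g. byte 0x90 is undefined in
    # Python's cp1252 table), so return the printable representation.
    return str(b)
```

Note that str(b) on a bytes object returns its printable representation (for example "b'\\x90'"), not the decoded content, so the original bytes are not directly recoverable from the result. That trade-off is part of why this is offered as a possible fix, not a definitive one.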