Closed Tomkaehst closed 5 years ago
@Tomkaehst, thanks for the report. Please provide the example data file so I can look into it.
Hi @tritemio ,
you can find an example file here: https://upload.uni-jena.de/data/5caa54291eaf20.02785906/Coumarin6_in_EtOH_2_1.ptu
In the meantime, I commented out the ANSIString assignment to tag['data'], and everything now works as expected.
The decoding of the "File_Comment" tag seems to be the problem.
@Tomkaehst, right, the File_Comment tag contains this byte string:
b'LAS X 2.0.1.14392\r\n\r\nPinhole: 58.69 \xb5m\r\nObjective: HC FLUOTAR L 25.0 WATER\r\nImage Format: 512 x 512\r\nScan Speed: 100 Hz\r\nZoom: 1.4\r\nFrame Average: 100\r\nDirection: Unidirectional\r\n\r\nWLL\r\n LaserLine 488: 75.0\r\n Laser Shutter: Open\r\n\r\nLaser (WLL, WLL) On 70.0\r\nLaser (Argon, visible) Off 0.0\r\nLaser (IR, MP) On\r\nLaser (IR2, FSOPO) On\r\nMFP Filter: Substrate \r\nPolarization Filter: NF 488\r\nNotch Filter: Empty\r\nX1-Port: Mirror \r\nScan Mode: xyt\r\nZPosition: -1.60 \xb5m\r\nTime Cycle Count: 25 ; Cycle Time: 600.0 s ; Complete Time: 14916.0 s\r\nSpectral detection range\r\nSP PMT 1: 500...550nm \r\n\r\nFLIM Detector: Intern\r\nAcquisition Mode: Frame Repetition 100\r\n\x00'
This is not valid UTF-8. In fact, if you try to decode it as UTF-8 you get the error you reported for byte 0xb5 in position 32. The byte is printed as \xb5 in the string above, and it clearly should be a μ.
We can ask Python what the correct byte encoding for μ is in UTF-8:
>>> 'μ'.encode()
b'\xce\xbc'
Asking google I found this:
Unicode string:
'\xb5'
UTF8 bytestring:
b'\xc2\xb5'
And if I try to decode this in python:
>>> b'\xc2\xb5'.decode()
'µ'
This is a kind of slanted µ (it looks slanted in the notebook but not here on GitHub, so it is font-dependent).
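The two results above differ because they are two distinct Unicode characters that happen to look alike. A small sketch using only the standard library shows this; the variable names are just for illustration:

```python
import unicodedata

# The character typed in the first example and the one found via the search.
greek_mu = b"\xce\xbc".decode("utf-8")
micro_sign = b"\xc2\xb5".decode("utf-8")

for ch in (greek_mu, micro_sign):
    # Print the code point, the official Unicode name and the UTF-8 bytes.
    print(f"U+{ord(ch):04X}", unicodedata.name(ch), ch.encode("utf-8"))

# Only the micro sign exists in latin1, where it is the single byte 0xb5,
# i.e. the byte found in the File_Comment tag.
print(micro_sign.encode("latin1"))
```

The Greek letter cannot be encoded in latin1 at all, so a lone 0xb5 byte is a hint that the file was written with a single-byte encoding rather than UTF-8.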
Bottom line: I think PicoQuant saved a broken string here... or maybe they are not using UTF-8 but some older encoding. They are from Germany, so let's try latin1:
>>> print(s.rstrip(b'\0').decode('latin1'))
LAS X 2.0.1.14392
Pinhole: 58.69 µm
Objective: HC FLUOTAR L 25.0 WATER
Image Format: 512 x 512
Scan Speed: 100 Hz
Zoom: 1.4
Frame Average: 100
Direction: Unidirectional
WLL
LaserLine 488: 75.0
Laser Shutter: Open
Laser (WLL, WLL) On 70.0
Laser (Argon, visible) Off 0.0
Laser (IR, MP) On
Laser (IR2, FSOPO) On
MFP Filter: Substrate
Polarization Filter: NF 488
Notch Filter: Empty
X1-Port: Mirror
Scan Mode: xyt
ZPosition: -1.60 µm
Time Cycle Count: 25 ; Cycle Time: 600.0 s ; Complete Time: 14916.0 s
Spectral detection range
SP PMT 1: 500...550nm
FLIM Detector: Intern
Acquisition Mode: Frame Repetition 100
Bingo, string decoded.
Bottom line: PQ uses latin1 string encoding here. I don't know if they use latin1 everywhere. Unless PQ confirms that they always use, and will continue to use, latin1, I would put in a try..except that first tries UTF-8 and falls back to latin1 on error.
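A minimal sketch of that fallback, assuming the tag payload is available as a bytes object. The helper name decode_tag_string is hypothetical and is not necessarily how phconvert implements the fix:

```python
def decode_tag_string(raw: bytes) -> str:
    """Decode a string tag, trying UTF-8 first and latin1 on failure.

    Hypothetical helper for illustration; not part of the phconvert API.
    """
    # Strip the trailing NUL padding seen at the end of the File_Comment tag.
    raw = raw.rstrip(b"\0")
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # latin1 maps every byte value to a character, so this cannot fail.
        return raw.decode("latin1")


print(decode_tag_string(b"Pinhole: 58.69 \xb5m\r\n\x00"))
```

Note that the order is a heuristic: a latin1 string whose bytes happen to form valid UTF-8 would be decoded as UTF-8 without raising an error.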
Thank you very much for the quick response @tritemio !
Closed by #36
Hello phconvert developers,
I'm trying to read a .ptu file from a PicoQuant HydraHarp2 (record type: 16843524) using load_ptu() and get this error in _ptu_read_tag()
Using the readPTU script from PicoQuant's GitHub page, I had a similar error and resolved it by changing the encoding from utf-8 to utf-16. This did not help this time.
Does anyone know what might cause the issue?
Thanks in advance, Tom