DiamondLightSource / cothread

Cooperative Python Threads and EPICS Channel Access bindings
Apache License 2.0

Malformed UTF-8 triggers unexpected exception #18

Closed: Araneidae closed 4 years ago

Araneidae commented 4 years ago

Calling caget on Python 3 for a PV that returns a string which is not valid UTF-8 triggers an exception that is not reported to the caller. Instead, the caller only sees a timeout. For example:

>>> from cothread.catools import *
>>> caget ('LI-RF-AMPL-01:KLY:T1')
19.278066635131836
>>> caget ('LI-RF-AMPL-01:KLY:T1', format=FORMAT_CTRL)
Traceback (most recent call last):
  File "_ctypes/callbacks.c", line 232, in 'calling callback function'
  File "/scratch/hgs15624/local/venvs/burtinter-dhuzAW7m/lib/python3.7/site-packages/cothread/catools.py", line 587, in _caget_event_handler
    args.raw_dbr, args.type, args.count))
  File "/scratch/hgs15624/local/venvs/burtinter-dhuzAW7m/lib/python3.7/site-packages/cothread/dbr.py", line 829, in dbr_to_value
    raw_dbr.copy_attributes(result)
  File "/scratch/hgs15624/local/venvs/burtinter-dhuzAW7m/lib/python3.7/site-packages/cothread/dbr.py", line 244, in copy_attributes_ctrl
    other.units = py23.decode(ctypes.string_at(self.units))
  File "/scratch/hgs15624/local/venvs/burtinter-dhuzAW7m/lib/python3.7/site-packages/cothread/py23.py", line 56, in decode
    return s.decode('UTF-8')
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb0 in position 0: invalid start byte
Traceback (most recent call last):
  File "/scratch/hgs15624/local/venvs/burtinter-dhuzAW7m/lib/python3.7/site-packages/cothread/catools.py", line 155, in ca_timeout
    return event.Wait(timeout)
  File "/scratch/hgs15624/local/venvs/burtinter-dhuzAW7m/lib/python3.7/site-packages/cothread/cothread.py", line 768, in Wait
    self._WaitUntil(deadline)
  File "/scratch/hgs15624/local/venvs/burtinter-dhuzAW7m/lib/python3.7/site-packages/cothread/cothread.py", line 611, in _WaitUntil
    raise Timedout('Timed out waiting for event')
cothread.cothread.Timedout: Timed out waiting for event

Araneidae commented 4 years ago

There are two issues here:

  1. The UnicodeDecodeError is raised inside a ctypes-wrapped callback, so it is printed but never propagated to the caller. The caller keeps waiting for a result that never arrives, hence the second (Timedout) exception. This is straightforward to fix.
  2. cothread's current policy of treating all strings as UTF-8 on Python 3 makes it brittle whenever a string uses another encoding (Latin-1 in this case).
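
To illustrate the first point, here is a minimal sketch of one way to get an exception out of a ctypes callback: catch it inside the callback, stash it, and re-raise it on the caller's side. This is not the cothread code or the eventual fix; `Result`, `make_handler` and `fetch` are hypothetical names, and the direct call to the callback stands in for the C library invoking it.

```python
import ctypes

# C callback signature: void callback(const char *raw)
CALLBACK = ctypes.CFUNCTYPE(None, ctypes.c_char_p)


class Result:
    """Hypothetical holder for whatever the callback produces."""

    def __init__(self):
        self.value = None
        self.error = None


def make_handler(result):
    def handler(raw):
        # An exception escaping from here would only be printed by ctypes
        # and then dropped, so catch it and store it for the caller.
        try:
            result.value = raw.decode('UTF-8')
        except Exception as error:
            result.error = error

    return CALLBACK(handler)


def fetch(raw):
    result = Result()
    callback = make_handler(result)  # keep a reference while it is in use
    callback(raw)  # stands in for the C library calling back into Python
    if result.error is not None:
        raise result.error  # now the caller actually sees the failure
    return result.value
```

With this pattern `fetch(b'degC')` returns the decoded string, while `fetch(b'\xb0C')` raises UnicodeDecodeError in the caller rather than leaving it to time out.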

One option is to use decode('UTF-8', 'ignore') or decode('UTF-8', 'replace'), but this would disguise errors that perhaps should be treated as exceptions.
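
For reference, a small sketch of how the error handlers differ on a string starting with the byte from the traceback (0xb0). The sample value `b'\xb0C'` is an assumption chosen for illustration (0xb0 is the degree sign in Latin-1); the actual units string of the PV is not shown above, and `decode_units` is a hypothetical helper.

```python
# Assumed sample: Latin-1 bytes for a degree sign followed by "C".
RAW_UNITS = b'\xb0C'


def decode_units(data, errors='strict'):
    """Decode as UTF-8 with the given error handler."""
    return data.decode('UTF-8', errors)


# 'strict' is the default and raises UnicodeDecodeError on 0xb0.
try:
    decode_units(RAW_UNITS)
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# 'ignore' silently drops the undecodable byte.
ignored = decode_units(RAW_UNITS, 'ignore')    # 'C'

# 'replace' substitutes U+FFFD REPLACEMENT CHARACTER for it.
replaced = decode_units(RAW_UNITS, 'replace')  # '\ufffdC'

# Decoding with the encoding the data is presumed to use
# recovers the intended text.
latin1 = RAW_UNITS.decode('latin-1')           # '\xb0C', i.e. a degree sign then 'C'
```

Both lenient handlers avoid the exception, but 'ignore' leaves no trace of the bad byte, whereas 'replace' at least marks where something was lost.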

Araneidae commented 4 years ago

@willrogers, @thomascobb, I'm interested in your thoughts on this.

Araneidae commented 4 years ago

I've implemented decode('UTF-8', 'replace') in commit 17920a457c83f8e11fa0ded605e0712fcbbccd54, which fixes this.

willrogers commented 4 years ago

This seems like a reasonable solution to me.