bluesky / ophyd

hardware abstraction in Python with an emphasis on EPICS
https://blueskyproject.io/ophyd
BSD 3-Clause "New" or "Revised" License
51 stars 79 forks source link

v2: Should we handle errors from signals that don't handle `CancelledError` #1152

Open callumforrester opened 1 year ago

callumforrester commented 1 year ago

Currently in v2 signals are expected to handle CancelledError on timeout and raise NotConnected

async def connect(self, ...):
    try:
        _connect(...)
    except CancelledError:
        raise NotConnected("reasons")

If we do not do this, then a timeout results in the whole program crashing with CancelledError. It is an easy mistake to make and a hard one to solve if the user has not read all of the documentation. It might be worth trying to come up with a more robust design.

In GDA, authors of new Scannables are expected to handle InterruptedException themselves. If they do not, scans do not behave as expected and can become deadlocked and unkillable. Many Scannables were written without handling the exception leading to strange and hard-to-diagnose behaviour. We should nip the same problem in the bud here.

Paging @coretl @evalott100 @RAYemelyanova and @danielballan

rosesyrett commented 1 year ago

Could I ask where this code snippet is from? I thought CancelledError is actually an asyncio.CancelledError, i.e. the _connect there should be able to raise it's own custom errors.

callumforrester commented 1 year ago

It's not directly from the code, it is a highly simplified version of Signal.connect() where the signal is using an epics backend, with all the indirection etc. removed. It is more generally how we are expected to write .connect()

coretl commented 1 year ago

The reason behind NotConnected is purely for squashability. Any NotConnected that bubble up to DeviceCollector are squashed into a summary, while others are reported with their tracebacks. If anything fails to connect then with DeviceCollector() raises