bluesky / ophyd-async

Hardware abstraction for bluesky written using asyncio
https://blueskyproject.io/ophyd-async
BSD 3-Clause "New" or "Revised" License
10 stars 23 forks source link

Device names should be reported within all errors #378

Open callumforrester opened 3 months ago

callumforrester commented 3 months ago

As a user I am likely to be working with multiple devices as once, sometimes multiple instances of the same device class, and I would like to know which device has raised an error at any given time. This can sometimes be hard to divine from stack traces, especially if device errors are wrapped up in RunEngine errors, for example:

  File "/venv/lib/python3.11/site-packages/bluesky/plan_stubs.py", line 1360, in repeated_plan
    yield from ensure_generator(plan())
  File "/venv/lib/python3.11/site-packages/bluesky/plan_stubs.py", line 1217, in one_shot
    yield from take_reading(list(detectors))
  File "/venv/lib/python3.11/site-packages/bluesky/plan_stubs.py", line 1101, in trigger_and_read
    return (yield from rewindable_wrapper(inner_trigger_and_read(),
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/venv/lib/python3.11/site-packages/bluesky/preprocessors.py", line 729, in rewindable_wrapper
    return (yield from plan)
            ^^^^^^^^^^^^^^^
  File "/venv/lib/python3.11/site-packages/bluesky/plan_stubs.py", line 1075, in inner_trigger_and_read
    yield from wait(group=grp)
  File "/venv/lib/python3.11/site-packages/bluesky/plan_stubs.py", line 504, in wait
    return (yield Msg('wait', None, group=group, timeout=timeout))
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/venv/lib/python3.11/site-packages/bluesky/preprocessors.py", line 251, in msg_mutator
    _s = yield msg
         ^^^^^^^^^
  File "/venv/lib/python3.11/site-packages/bluesky/run_engine.py", line 2284, in _status_object_completed
    raise FailedStatus(ret) from exc
bluesky.utils.FailedStatus: <AsyncStatus, task: <coroutine object StandardDetector.trigger at 0x7fc02987f040>, errored: TypeError("TetrammController.arm() missing 1 required positional argument: 'exposure'")>

In this case I can at least work out that error is coming from a Tetramm thanks to the TetrammController, but I still don't know which Tetramm. There are other cases (where more generic classes are used) that are even less obvious.

Reference: https://github.com/DiamondLightSource/i22-bluesky/issues/35#issuecomment-2160417939

Acceptance Criteria

callumforrester commented 3 months ago

@stan-dot suggests there should be some utility function to generate error messsages

callumforrester commented 3 months ago

Thinking about it, the example I gave is quite hard, because the error is actually raised from the standard library, and the device is written outside of ophyd-async. Maybe we need to change something about RunEngine or Status error reporting?

coretl commented 3 months ago

FailedStatus should still print the traceback somewhere further up, did this not make it out?

callumforrester commented 3 months ago

FailedStatus does print the traceback, that is not always useful in determining the culprit device. For example, the code that raised the error could be in a base class like StandardDetector, or in a device class that has multiple instances on the beamline.

coretl commented 3 months ago

Suggestion: We make AsyncStatus.wrap know about the Device/Signal it is wrapping, then include that in its repr. This would then make the error message something like:

bluesky.utils.FailedStatus: <AsyncStatus, device: tetramm1, task: <coroutine object StandardDetector.trigger at 0x7fc02987f040>, errored: TypeError("TetrammController.arm() missing 1 required positional argument: 'exposure'")>

Any better?

callumforrester commented 3 months ago

Does that mean AsyncStatus.wrap would no longer be usable on free functions?

stan-dot commented 3 months ago

the device could be optional, right?

coretl commented 3 months ago

Does that mean AsyncStatus.wrap would no longer be usable on free functions?

The simplest implementation would make this only usable on bound methods:

Device = TypeVar("Device", bound=Device)

class AsyncStatus(AsyncStatusBase):
    @classmethod
    def wrap(cls: Type[AS], f: Callable[Concatenate[Device, P], Awaitable]) -> Callable[Concatenate[Device, P], AS]:
        """Wrap an async function in an AsyncStatus."""

        @functools.wraps(f)
        def wrap_f(self: Device, *args: P.args, **kwargs: P.kwargs) -> AS:
            return cls(f(self, *args, **kwargs), device=self)

        # type is actually functools._Wrapped[P, Awaitable, P, AS]
        # but functools._Wrapped is not necessarily available
        return cast(Callable[Concatenate[Device, P], AS], wrap_f)

Do we have any use case to support free functions?

coretl commented 3 months ago

The simplest implementation would make this only usable on bound methods

Actually, scratch that, we could say that if the first argument is a Device then use its name, that's close enough:

class AsyncStatus(AsyncStatusBase):
    @classmethod
    def wrap(cls: Type[AS], f: Callable[P, Awaitable]) -> Callable[P, AS]:
        """Wrap an async function in an AsyncStatus."""

        @functools.wraps(f)
        def wrap_f(*args: P.args, **kwargs: P.kwargs) -> AS:
            if args and isinstance(args[0], Device):
                name = args[0].name
            else:
                name = None
            return cls(f(*args, **kwargs), name=name)

        # type is actually functools._Wrapped[P, Awaitable, P, AS]
        # but functools._Wrapped is not necessarily available
        return cast(Callable[P, AS], wrap_f)
callumforrester commented 3 months ago

I don't think this will catch every error though, AsyncStatus.wrap is a convenience function that is used in some places but not all.

stan-dot commented 3 months ago

this could be switched from convenience to load-bearing. 'better use it or write your own error catching'

coretl commented 3 months ago

I don't think this will catch every error though, AsyncStatus.wrap is a convenience function that is used in some places but not all.

This is correct, but it will those that come from a Status returning verb, i.e. set, trigger, kickoff, complete, stage, unstage. That's a reasonable set for starters, then we can tackle the others in a later ticket.