Closed diffubik closed 1 year ago
I spent a little time investigating what's supposed to happen here - the spec isn't super clear what should happen if an exception is raised by a GRPC servicer method (or I can't see one).
StopIteration
in particular isn't really an error. It's just the way all python iterators say "there's no more items in this iterator".
Do you have a simple(ish) way to trigger this issue?
For instance when passing an invalid engineId, the gRPC caller has little info about what went wrong
I actually realize now that the error gets logged on the server side, but I only read the part of the stack trace from WSGI application error
onwards...
Traceback (most recent call last):
File "/home/sdiff/src/stable-diffusion-grpcserver/sdgrpcserver/services/generate.py", line 412, in Generate
with self._manager.with_engine(request.engine_id) as engine:
File "/home/sdiff/.conda/envs/sd-grpc-server/lib/python3.10/contextlib.py", line 135, in __enter__
return next(self.gen)
File "/home/sdiff/src/stable-diffusion-grpcserver/sdgrpcserver/manager.py", line 1379, in with_engine
spec = self._find_spec(id=id)
File "/home/sdiff/src/stable-diffusion-grpcserver/sdgrpcserver/manager.py", line 1274, in _find_spec
res = self._find_specs(id=id, model_id=model_id)
File "/home/sdiff/src/stable-diffusion-grpcserver/sdgrpcserver/manager.py", line 1255, in _find_specs
raise ValueError("Must provide one of id or model_id")
ValueError: Must provide one of id or model_id
WSGI application error
Traceback (most recent call last):
File "/home/sdiff/.conda/envs/sd-grpc-server/lib/python3.10/site-packages/twisted/python/threadpool.py", line 260, in <lambda>
inContext.theWork = lambda: context.call( # type: ignore[attr-defined]
File "/home/sdiff/.conda/envs/sd-grpc-server/lib/python3.10/site-packages/twisted/python/context.py", line 117, in callWithContext
return self.currentContext().callWithContext(ctx, func, *args, **kw)
File "/home/sdiff/.conda/envs/sd-grpc-server/lib/python3.10/site-packages/twisted/python/context.py", line 84, in callWithContext
self.contexts.pop()
File "/home/sdiff/.conda/envs/sd-grpc-server/lib/python3.10/site-packages/twisted/web/wsgi.py", line 530, in run
self.started = True
--- <exception caught here> ---
File "/home/sdiff/.conda/envs/sd-grpc-server/lib/python3.10/site-packages/twisted/web/wsgi.py", line 500, in run
for elem in appIterator:
File "/home/sdiff/src/stable-diffusion-grpcserver/sdgrpcserver/sonora/wsgi.py", line 123, in _do_grpc_request
yield from self._do_streaming_response(
builtins.RuntimeError: generator raised StopIteration
For this sort of invalid request, we do set some context data here: https://github.com/hafriedlander/stable-diffusion-grpcserver/blob/dev/sdgrpcserver/services/generate.py#L529-L531 and don't propagate the exception. Without this change that context is not returned to the user and they only get a generic "unknown" gRPC error.
StopIteration
in particular isn't really an error. It's just the way all python iterators say "there's no more items in this iterator".
If it bubbles up from here though, it is a real error: https://github.com/hafriedlander/stable-diffusion-grpcserver/blob/dev/sdgrpcserver/sonora/wsgi.py#L136 - if there are zero answers, next()
cannot work.
Ah ha. Cool, thanks. I think that's incorrect behaviour by Sonora - it should be OK to just return nothing from a streaming rpc handler. Let me look it up...
From what I can tell, it's up to the individual Servicer instances to catch exceptions and call context.abort if there's an issue. Dropping an exception out of a Servicer isn't allowed.
Hmm. Don't see anything in the spec, but the Google grpc adapter does seem to work with empty streams.
I think I'd prefer to fix the behaviour here in Sonora, and then wrap all the Servicer methods with something that catches all exceptions and turns it into a context.Abort call. (There are now three gateways that access the Servicer, so fixing this in a way that fixes it for all of them is better.)
sgtm
If an error like
StopIteration
is thrown in _do_streaming_response, the HTTP response does not contain any of the error details and the grpc status defaults to2 - Unknown
.This change makes some of the header/trailer setup more reusable and passes the trailer data to the response in the error case.