Open bcalvert-graft opened 1 month ago
Hi @bcalvert-graft,
Thanks for the detailed report! Your insights are valuable. We appreciate you considering a quick fix PR.
You're absolutely right that the current behavior is counterproductive. We originally opted for `raise e from e` to minimize error stack depth in our catch-retry logic for network fluctuations and rate limits. However, it seems the trade-off is not worth the current issue.
/unassign /assign @bcalvert-graft
Thanks @XuanYang-cn for the fast reply. I'll open up a PR as soon as I can
> We originally opted for `raise e from e` to minimize error stack depth in our catch-retry logic for network fluctuations and rate limits.
Also, thank you for clarifying the motivation for the `raise err from err` choice; I hadn't considered that aspect of it.
@XuanYang-cn and whoever else I should ping, I've opened up #2225 from a fork of the `pymilvus` repo. The scope of the changes in that PR ended up being larger than the three spots I flagged in the issue description above, because I used `grep` to find instances of the `raise <e> from <e>` pattern and found more. I had to tweak those as well to get the test case written above to pass.
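For anyone who wants to repeat that search without `grep`, here is a minimal Python sketch of the same idea. The regex and the helper name `find_self_cause_raises` are illustrative assumptions, not the exact command used for the PR, and a line-based regex will miss multi-line or unusually formatted `raise` statements.

```python
import re

# Matches `raise <name> from <name>` where both names are the same identifier.
# The backreference \1 requires the second name to equal the first.
SELF_CAUSE = re.compile(r"\braise\s+(\w+)\s+from\s+\1\b")


def find_self_cause_raises(source: str) -> list[tuple[int, str]]:
    """Return (line number, stripped line) for each self-chaining raise."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        if SELF_CAUSE.search(line):
            hits.append((lineno, line.strip()))
    return hits


# The sample is only scanned as text, never executed.
sample = """\
try:
    call()
except Exception as e:
    raise e from e
except ValueError as err:
    raise RuntimeError("wrapped") from err
"""

print(find_self_cause_raises(sample))
```

Only the fourth line of the sample is reported; chaining to a different exception, as in the last line, is left alone.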
Hi all,
Gentle bump on this thread. As mentioned above, I've opened up #2225. Are there additional things I should do to move this bugfix forward?
Is there an existing issue for this?
Describe the bug
Hello Milvus maintainers,
Thank you in advance for reading my bug-report.
There are three spots in `grpc_handler.py` (first spot, second spot and third spot) that use a `raise err from err` pattern to reraise an `Exception`.
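As a rough illustration of that pattern in isolation, the sketch below reproduces it without Milvus. The functions `flaky_call` and `handler` are hypothetical stand-ins, not code from `grpc_handler.py`.

```python
def flaky_call():
    # Hypothetical stand-in for a gRPC call that fails.
    raise ConnectionError("simulated network failure")


def handler():
    try:
        flaky_call()
    except Exception as err:
        # The pattern under discussion: the exception is chained to itself.
        raise err from err


try:
    handler()
except ConnectionError as caught:
    # Keep a reference, since `caught` is unbound when the except block ends.
    caught_exc = caught

# The exception is now its own cause.
print(caught_exc.__cause__ is caught_exc)
```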
Respectfully, this `raise err from err` is not a great way to reraise exceptions; instead, it should follow the form mentioned in the Python docs.
Why is `raise err from err` not great? It attaches the `err` `Exception` as the `__cause__` to itself. As I understand it, this is not the desired semantics for `__cause__`, and it breaks any tools that recursively unroll chained `Exception`s (e.g. for processing stack traces). As a concrete example of such a tool, we actually uncovered this seeming bug in `grpc_handler.py` when using Milvus as part of a larger Dask computation; specifically, as part of Dask's error-handling, the library uses a utility called `collect_causes`.
When `collect_causes` encounters an `Exception` raised from one of the linked spots in `grpc_handler.py`, it triggers an infinite loop of unrolling `e.__cause__`, consequently leaking memory until the process is killed.
Assuming you agree with the quick fix, I'm happy to submit a PR, but wanted to run it by you first to make sure I'm not missing something.
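To make the infinite-loop failure mode concrete, here is a sketch of a cause-walking helper. `collect_causes_naive` is my assumption about the general shape of such a utility, not a copy of Dask's source, and `collect_causes_safe` is a hypothetical cycle-aware variant added only for comparison.

```python
def collect_causes_naive(exc: BaseException) -> list[BaseException]:
    # Follows __cause__ until it reaches None. If an exception is its own
    # cause, this never terminates and the list grows without bound.
    causes = []
    while exc.__cause__ is not None:
        causes.append(exc.__cause__)
        exc = exc.__cause__
    return causes


def collect_causes_safe(exc: BaseException) -> list[BaseException]:
    # Same walk, but stops as soon as an exception is seen a second time.
    causes = []
    seen = {id(exc)}
    current = exc.__cause__
    while current is not None and id(current) not in seen:
        causes.append(current)
        seen.add(id(current))
        current = current.__cause__
    return causes


def make_self_caused() -> BaseException:
    try:
        try:
            raise ValueError("boom")
        except ValueError as err:
            raise err from err  # exception becomes its own __cause__
    except ValueError as caught:
        return caught


def make_chained() -> BaseException:
    try:
        try:
            raise KeyError("inner")
        except KeyError as inner:
            raise RuntimeError("outer") from inner  # ordinary chaining
    except RuntimeError as caught:
        return caught


# The naive walk is fine on an ordinary chain...
print(collect_causes_naive(make_chained()))
# ...but must not be called on make_self_caused(); only the safe walk returns.
print(collect_causes_safe(make_self_caused()))
```

The guard is shown only to demonstrate the cycle; the point of this report is that the cycle should not be created in the first place.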
Cheers, Brian
Expected Behavior
When handling `Exception`s raised by the linked methods in `grpc_handler.py`, the `Exception`s should not be attached as their own `__cause__`.
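One way to meet this expectation is a bare `raise`, which re-raises the exception currently being handled without touching `__cause__`. The sketch below is an assumption about what a fix could look like, using a hypothetical `handler_fixed` function; the actual change in the PR may differ.

```python
def handler_fixed():
    try:
        # Hypothetical stand-in for a failing gRPC call.
        raise TimeoutError("simulated timeout")
    except Exception:
        # Logging or retry bookkeeping would go here.
        raise  # re-raise the active exception; __cause__ is left unchanged


try:
    handler_fixed()
except TimeoutError as caught:
    # Keep a reference, since `caught` is unbound when the except block ends.
    fixed_exc = caught

# No explicit cause was set, so there is no self-reference to unroll.
print(fixed_exc.__cause__ is None)
```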
Steps/Code To Reproduce behavior
In one process, run
In another process, run
The `In [6]:` step will throw an `AssertionError`.
Environment details
Anything else?
No response