apache / arrow

Apache Arrow is the universal columnar format and multi-language toolbox for fast data interchange and in-memory analytics
https://arrow.apache.org/
Apache License 2.0
14.55k stars 3.54k forks source link

[Python] Fix PytestUnraisableExceptionWarning in substrait test #14766

Open milesgranger opened 1 year ago

milesgranger commented 1 year ago

Describe the bug, including details regarding any error messages, version, and platform.

test_named_table_invalid_table_name generates a PytestUnraisableExceptionWarning in the tests. The exception itself is:

Traceback (most recent call last): File "/home/joris/scipy/repos/arrow/python/pyarrow/tests/test_substrait.py", line 238, in table_provider raise Exception("Unrecognized table name") Exception: Unrecognized table name

The exception ought to be raised properly instead of 'hidden' as a warning.

Component(s)

Python

milesgranger commented 1 year ago

To expand, it appears the exception in the table_provider is raised in executor->Init then further down into engine::substrait::serde::DeserializePlans.

But while it seems to raise the ArrowInvalid properly, somewhere it should probably clear the Python exception raised by the table_provider; or better still, use/append the first exception's message as the ArrowInvalid message?

vibhatha commented 1 year ago

@milesgranger this has been bothering me too. I wasn't quite sure the best way to resolve this at that time. Thanks for raising this. I will look into this. Are you planning to work on this?

milesgranger commented 1 year ago

Great! Not planning on it, at least not in the near feature. :+1:

vibhatha commented 1 year ago

Okay, I will take a look at this. @pitrou or @jorisvandenbossche could you assign this to me?

kou commented 1 year ago

Assigned.

vibhatha commented 1 year ago

Thank you @kou

jorisvandenbossche commented 1 year ago

My first guess would be that the _create_named_table_provider is missing an except *: (to indicate to cython that this function can raise errors, and thus that it should check if any error is raised). See https://cython.readthedocs.io/en/latest/src/userguide/language_basics.html#error-return-values

vibhatha commented 1 year ago

@jorisvandenbossche thanks for the idea. I will check this.

vibhatha commented 5 months ago

I am removing my assignment, since I am mostly maintaining Arrow Java.