pytoolz / cytoolz

Cython implementation of Toolz: High performance functional utilities
Other
1.01k stars 72 forks source link

Throwing one exception in dicttoolz.itemmap() results in TypeError in cytoolz but NOT in toolz #162

Open mrwizard82d1 opened 2 years ago

mrwizard82d1 commented 2 years ago

I have a deadline tomorrow and so have not been able to identify a small code snippet exhibiting the behavior.

I am running a unit test in my package. This unit test expects a custom exception to be thrown.

Using toolz, this unit test passes. I have replaced all statements import toolz.curried as toolz with import cytoolz.curried as toolz in my code base. After this change, a single unit test fails with the error:

AssertionError: 
Expected: Expected a callable raising <class 'orchid.native_data_frame_adapter.DataFrameAdapterDateTimeError'>
     but: TypeError("'cytoolz.functoolz.curry' object is not iterable") of type <class 'TypeError'> was raised instead
eriknw commented 2 years ago

Huh, interesting. Thanks for the report, because we do want toolz and cytoolz to match. Any chance you could share the code where this occurs?

mrwizard82d1 commented 2 years ago

I can share the Python code. It's in the repository https://github.com/Reveal-Energy-Services/orchid-python-api; however, this code wraps a .NET API that is not generally available.

The test code module is in tests/test_native_data_frame_adapter.py. The failing test is test_net_data_frame_with_date_time_column_raises_error. This test expects an exception of type DataFrameAdapterDateTimeError to be thrown. This exception is thrown from NativeDataFrameAdapter.net_value_to_python_value (in the file orchid/native_data_frame_adapter.py); however, as the exception propagates upward through the toolz / cytoolz stack, this code raises an error.

Strangely, when run this morning, the error reported by PyCharm is that cytools.functional.curry has no attribute items:

AssertionError: 
Expected: Expected a callable raising <class 'orchid.native_data_frame_adapter.DataFrameAdapterDateTimeError'>
     but: AttributeError("'cytoolz.functoolz.curry' object has no attribute 'items'") of type <class 'AttributeError'> was raised instead

If available, I would be willing to get on a Zoom or Teams call with someone to walk through my issue this afternoon or evening (CDT -06:00). I'll try to look into it in more detail this afternoon my time to try to identify the issue more specifically.

mrwizard82d1 commented 2 years ago

I changed my unit test code to the following:

        try:
            sut.pandas_data_frame()
        except dfa.DataFrameAdapterDateTimeError as dte:
            print(dte)
            raise
        except AttributeError as ae:
            print(ae)
            raise

When I run this test, it fails with an AttributeError as I expect and the test runner prints the following details:

AILED (errors=1)
'cytoolz.functoolz.curry' object has no attribute 'items'

Error
Traceback (most recent call last):
  File "C:\src\orchid-python-api\tests\test_native_data_frame_adapter.py", line 378, in test_net_data_frame_with_date_time_column_raises_error
    sut.pandas_data_frame()
  File "C:\src\orchid-python-api\orchid\native_data_frame_adapter.py", line 92, in pandas_data_frame
    return _table_to_data_frame(self.dom_object.DataTable)
  File "C:\src\orchid-python-api\orchid\native_data_frame_adapter.py", line 166, in _table_to_data_frame
    result = toolz.pipe(data_table,
  File "cytoolz/functoolz.pyx", line 667, in cytoolz.functoolz.pipe
  File "cytoolz/functoolz.pyx", line 642, in cytoolz.functoolz.c_pipe
  File "cytoolz/functoolz.pyx", line 250, in cytoolz.functoolz.curry.__call__
  File "cytoolz/dicttoolz.pyx", line 207, in cytoolz.dicttoolz.keymap
  File "cytoolz/dicttoolz.pyx", line 228, in cytoolz.dicttoolz.keymap
  File "cytoolz/dicttoolz.pyx", line 71, in cytoolz.dicttoolz.get_map_iter
AttributeError: 'cytoolz.functoolz.curry' object has no attribute 'items'

The error appears to occur because a curried function (perhaps toolz.itemmap which is curried in the pipe and which calls my function, net_value_to_python_value, which throws the the originally sought DataFrameAdapterDateTimeError exception (inherits from TypeError) that seems to eventually cause the AttributeError.

mrwizard82d1 commented 2 years ago

I've attached a small zipped file that demonstrates the issue:

foo.zip

Hopefully you can confirm the issue with this file and address it appropriately.

Thanks!

eriknw commented 2 years ago

Aha. This is interesting--thanks again @mrwizard82d1. inspect.signature(cytoolz.itemmap) doesn't know that factory is a keyword argument with a default value:

>>> import cytoolz
>>> import inspect
>>> sig = inspect.signature(cytoolz.itemmap)
>>> sig
<Signature (func, d, factory)>
>>> sig.parameters['factory'].kind
<_ParameterKind.POSITIONAL_OR_KEYWORD: 1>
>>> sig.parameters['factory'].default
<class 'inspect._empty'>

Incorrect signature introspection will cause cytoolz.curry to behave improperly, which is what you're seeing.

In old versions of Python/Cython, inspect.signature didn't return a signature for Cython functions, and we were able to fallback to our own signature registry in toolz._signatures and cytoolz._signatures.

So, we can further investigate why inspect.signature(cython_func_with_keywords) gives incorrect signatures, and then hopefully find a fix.

We could also choose to use our signature registry over inspect.signature when appropriate. But, I want to use inspect.signature for builtin CPython functions, which I trust to be more up to date than our signature registry.

eriknw commented 2 years ago

Oh, this is a duplicate of #135. Perhaps supposedly fixed in Cython? https://github.com/cython/cython/issues/1864

mrwizard82d1 commented 2 years ago

Thanks, @eriknw, for looking into this issue!