areshytko / typedframe

Typed wrappers over pandas DataFrames with schema validation
https://typedframe.readthedocs.io/
MIT License
98 stars, 7 forks

list attributes "TypeError: Cannot interpret 'list[str]' as a data type" #9

Open danodonovan opened 1 year ago

danodonovan commented 1 year ago

Apologies if I'm using this incorrectly, but I think there may be an issue with list types. For example:

import pandas as pd
from typedframe import TypedDataFrame

class StringFrame(TypedDataFrame):
    schema = {
        "col1": int,
        "col2": str,
        "col3": list[str]
    }

df = pd.DataFrame({"col1": [0], "col2": ["test-str"], "col3": [["test-list"]]})

tdf = StringFrame.convert(df)

Calling StringFrame.convert(df) raises this exception:

  File "/.../lib/python3.11/site-packages/pandas/core/dtypes/common.py", line 1691, in pandas_dtype
    npdtype = np.dtype(dtype)
              ^^^^^^^^^^^^^^^
TypeError: Cannot interpret 'list[str]' as a data type

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/dan/healx/healnet/healnet/example.py", line 14, in <module>
    tdf = StringFrame.convert(df)
          ^^^^^^^^^^^^^^^^^^^^^^^
  File "/.../lib/python3.11/site-packages/typedframe/pandas_.py", line 82, in convert
    raise AssertionError(f"Failed to convert column: {col}") from e
AssertionError: Failed to convert column: col3

DurivetMatthias commented 10 months ago

I'm running into the same issue. In my case, I'm trying to use Optional[str] as a type.

It seems the current codebase does not support more complex types. It only supports simple types such as int, str and bool (the supported types are listed in the docs).
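
For reference, here is a minimal sketch of why parameterized types fail and one possible fallback. It assumes the root cause is that the schema entry ends up in np.dtype, as the traceback above suggests. The helper to_pandas_dtype is hypothetical and not part of typedframe.

```python
import typing

import numpy as np
import pandas as pd


def to_pandas_dtype(annotation):
    """Hypothetical helper (not part of typedframe): map a schema
    annotation to something pandas can use as a dtype."""
    # Parameterized generics such as list[str] or Optional[str] have a
    # typing origin; plain types such as int or str do not.
    if typing.get_origin(annotation) is not None:
        return np.dtype(object)
    return annotation


df = pd.DataFrame({"col1": [0], "col2": ["test-str"], "col3": [["test-list"]]})

# NumPy cannot interpret a parameterized generic as a dtype, which is
# the TypeError reported in this issue.
try:
    np.dtype(list[str])
except TypeError as exc:
    print(f"np.dtype rejected list[str]: {exc}")

# Falling back to object dtype lets the conversion go through.
converted = df["col3"].astype(to_pandas_dtype(list[str]))
print(converted.dtype)
```

The trade-off of this fallback is that the column is only stored as object dtype: neither the container type nor the element type (the str in list[str]) is validated, so a real fix in the library would probably need a per-value check on top.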

@areshytko any plans to continue the work here? If not I might try extending it myself.