apache / arrow

Apache Arrow is a multi-language toolbox for accelerated data interchange and in-memory processing
https://arrow.apache.org/
Apache License 2.0
14.22k stars 3.46k forks

[Python] Writing pandas.DataFrame containing an object dtype column with int values fails #43298

Open MatanDascaluStoredot opened 1 month ago

MatanDascaluStoredot commented 1 month ago

Describe the bug, including details regarding any error messages, version, and platform.

I have a pandas.DataFrame with a column 电源状态 whose unique values are:

df['电源状态'].unique() -> array(['静置', 'CC充电', 'CC放电', 'CV', 46], dtype=object)
df['电源状态'].dtype -> dtype('O') (dtype auto-generated during read)

Calling df.to_feather('test.feather') raises the following error:

ArrowTypeError: ("Expected bytes, got a 'int' object", 'Conversion failed for column 电源状态 with type object')
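
The error message suggests that the column mixes str values with the int 46, and that the conversion expected every value to be string-like. As a minimal sketch for confirming this (standard library only; the helper names type_counts and stringify are hypothetical and not part of pandas or pyarrow), the values reported above can be checked and made uniform like this:

```python
from collections import Counter

# Values copied from the df['电源状态'].unique() output in this report.
values = ['静置', 'CC充电', 'CC放电', 'CV', 46]


def type_counts(column):
    """Count how many values of each Python type a column holds."""
    return Counter(type(v).__name__ for v in column)


def stringify(column):
    """Return a copy in which every non-str value is converted with str()."""
    return [v if isinstance(v, str) else str(v) for v in column]


counts = type_counts(values)
mixed = len(counts) > 1      # True here: both str and int are present
cleaned = stringify(values)  # the int 46 becomes the string '46'
print(counts, mixed, cleaned)
```

On the DataFrame itself, df['电源状态'].map(type).value_counts() should give the same kind of breakdown, and df['电源状态'].astype(str) is one possible way to make the column uniform before calling to_feather. Depending on the pandas version, astype(str) may also turn missing values into the literal string 'nan', so this is a workaround to evaluate rather than a definitive fix.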

Component(s)

Python

raulcd commented 1 month ago

I'm not sure, but this feels more like a pandas issue than a pyarrow issue. Can you share the stack trace so we can see where the problem is coming from? Maybe @jorisvandenbossche knows best.