pandas-dev / pandas

Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more
https://pandas.pydata.org
BSD 3-Clause "New" or "Revised" License

Pyarrow CSV Reader Integration Tracker #38872

Open lithomas1 opened 3 years ago

lithomas1 commented 3 years ago

Issue tracking Pyarrow engine integration in read_csv (for after #38370 is merged).

Current unsupported options

arw2019 commented 3 years ago

FTR not all of these options are being targeted on the Arrow side (at least for the time being)

xref discussion in https://github.com/pandas-dev/pandas/issues/23697

phofl commented 2 years ago

@lithomas1 As long as we don't bump the minimum pyarrow version, we are really limited in what we can support.

Theoretically, we could write a function determining the unsupported actions based on the pyarrow version which is installed. This would allow us to circumvent the minimum pyarrow version if users have a newer pyarrow installed.

jbrockmendel commented 1 year ago

@lithomas1 status here?

lithomas1 commented 1 year ago

Sorry for the silence here (forgot to post). I'm planning on updating this list at the very least (some options are deprecated). I'll try to start chipping away at this again this week or next.