Pandas 1.4+ can read CSV files via the Arrow library. It is faster but does not support all the options of the default reader.
df = pd.read_csv("some.csv", engine="pyarrow")
Or we could bypass Pandas by using pyarrow directly. I think that is a good idea for feather and HDF files, where the goal is maximum speed (I don't expect people to often read files in those formats that were not written by LArray). For CSV, however, the most important point is to be able to read anything we throw at it, and only then to do it quickly if possible. PyArrow has many options for reading CSV files, but I would rather stick with the Pandas API. The question is thus how stable the PyArrow backend is when only basic options are given.
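To make the "read anything first, fast if possible" idea concrete, here is a minimal sketch of how we could stay on the Pandas API: try the pyarrow engine and fall back to the default reader when that fails. The helper name `read_csv_prefer_pyarrow` is hypothetical (it exists in neither Pandas nor LArray), and the set of exceptions caught is my assumption about how Pandas reports a missing pyarrow install or an unsupported option. It would need checking against the Pandas versions we support.

```python
import pandas as pd


def read_csv_prefer_pyarrow(path, **kwargs):
    """Hypothetical helper: try the pyarrow engine, else use the default reader."""
    try:
        return pd.read_csv(path, engine="pyarrow", **kwargs)
    except (ImportError, ValueError):
        # Assumption: ImportError means pyarrow is not installed, and
        # ValueError covers options the pyarrow engine rejects as well as
        # an older Pandas that does not know this engine at all.
        return pd.read_csv(path, **kwargs)
```

With only basic options this should take the fast path. Something like `read_csv_prefer_pyarrow("some.csv", nrows=10)` would, as far as I can tell, go through the default reader, since `nrows` is one of the options the pyarrow engine does not accept. The drawback is that a file which fails on the fast path is parsed twice.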