We're notably using Pandas for DataFrame.read_csv. That could probably be replaced with pyarrow.csv.read_csv, which would allow removing Pandas from the list of dependencies, leaving it as an optional dependency only needed for the from_pandas and to_pandas methods (with Pandas imported within the method body).
Arrow seems to be a lot faster at reading CSV files, and we need it anyway for reading and writing Parquet files, so this would probably let us drop Pandas as a required dependency, something we've never liked and have sought to replace.