Closed: RandomFractals closed this issue 1 year ago
Loading the Chicago crimes data with pandas takes about 26 seconds, which is much slower than loading it with PyArrow, DuckDB, or Polars:
https://github.com/RandomFractals/chicago-crimes/blob/main/notebooks/chicago-crimes-pandas.ipynb
See the new With Pandas and With Matplotlib sections in the docs:
https://github.com/RandomFractals/chicago-crimes#with-pandas
Use them to compare CSV data loading time with Polars (#7), PyArrow (#17), and DuckDB (#4).
Pandas read CSV docs: https://pandas.pydata.org/docs/reference/api/pandas.read_csv.html
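For anyone who wants to reproduce the timing, here is a minimal sketch of how the load could be measured. The helper name `timed_read_csv` and the tiny in-memory sample are illustrative assumptions, not code from the notebook; the column names only mimic the dataset's layout. Point the helper at the real CSV file to get a number comparable to the ~26 seconds above.

```python
import io
import time

import pandas as pd


def timed_read_csv(path_or_buffer, **read_csv_kwargs):
    """Load a CSV with pandas and return (DataFrame, elapsed seconds).

    Hypothetical helper for illustration; any extra keyword arguments
    are passed straight through to pandas.read_csv.
    """
    start = time.perf_counter()
    df = pd.read_csv(path_or_buffer, **read_csv_kwargs)
    elapsed = time.perf_counter() - start
    return df, elapsed


# Tiny stand-in for the real file (assumed column names, made-up rows),
# so the sketch runs without downloading the full dataset.
sample = io.StringIO(
    "ID,Primary Type,Arrest\n"
    "1,THEFT,False\n"
    "2,BATTERY,True\n"
)

df, seconds = timed_read_csv(sample)
print(f"Loaded {len(df):,} rows in {seconds:.3f} s")
```

With the real file, options such as `usecols=[...]` or `engine="pyarrow"` (which needs the `pyarrow` package installed) can be passed as keyword arguments to see how they affect load time.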
Also, add some plots from the old Chicago crimes Jupyter notebook:
https://github.com/RandomFractals/ChicagoCrimes/blob/master/notebooks/all-chicago-crime-charts.ipynb