xorbitsai / xorbits

Scalable Python DS & ML, in an API compatible & lightning fast way.
https://xorbits.readthedocs.io
Apache License 2.0
1.1k stars, 67 forks

ENH: Default enable arrow dtype for read_parquet if pandas>=2.1 #685

Closed. codingl2k1 closed this 12 months ago

codingl2k1 commented 12 months ago

What do these changes do?

Known issue:

Solution:

With this PR, performance improves by about 25%.
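The description above does not show the change itself. As a rough illustration of the version gate named in the PR title, the default could be resolved along these lines. This is a minimal sketch, not the code merged in this PR: the helper names `default_use_arrow_dtype` and `resolve_use_arrow_dtype` are hypothetical, and the only fact taken from the PR is the `pandas>=2.1` threshold.

```python
import re
from typing import Optional

# Threshold taken from the PR title ("pandas>=2.1"); the constant name is hypothetical.
MIN_PANDAS_FOR_ARROW_DEFAULT = (2, 1)


def default_use_arrow_dtype(pandas_version: str) -> bool:
    """Return True when the given pandas version string is at least 2.1."""
    # Only the leading "major.minor" part matters, so suffixes such as
    # ".0rc0" or ".3" are ignored.
    match = re.match(r"(\d+)\.(\d+)", pandas_version)
    if match is None:
        raise ValueError(f"unrecognized pandas version: {pandas_version!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    return (major, minor) >= MIN_PANDAS_FOR_ARROW_DEFAULT


def resolve_use_arrow_dtype(user_value: Optional[bool], pandas_version: str) -> bool:
    """An explicit user choice wins; None falls back to the version-based default."""
    if user_value is not None:
        return user_value
    return default_use_arrow_dtype(pandas_version)
```

In practice the version string would presumably come from `pandas.__version__`. Passing it in as an argument keeps the sketch free of third-party imports, and treating `None` as "not specified" lets a caller still turn the arrow dtype off explicitly on newer pandas versions.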

Related issue number

Fixes #xxxx

Check code requirements

codecov[bot] commented 12 months ago

Codecov Report

Merging #685 (5188d02) into main (d6aeeb5) will decrease coverage by 12.81%. The diff coverage is 60.00%.

@@             Coverage Diff             @@
##             main     #685       +/-   ##
===========================================
- Coverage   82.50%   69.70%   -12.81%     
===========================================
  Files        1025     1025               
  Lines       79331    79335        +4     
  Branches    16440    16442        +2     
===========================================
- Hits        65456    55299    -10157     
- Misses      11614    21765    +10151     
- Partials     2261     2271       +10     
Flag | Coverage Δ
unittests | 69.61% <40.00%> (-12.78%) ↓

Flags with carried forward coverage won't be shown.

Files Changed | Coverage Δ
...xorbits/_mars/dataframe/datasource/read_parquet.py | 70.41% <60.00%> (-1.30%) ↓

... and 257 files with indirect coverage changes
