root-11 / tablite

multiprocessing enabled out-of-memory data analysis library for tabular data.
MIT License

Skip empty rows and fixes #141

Closed: realratchet closed this 8 months ago

realratchet commented 8 months ago

Added the ability to skip completely empty or partially empty rows in the text/excel/ods readers. Replaced the pyexcel ods reader with the pandas reader. Both read the document in full, but pyexcel has a bad tendency to skip empty rows at the beginning and in the middle of the document, which ruins parity. Fixed an issue in the excel reader where it tried to determine the document shape from the last row, which caused problems when the last row has a different shape than the others. It now creates a bounding box from the entire document instead.
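To illustrate the two behaviours described above, here is a minimal sketch in plain Python. It is not the code from this PR: the names `keep_row`, `bounding_box`, `pad_rows` and the mode strings `"NONE"`, `"ALL"`, `"ANY"` are assumptions chosen for the example, and rows are modelled as simple lists of cell values.

```python
from typing import List, Optional, Sequence, Tuple

Cell = Optional[str]


def is_blank(cell: Cell) -> bool:
    """A cell counts as blank if it is None or only whitespace."""
    return cell is None or str(cell).strip() == ""


def keep_row(row: Sequence[Cell], skip_empty: str = "NONE") -> bool:
    """Decide whether a row survives filtering (hypothetical modes).

    "NONE": keep every row.
    "ALL":  drop rows in which every cell is blank (completely empty).
    "ANY":  drop rows in which at least one cell is blank (partially empty).
    """
    if skip_empty == "NONE":
        return True
    # A zero-length row is treated as one blank cell so it is dropped in both modes.
    blanks = [is_blank(c) for c in row] or [True]
    if skip_empty == "ALL":
        return not all(blanks)
    if skip_empty == "ANY":
        return not any(blanks)
    raise ValueError(f"unknown skip_empty mode: {skip_empty!r}")


def bounding_box(rows: Sequence[Sequence[Cell]]) -> Tuple[int, int]:
    """Shape from the whole document: row count and the widest row,
    rather than trusting the length of the last row."""
    n_cols = max((len(r) for r in rows), default=0)
    return len(rows), n_cols


def pad_rows(rows: Sequence[Sequence[Cell]], width: int) -> List[List[Cell]]:
    """Pad short rows with None so every row matches the bounding box width."""
    return [list(r) + [None] * (width - len(r)) for r in rows]


rows = [["a", "b", "c"], [None, "", None], ["1", None, "3"], ["x"]]
n_rows, n_cols = bounding_box(rows)  # the last row alone would suggest width 1
padded = pad_rows(rows, n_cols)
without_empty = [r for r in padded if keep_row(r, "ALL")]
without_partial = [r for r in padded if keep_row(r, "ANY")]
```

In this sketch the shape is computed before filtering, so a short or blank final row cannot shrink the detected width.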

codecov-commenter commented 8 months ago

Codecov Report

Attention: Patch coverage is 96.15385%, with 2 lines in your changes missing coverage. Please review.

Project coverage is 81.96%. Comparing base (19ad711) to head (2fdecb1).

| Files | Patch % | Lines |
|---|---|---|
| tablite/import_utils.py | 97.29% | 1 Missing :warning: |
| tablite/nimlite.py | 80.00% | 1 Missing :warning: |


Additional details and impacted files

```diff
@@            Coverage Diff             @@
##           master     #141      +/-   ##
==========================================
+ Coverage   81.86%   81.96%   +0.10%
==========================================
  Files          27       27
  Lines        4135     4159      +24
==========================================
+ Hits         3385     3409      +24
  Misses        750      750
```
