chanzuckerberg / cellxgene-census

CZ CELLxGENE Discover Census
https://chanzuckerberg.github.io/cellxgene-census/
MIT License
76 stars 20 forks

[builder] enable Arrow Dictionary feature flag #1064

Closed bkmartinjr closed 4 months ago

bkmartinjr commented 5 months ago

This PR enables the Arrow dictionary type (aka TileDB enum, Pandas Categorical, etc.) when building the Census. It affects the string columns in the obs dataframe that contain repetitive labels, such as cell_type. The primary impact is more efficient memory use for end users (readers) of the Census obs dataframe.
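To illustrate the memory effect described above, here is a minimal sketch using plain pandas. It is not the builder code from this PR: the label values, the row count, and the variable names are hypothetical stand-ins for a repetitive obs column such as cell_type, and the exact savings will depend on the data.

```python
import pandas as pd

# Hypothetical stand-in for a repetitive obs label column such as cell_type.
labels = ["T cell", "B cell", "neuron"] * 10_000

# The same values stored two ways: one Python string per row,
# versus a Categorical (integer codes plus a table of distinct labels).
as_strings = pd.Series(labels, dtype="object")
as_categorical = as_strings.astype("category")

# deep=True counts the string objects themselves, not just pointers to them.
string_bytes = as_strings.memory_usage(index=False, deep=True)
categorical_bytes = as_categorical.memory_usage(index=False, deep=True)

print(f"plain strings: {string_bytes:,} bytes")
print(f"categorical:   {categorical_bytes:,} bytes")
print(f"distinct labels: {list(as_categorical.cat.categories)}")
```

The values a reader sees are unchanged; only the storage differs, which is why the saving grows with the number of rows and shrinks as the number of distinct labels approaches the row count.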

Fixes #604

Other changes:

codecov[bot] commented 5 months ago

Codecov Report

Attention: Patch coverage is 33.33%, with 2 lines in your changes missing coverage. Please review.

Project coverage is 81.35%. Comparing base (a5dbdef) to head (d1dde47).

| Files | Patch % | Lines |
|---|---|---|
| ...llxgene_census_builder/build_soma/validate_soma.py | 0.00% | 2 Missing :warning: |

Additional details and impacted files

```diff
@@            Coverage Diff             @@
##             main    #1064      +/-   ##
==========================================
+ Coverage   81.33%   81.35%   +0.01%
==========================================
  Files          73       73
  Lines        5566     5566
==========================================
+ Hits         4527     4528       +1
+ Misses       1039     1038       -1
```

| [Flag](https://app.codecov.io/gh/chanzuckerberg/cellxgene-census/pull/1064/flags?src=pr&el=flags&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=chanzuckerberg) | Coverage Δ | |
|---|---|---|
| [unittests](https://app.codecov.io/gh/chanzuckerberg/cellxgene-census/pull/1064/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=chanzuckerberg) | `81.35% <33.33%> (+0.01%)` | :arrow_up: |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=chanzuckerberg#carryforward-flags-in-the-pull-request-comment) to find out more.
