Closed Lioscro closed 3 years ago
Merging #137 (af4997e) into master (53e705b) will decrease coverage by 0.04%. The diff coverage is 82.85%.
@@ Coverage Diff @@
## master #137 +/- ##
==========================================
- Coverage 86.28% 86.24% -0.05%
==========================================
Files 65 65
Lines 4536 4530 -6
==========================================
- Hits 3914 3907 -7
- Misses 622 623 +1
Impacted Files | Coverage Δ | |
---|---|---|
cassiopeia/preprocess/cassiopeia_preprocess.py | 23.07% <0.00%> (ø) | |
cassiopeia/preprocess/constants.py | 100.00% <ø> (ø) | |
cassiopeia/preprocess/UMI_utils.py | 95.25% <50.00%> (-0.42%) | :arrow_down: |
cassiopeia/preprocess/pipeline.py | 71.01% <77.77%> (-0.16%) | :arrow_down: |
cassiopeia/preprocess/utilities.py | 96.56% <100.00%> (-0.07%) | :arrow_down: |
Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 53e705b...af4997e.
I noticed a few inconsistencies between the variable names and docstrings and the actual filtering behavior in `pp.utilities.filter_cells` and `pp.utilities.filter_umis`. This PR addresses these inconsistencies.

- Renamed the `umi_read_thresh` argument of `pp.filter_molecule_table` to `min_reads_per_umi` to match the naming of the other variables. The corresponding argument of `pp.utilities.filter_umis` has also been changed.
- Each threshold argument (`min_umi_per_cell`, `min_avg_reads_per_umi`, `min_reads_per_umi`) specifies the minimum number required for a cell/UMI to pass filtering. Previously, when a value equaled any one of these thresholds, the cell/UMI would be filtered out (due to the use of a `<=` operator).
- [...] `pp.filter_molecule_table`.
- Refactored `pp.utilities.filter_cells` to perform pandas operations more efficiently by using `groupby` operations.

Overall, these changes result in around a 10x speedup of the `pp.filter_molecule_table` function.
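
For reviewers, here is a minimal sketch of the two ideas above: an inclusive minimum threshold and per-cell statistics computed with `groupby` rather than a per-cell loop. This is not the code in this PR. The function name `filter_cells_sketch`, the column names (`cellBC`, `UMI`, `readCount`), and the default thresholds are assumptions chosen for illustration only.

```python
import pandas as pd


def filter_cells_sketch(molecule_table, min_umi_per_cell=2, min_avg_reads_per_umi=2.0):
    """Hypothetical stand-in for a cell filter; not cassiopeia's implementation."""
    # Compute per-cell statistics in one grouped pass instead of looping over cells.
    per_cell = molecule_table.groupby("cellBC").agg(
        n_umi=("UMI", "nunique"),
        n_reads=("readCount", "sum"),
    )
    avg_reads = per_cell["n_reads"] / per_cell["n_umi"]

    # Inclusive comparison: a cell sitting exactly at a minimum passes.
    # Dropping cells with `<=` would instead remove such a cell.
    passing = per_cell.index[
        (per_cell["n_umi"] >= min_umi_per_cell) & (avg_reads >= min_avg_reads_per_umi)
    ]
    return molecule_table[molecule_table["cellBC"].isin(passing)]


# Toy data (made up): cell A sits exactly at both thresholds.
molecule_table = pd.DataFrame(
    {
        "cellBC": ["A", "A", "B", "C", "C"],
        "UMI": ["u1", "u2", "u1", "u1", "u2"],
        "readCount": [3, 1, 5, 1, 1],
    }
)
filtered = filter_cells_sketch(molecule_table)
print(sorted(filtered["cellBC"].unique()))
```

In this toy table, cell A has 2 UMIs and an average of 2.0 reads per UMI, so it passes only because the comparison is inclusive. Cell B has too few UMIs and cell C has too few reads per UMI.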