biocore / songbird

Vanilla regression methods for microbiome differential abundance analysis
BSD 3-Clause "New" or "Revised" License

Off-by-one errors in filtering? #139

Closed fedarko closed 3 years ago

fedarko commented 3 years ago

It looks like the filtering code uses > when it should use >=:

https://github.com/biocore/songbird/blob/2727c04f1d9c7145a2a4a865cfce0e9904f0baa1/songbird/util.py#L154-L158

Because of this, features present in exactly 10 samples (or whatever min-feature-count is set to) and samples with exactly 1000 counts (or whatever min-sample-count is set to) get filtered out, even though the parameter descriptions present these as the minimum acceptable values:

https://github.com/biocore/songbird/blob/2727c04f1d9c7145a2a4a865cfce0e9904f0baa1/songbird/parameter_info.py#L30-L37
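
To make the boundary case concrete, here is a minimal sketch of the difference between the two comparisons. It is not the songbird implementation: the helper names (`passes`, `filter_ids`) and the toy numbers are made up for illustration, and the thresholds 1000 and 10 are just the example values mentioned above.

```python
# Toy data, invented for illustration only (not from any real dataset).
# Total counts per sample, and number of samples each feature appears in.
sample_totals = {"s1": 1000, "s2": 1105, "s3": 150}
feature_prevalence = {"f1": 10, "f2": 11, "f3": 2}


def passes(value, minimum, inclusive=True):
    """Hypothetical helper: does `value` meet the minimum?

    inclusive=True  uses >= (the minimum itself is acceptable)
    inclusive=False uses >  (the minimum itself is rejected)
    """
    return value >= minimum if inclusive else value > minimum


def filter_ids(values, minimum, inclusive=True):
    """Hypothetical helper: IDs whose value passes the threshold."""
    return [k for k, v in values.items() if passes(v, minimum, inclusive)]


# Strict comparison (>): entries sitting exactly on the threshold are dropped.
strict_samples = filter_ids(sample_totals, 1000, inclusive=False)
strict_features = filter_ids(feature_prevalence, 10, inclusive=False)

# Inclusive comparison (>=): entries exactly on the threshold are kept.
inclusive_samples = filter_ids(sample_totals, 1000, inclusive=True)
inclusive_features = filter_ids(feature_prevalence, 10, inclusive=True)

print("strict   :", strict_samples, strict_features)
print("inclusive:", inclusive_samples, inclusive_features)
```

With the strict comparison, sample `s1` (exactly 1000 counts) and feature `f1` (present in exactly 10 samples) are removed; with the inclusive comparison both are retained, which matches a "minimum acceptable value" reading of the parameters.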