AlexaBennett opened 1 year ago
I assume you mean something like this?
ct <- data.frame(
s1 = c(1,1,1),
s2 = c(100, 0, 100),
s3 = c(50, 0, 50),
row.names = letters[1:3]
)
FeatureTable$
new(t(ct))$
keep_samples(function(x) sum(x) > 10)$
data
a b c
s2 100 0 100
s3 50 0 50
If so, then I normally add one more function call, such as keep_features(function(x) sum(x) > 0), to remove any features with zero counts. There are also some "wordy" helpers, e.g., keep_features(that_are_present).
FeatureTable$
new(t(ct))$
keep_samples(function(x) sum(x) > 10)$
keep_features(that_are_present)$
data
a c
s2 100 100
s3 50 50
As to the reasoning behind it, I generally try to avoid magic or implicit behavior. In my opinion, having the keep_samples function also potentially drop features is surprising/unexpected. Obviously, that is a pretty subjective criterion, and I'm not saying that I'm 100% right about it, so I would be willing to listen to an argument in favor of implicitly dropping features with zero counts after filtering by samples.
I can understand and appreciate that mindset. After I realized what was happening, I added a second step to remove the offending features. Maybe a more conservative approach would be to throw a warning when one or more features now sum to zero?
Yeah, I think a warning message would be a good idea.
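Just to sketch what that warning could look like, here is an illustrative base-R helper (not part of the FeatureTable API; the function name warn_empty_features is hypothetical). It checks the count matrix after sample filtering and warns about any features whose column sums have dropped to zero:

```r
# Hypothetical helper: warn (but do not drop) when features sum to zero
# after sample filtering. `counts` is a samples-by-features matrix.
warn_empty_features <- function(counts) {
  empty <- colSums(counts) == 0
  if (any(empty)) {
    warning(sum(empty), " feature(s) now sum to zero: ",
            paste(colnames(counts)[empty], collapse = ", "))
  }
  invisible(counts)
}

# Using the filtered table from the example above (feature "b" is all zeros):
m <- matrix(c(100, 50, 0, 0, 100, 50), nrow = 2,
            dimnames = list(c("s2", "s3"), c("a", "b", "c")))
warn_empty_features(m)  # warns that feature "b" sums to zero
```

Returning the data unchanged and leaving removal to an explicit keep_features call keeps the non-magic behavior while still surfacing the problem.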
After filtering to remove my controls, several features contained all zeros. I believe the default behavior should be to remove all columns that sum to 0. If there is any reason to keep these features, maybe add a flag?
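A flag-based version could look like the following base-R sketch (again not the FeatureTable API; keep_samples_strict and drop_empty_features are hypothetical names). It filters samples by a predicate and, when the flag is set, also drops features that end up with zero counts:

```r
# Hypothetical sketch: sample filtering with an opt-out flag for
# dropping features that become all-zero. `counts` is a
# samples-by-features matrix; `predicate` takes one sample's counts.
keep_samples_strict <- function(counts, predicate,
                                drop_empty_features = TRUE) {
  kept <- counts[apply(counts, 1, predicate), , drop = FALSE]
  if (drop_empty_features) {
    kept <- kept[, colSums(kept) > 0, drop = FALSE]
  }
  kept
}

# With the example table from above, this drops sample s1 and the
# now-empty feature "b", leaving s2/s3 by a/c:
ct <- data.frame(
  s1 = c(1, 1, 1),
  s2 = c(100, 0, 100),
  s3 = c(50, 0, 50),
  row.names = letters[1:3]
)
keep_samples_strict(t(as.matrix(ct)), function(x) sum(x) > 10)
```

Defaulting the flag to TRUE matches the behavior requested here, while drop_empty_features = FALSE preserves the current explicit two-step workflow.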