CCBR / MOSuite

MultiOmicsSuite: R package for downstream multi-omics analysis
https://ccbr.github.io/reneeTools/

change count thresholds to be fractions instead of integers #63

Open kelly-sovacool opened 2 months ago

kelly-sovacool commented 2 months ago

Previously, ccbrpipeliner used fractions, which were more portable across groups of different sizes.

The current code comes straight from NIDAP: https://github.com/CCBR/reneeTools/blob/79e612e588083869d139dd497f480676dda280bc/R/filter.R#L247-L300

kelly-sovacool commented 2 months ago

@phoman14, do you have any thoughts on this?

phoman14 commented 2 months ago

Do you mean that if a data frame has 14 samples, then Minimum_Number_of_Samples_with_Nonzero_Counts_in_Total = 0.5 instead of Minimum_Number_of_Samples_with_Nonzero_Counts_in_Total = 7?

kelly-sovacool commented 2 months ago

Yes, exactly. This is @kopardev's suggestion.
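
To make the 14-sample example concrete, here is a minimal sketch of how a fractional threshold could be resolved into a per-dataset sample count. It is written in Python for illustration only and is not the package's R code. The function name `min_samples_from_fraction` is hypothetical, and rounding up is an assumption, since the thread does not specify a rounding rule.

```python
import math


def min_samples_from_fraction(fraction, n_samples):
    """Convert a fractional threshold into a minimum sample count.

    Hypothetical helper: rounding up is an assumption, not MOSuite behavior.
    """
    if not 0 <= fraction <= 1:
        raise ValueError("fraction must be between 0 and 1")
    # Round the product first to absorb floating-point noise
    # (0.1 * 30 evaluates to 3.0000000000000004), then round up so the
    # fraction acts as a true minimum.
    return math.ceil(round(fraction * n_samples, 9))


# The same fraction adapts to groups of different sizes.
print(min_samples_from_fraction(0.5, 14))  # 7
print(min_samples_from_fraction(0.5, 28))  # 14
```

This is what makes a fraction more portable than an integer: the setting 0.5 gives 7 for a 14-sample dataset and 14 for a 28-sample dataset without any change to the parameter.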

phoman14 commented 2 months ago

I think this is a fine way to do it. In my head, it is easier to specify the exact number, but if we make the input a fraction, we could always calculate the fraction upstream from the exact number. The other consideration is the input format. We should include an error check to make sure the input is in the correct format. I could see a user not understanding the format and entering any of the following: 0.5, 50%, or 50.
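
The error check and the upstream conversion described above could look roughly like the following. This is a sketch in Python for illustration, not the package's R code. The names `validate_fraction` and `fraction_from_count` are hypothetical, and rejecting `"50%"` and `50` outright (rather than trying to interpret them) is an assumption about the desired behavior.

```python
def validate_fraction(value):
    """Return value as a float if it is a fraction in [0, 1]; otherwise raise.

    Hypothetical helper, not part of MOSuite.
    """
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        raise TypeError(
            f"expected a number between 0 and 1, got {value!r}; "
            "write 50% as 0.5"
        )
    if not 0 <= value <= 1:
        raise ValueError(
            f"expected a fraction between 0 and 1, got {value!r}; "
            "write 50% as 0.5"
        )
    return float(value)


def fraction_from_count(count, n_samples):
    """Compute the fraction upstream from an exact sample count."""
    if n_samples <= 0:
        raise ValueError("n_samples must be positive")
    return validate_fraction(count / n_samples)


print(validate_fraction(0.5))      # 0.5
print(fraction_from_count(7, 14))  # 0.5

# The two other formats mentioned above are rejected with a hint.
for bad in ("50%", 50):
    try:
        validate_fraction(bad)
    except (TypeError, ValueError) as err:
        print(f"rejected {bad!r}: {err}")
```

One case a range check alone cannot catch is an input of 1, which could mean one sample or all samples. This sketch treats it as the fraction 1.0.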