benjjneb / decontam

Simple statistical identification and removal of contaminants in marker-gene and metagenomics sequencing data
https://benjjneb.github.io/decontam/

Many NA values appear among the calculated P-values and are classified as FALSE (non-contaminant features). #133

Open YMSen opened 1 year ago

YMSen commented 1 year ago

Hello! We are conducting a study on microorganisms. We have found a large number of NA values for the final P-values when using decontam to identify potential contamination features. Could this be due to the presence of too many zero values in the data table? We have employed a negative control setting to identify potential contamination. Looking forward to your reply! Thank you for your help.

benjjneb commented 1 year ago

> We have found a large number of NA values for the final P-values when using decontam to identify potential contamination features.

Any feature present in only 1 sample will get an NA score. My guess is that this explains your results.
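One way to check this guess is to count, for each feature, how many samples contain it and list the features seen in exactly one sample. The snippet below is a minimal sketch in plain Python, not decontam code: the `counts` table, the feature names, and the helper `single_sample_features` are hypothetical and only illustrate the prevalence check.

```python
def single_sample_features(counts):
    """Return the features that are present (count > 0) in exactly one sample.

    counts: dict mapping a feature name to its list of per-sample read counts.
    This mapping is a hypothetical stand-in for a feature table.
    """
    flagged = []
    for feature, row in counts.items():
        # Prevalence = number of samples in which the feature was observed.
        prevalence = sum(1 for c in row if c > 0)
        if prevalence == 1:
            flagged.append(feature)
    return flagged


# Hypothetical feature table: 4 features across 4 samples.
counts = {
    "ASV1": [120, 98, 0, 45],
    "ASV2": [0, 0, 7, 0],
    "ASV3": [0, 3, 0, 0],
    "ASV4": [10, 0, 2, 0],
}

flagged = single_sample_features(counts)
print(flagged)  # ['ASV2', 'ASV3']
```

If the number of features flagged this way roughly matches the number of NA P-values, sparse single-sample features are the likely explanation.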

busihan commented 1 year ago

Would you consider these features contaminants because they are present in only 1 sample?

benjjneb commented 1 year ago

> Would you consider these features contaminants because they are present in only 1 sample?

I would consider these to be features without strong evidence that they are either contaminants or non-contaminants.
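For downstream analysis, one option that follows from this reading is to keep NA-scored features in a separate "undetermined" group instead of folding them into the non-contaminant (FALSE) group. The snippet below is a minimal sketch in plain Python, not decontam's own logic: the function `label_feature`, the label strings, and the threshold value of 0.1 are assumptions made for illustration.

```python
import math


def label_feature(p_value, threshold=0.1):
    """Label a feature from its score, keeping missing scores separate.

    p_value: a float score, or None/NaN when no score could be computed.
    threshold: hypothetical cutoff below which a feature is called a contaminant.
    """
    if p_value is None or math.isnan(p_value):
        # No score means no evidence either way, so do not force a call.
        return "undetermined"
    return "contaminant" if p_value < threshold else "non-contaminant"


# Hypothetical scores for three features, one of them missing.
scores = {"ASV1": 0.01, "ASV2": float("nan"), "ASV3": 0.62}
labels = {feature: label_feature(p) for feature, p in scores.items()}
print(labels)
```

Whether undetermined features are then kept or dropped is a study-specific decision; the sketch only makes the third category explicit.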