zellerlab / siamcat

R package for Statistical Inference of Associations between Microbial Communities And host phenoType
https://siamcat.embl.de/

check.associations error from working with phyloseq file #37

Closed Hesham999666 closed 1 year ago

Hesham999666 commented 1 year ago

Thanks for making the SIAMCAT tools.

I am using a phyloseq object with SIAMCAT. It works until normalize.features.

Here are my phyloseq object, code, and error:

https://www.dropbox.com/s/kc904dm07xo4e62/dust.samples.rds?dl=0

dust.samples = aggregate_taxa(ps.nonc.nocyano, "Genus")

FSr = transform_sample_counts(dust.samples, function(x) x / sum(x))

label <- create.label(meta=sample_data(FSr),label = "Z_score.FVC_gp", case = "ubnormal")

siamcat <- siamcat(phyloseq=FSr, label=label)

sc.obj <- filter.features(siamcat,filter.method = 'abundance',cutoff = 0.001)

sc.obj <- check.associations(siamcat, log.n0=1e-05, feature.type = 'original', alpha = 0.05)

association.plot(sc.obj, sort.by = 'fc', panels = c('fc', 'prevalence')) # , 'auroc'

sc.obj <- normalize.features(sc.obj, norm.method = "log.unit", norm.param = list(log.n0 = 1e-06, n.p = 2,norm.margin = 1))

Error in normalize.features(sc.obj, norm.method = "log.unit", norm.param = list(log.n0 = 1e-06, :

Features have not yet been filtered

sc.obj <- create.data.split(sc.obj, num.folds = 5, num.resample = 2)

sc.obj <- train.model(sc.obj, method = "lasso")

jakob-wirbel commented 1 year ago

Hi @Hesham999666, thank you for using SIAMCAT! I hope it is useful for you. Thank you also for including your code: that makes it much easier for me to figure out what the problem is.

To your issue: When you execute a SIAMCAT function, it returns a new SIAMCAT object. In your example, you filter the features when calling:

sc.obj <- filter.features(siamcat,filter.method = 'abundance',cutoff = 0.001)

In this case, the filtering will be stored in sc.obj, not in the siamcat object. You can check which slots in your SIAMCAT object are filled by typing sc.obj in your console.

In the next line, you overwrite sc.obj with the output of the check.associations function. Because that call starts from the unfiltered siamcat object again, the feature filtering results are not included in the new sc.obj object.
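The same pattern can be reproduced in plain R without SIAMCAT. This is only a minimal sketch: toy.filter and toy.associations are hypothetical stand-ins (not SIAMCAT functions), and a list stands in for the SIAMCAT object. It shows that a function returns a modified copy and that its input stays unchanged, so each step has to receive the previous result.

```r
# Toy stand-ins (hypothetical, NOT SIAMCAT functions): each takes an object
# and returns a modified copy; the input object itself is left untouched.
toy.filter <- function(obj) {
  obj$filtered <- TRUE
  obj
}
toy.associations <- function(obj) {
  obj$associations <- TRUE
  obj
}

# A plain list stands in for the unfiltered object.
raw <- list(features = c(a = 0.2, b = 0.8))

# Pattern from the question: the second call starts from `raw` again,
# so the filtering stored in `res` by the first call is discarded.
res <- toy.filter(raw)
res <- toy.associations(raw)
print(names(res))    # no "filtered" entry here

# Chained pattern: each call receives the result of the previous one.
fixed <- toy.filter(raw)
fixed <- toy.associations(fixed)
print(names(fixed))  # both "filtered" and "associations" are present
```

Printing the names of the toy object plays the same role as typing sc.obj in the console: it shows which parts have been filled so far.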

This should fix your error:

sc.obj <- siamcat(phyloseq=FSr, label=label)
sc.obj <- filter.features(sc.obj, filter.method = 'abundance',cutoff = 0.001)
sc.obj <- check.associations(sc.obj, log.n0=1e-05, feature.type = 'original', alpha = 0.05) # This change here is important!
association.plot(sc.obj, sort.by = 'fc', panels = c('fc', 'prevalence'))
sc.obj <- normalize.features(sc.obj, norm.method = "log.unit",
    norm.param = list(log.n0 = 1e-06, n.p = 2,norm.margin = 1))

Let me know if this fixes your problem! Cheers, Jakob

Hesham999666 commented 1 year ago

Thanks Jakob, it solved my problem. Thanks also for making it work with phyloseq objects; a lot of beginners like me use phyloseq.
It worked fine all the way to the end (model.interpretation.plot).