cozygene / FEAST

Fast expectation maximization for microbial source tracking

Reproducibility of demo data #42

Open CamilaDuitama opened 2 years ago

CamilaDuitama commented 2 years ago

Hi!

I am interested in getting the overall contribution of each environment source to a sink. For this reason I tried out your demo data with the following R script, by running it 100 times and then parsing the results per sink.

#!/usr/bin/env Rscript
args = commandArgs(trailingOnly=TRUE)
#Set directory path
dir_path="./FEAST/ReproducibilityExperiment/"
setwd(dir_path)

library(FEAST)
metadata <- Load_metadata(metadata_path = "FEAST/Data_files/metadata_example_multi.txt")
otus <- Load_CountMatrix(CountMatrix_path = "FEAST/Data_files/otu_example_multi.txt")
FEAST_output <- FEAST(C = otus, metadata = metadata, different_sources_flag = 1, dir_path=dir_path,
                      outfile=paste0("demo",args[1]))

Knowing that FEAST is not a deterministic method, and that some variability in the results is expected, I wanted to ask why the results for certain samples vary so much (e.g. ERR525698_Env_1). I've attached the results for two of the iterations I calculated (demo90_source_contributions_matrix.txt and demo0_source_contributions_matrix.txt).

As you can see, I always used the same parameters and input data. However, the contribution of Env_2 to the sink mentioned above (ERR525698_Env_1) is 0.008 in one iteration and 0.798 in the other. The same happens with several other samples.
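
To put a number on this variability across iterations, the per-iteration contribution matrices could be stacked and summarised per sink and source. The following is only a minimal sketch in base R: `summarise_runs` is a hypothetical helper, not part of FEAST, and it assumes each iteration has already been read into a numeric matrix with sinks as rows and sources as columns. In the toy input, only the two Env_2 values (0.008 and 0.798) come from this thread; the "Env_3" column and its values are placeholders.

```r
# Hypothetical helper (not part of FEAST): summarise repeated runs.
# `runs` is a list of numeric matrices with identical dimensions and
# dimnames (rows = sinks, columns = sources), one matrix per iteration.
summarise_runs <- function(runs) {
  stopifnot(length(runs) >= 2)
  # Stack the matrices into a sinks x sources x iterations array.
  stacked <- simplify2array(runs)
  list(
    mean  = apply(stacked, c(1, 2), mean),
    sd    = apply(stacked, c(1, 2), sd),
    range = apply(stacked, c(1, 2), function(v) max(v) - min(v))
  )
}

# Toy input: only 0.008 and 0.798 are taken from the thread;
# the "Env_3" label and its values are placeholders.
sink_id <- "ERR525698_Env_1"
sources <- c("Env_2", "Env_3")
run_a <- matrix(c(0.008, 0.992), nrow = 1, dimnames = list(sink_id, sources))
run_b <- matrix(c(0.798, 0.202), nrow = 1, dimnames = list(sink_id, sources))

res <- summarise_runs(list(run_a, run_b))
print(res$range)
```

With real output, each `*_source_contributions_matrix.txt` file would first have to be read into a matrix (for example with `read.table`, using whatever separator and row-name settings match how the files were written). A large `range` or `sd` for a sink then flags the samples whose estimates are unstable between iterations.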

Is this expected behaviour?

FEAST version: FEAST_0.1.0

Thank you 👍🏾

Yixiangzhang1996 commented 2 years ago

Hi, something goes wrong with the demo data at the plotting step:

PlotSourceContribution(SinkNames = rownames(FEAST_output)[c(5:8)],SourceNames = colnames(FEAST_output), dir_path = "FEAST-FEAST_beta/Data_files/",mixing_proportions = FEAST_output, Plot_title = "TEST_",Same_sources_flag = 0, N = 4)

Error in mixing_proportions[which(rownames(mixing_proportions) %in% SinkNames), :

mattsnelson commented 1 year ago

I have just tried the demo dataset and am getting the same error at the plotting step as @Yixiangzhang1996. Are there any updates on this, or suggestions for other code or tutorial datasets to look at to get familiar with this tool? Thanks ;)


metadata <- Load_metadata(metadata_path = "Data_files/metadata_example_multi.txt")
otus <- Load_CountMatrix(CountMatrix_path = "Data_files/otu_example_multi.txt")

FEAST_output <- FEAST(C = otus, metadata = metadata, different_sources_flag = 1, dir_path = "output", outfile="demo_multi")

Graphical output from the output file:

PlotSourceContribution(SinkNames = rownames(FEAST_output)[c(5:8)], SourceNames = colnames(FEAST_output), dir_path = "output", mixing_proportions = FEAST_output, Plot_title = "Test",Same_sources_flag = 0, N = 4)



The plotting step produces this error:
`Error in mixing_proportions[which(rownames(mixing_proportions) %in% SinkNames),  : 
  incorrect number of dimensions`
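
For context, base R raises "incorrect number of dimensions" whenever two-index subsetting (`x[i, ]`) is applied to an object that has no `dim` attribute, such as a plain vector or a list. The sketch below only reproduces that general behaviour with toy objects. Whether `FEAST_output` is such an object in the version used here is an assumption that would need checking, for example with `class(FEAST_output)` and `dim(FEAST_output)`. `error_message` is a hypothetical helper written for this illustration.

```r
# Hypothetical helper: return the error message raised while evaluating
# `expr`, or NA if evaluation succeeds.
error_message <- function(expr) {
  tryCatch({
    expr
    NA_character_
  }, error = function(e) conditionMessage(e))
}

# Toy objects (not FEAST output).
m <- matrix(1:4, nrow = 2, dimnames = list(c("s1", "s2"), c("a", "b")))
v <- c(s1 = 1, s2 = 2)      # plain vector: no dim attribute
l <- list(s1 = 1, s2 = 2)   # list: no dim attribute either

msg_matrix <- error_message(m["s1", ])  # fine: a matrix has two dimensions
msg_vector <- error_message(v[1, ])     # error: a vector has no dimensions
msg_list   <- error_message(l[1, ])     # error: same for a list

print(c(matrix = msg_matrix, vector = msg_vector, list = msg_list))
```

If the object passed as `mixing_proportions` turned out not to be a matrix or data frame, row subsetting of this kind would fail in the same way.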