cozygene / FEAST

Fast expectation maximization for microbial source tracking

Unable to replicate demo data #14

Open luke0321li opened 4 years ago

luke0321li commented 4 years ago

I am using metadata_example.txt (by the way, it is missing an id column) and otu_example.txt as inputs to FEAST. The output is clearly different from what's shown in README.md:

Estimated source proportions for sink "ERR525698_infant gut 1":

Source                      Proportion
ERR525693_infant gut 2      3.29312540069756e-05
ERR525688_Adult gut 1       3.01775113465031e-15
ERR525699_Adult gut 2       0.00663104602265952
ERR525910_Adult gut 3       0.0428963688868257
ERR525909_Adult skin 1      0.000792554342258054
ERR525908_Adult skin 2      1.62734493021106e-22
ERR525911_Adult skin 3      1.3995606837272e-05
ERR525954_Soil 1            0.00566538184886059
ERR525949_Soil 2            0.0739805624462309
Unknown                     0.869987159592318

In my case, FEAST seems to be prioritizing the unknown source. Similarly, when I run FEAST on metadata_example_multi.txt and otu_example_multi.txt, the unknown source always gets an estimated proportion greater than 70%.

NeginValizadegan commented 3 years ago

Hi @luke0321li, every run will give you a different output because the program works by subsampling and iteration. The results should be similar but not exactly the same. If you want to get exactly the same output, graphs, etc. each time (for example, for presentation or publication purposes), I recommend setting a seed in R.

Add a line such as set.seed(12) in R before running FEAST. Every time you use the same number inside set.seed(), you should get the same result.
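As a minimal sketch of what the seed does, the snippet below uses a toy subsampling step in base R rather than FEAST itself. The function subsample_proportions and the OTU counts are made up for illustration and are not part of the FEAST package; the point is only that reseeding with the same number before a random step reproduces its output.

```r
# Toy stand-in for a stochastic step (NOT FEAST itself): subsample a
# vector of OTU counts down to a fixed depth and return proportions.
# The function name and the counts below are hypothetical.
subsample_proportions <- function(counts, depth) {
  # Draw `depth` reads at random, weighted by the observed counts
  drawn <- rmultinom(1, size = depth, prob = counts / sum(counts))
  drawn[, 1] / depth
}

otu_counts <- c(otu_a = 120, otu_b = 30, otu_c = 50)  # made-up counts

set.seed(12)  # fix the random number generator state
run1 <- subsample_proportions(otu_counts, depth = 100)

set.seed(12)  # same seed again, so the same random draws follow
run2 <- subsample_proportions(otu_counts, depth = 100)

# No reseed here: this run continues from the current generator state
# and will generally differ from run1 and run2.
run3 <- subsample_proportions(otu_counts, depth = 100)

print(identical(run1, run2))  # TRUE
print(rbind(run1, run2, run3))
```

In a real analysis, the set.seed() call would go immediately before the FEAST call in the same R session. Note that a seed only makes a run repeatable; it does not change how much of the sink is attributed to the unknown source.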