Closed by ezherman 1 month ago
Hey @cheny19, I'd appreciate it if you could take a look at this issue.
Hi NanoSim team, would it be possible to receive an update on this? I have observed deviations from the expected abundances in subsequent simulations too (using different community compositions).
As a workaround, I can calculate the returned abundance from the organism names in the FASTQ headers. However, it would be helpful to understand what might be driving these deviations.
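The workaround described above could look roughly like the sketch below. It is not part of NanoSim. It assumes that each read ID begins with the organism name followed by a separator character (`-` here, which is a guess and should be checked against the actual headers), and that the FASTQ file uses plain four-line records. The function names and the example reads are hypothetical.

```python
import io
from collections import Counter


def fastq_records(handle):
    """Yield (header, sequence) pairs from a four-line-per-record FASTQ stream."""
    while True:
        header = handle.readline()
        if not header.strip():  # end of file or trailing blank line
            return
        sequence = handle.readline().strip()
        handle.readline()  # '+' separator line, ignored
        handle.readline()  # quality line, ignored
        yield header.strip(), sequence


def organism_from_header(header, sep="-"):
    """Extract the organism label from a header line.

    ASSUMPTION: the read ID starts with the organism name followed by `sep`.
    Adjust `sep` to match the headers in your own output.
    """
    read_id = header[1:].split()[0]  # drop the leading '@' and any description
    return read_id.split(sep, 1)[0]


def abundance_from_fastq(handle, sep="-"):
    """Return {organism: (percent_of_reads, percent_of_bases)}."""
    reads, bases = Counter(), Counter()
    for header, sequence in fastq_records(handle):
        organism = organism_from_header(header, sep)
        reads[organism] += 1
        bases[organism] += len(sequence)
    total_reads = sum(reads.values())
    total_bases = sum(bases.values())
    return {
        org: (100 * reads[org] / total_reads, 100 * bases[org] / total_bases)
        for org in reads
    }


# Made-up reads, only to show the call; replace with open("simulated.fastq").
example = io.StringIO(
    "@Staph-chr1_10_aligned_0_F_0_4_0\nACGT\n+\nIIII\n"
    "@Crypto-chr2_5_aligned_1_F_0_8_0\nACGTACGT\n+\nIIIIIIII\n"
    "@Staph-chr1_20_aligned_2_R_0_4_0\nACGT\n+\nIIII\n"
)
for organism, (pct_reads, pct_bases) in abundance_from_fastq(example).items():
    print(f"{organism}\t{pct_reads:.1f}% of reads\t{pct_bases:.1f}% of bases")
```

Both read-based and base-based percentages are reported because the two can differ noticeably when read lengths vary between organisms, so it is worth checking which one the requested abundance is meant to match.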
Hi @ezherman,
Thanks for following up. I believe I have identified where in the code this error occurs. I'm working on a fix and will update you when it is merged to the master branch! I'm delaying our next release until we can integrate this, and I hope to get that out next week.
That's great to hear, thank you for working on this @lcoombe!
Quick update: I merged my fix to the master branch! There's a more detailed explanation of what I found and how I fixed it in the PR here: #232. Hopefully it will fix the issue for you too. On my end, the resulting abundances were much closer to the expected values. Of course, there is still variation in the simulated abundances, so they won't be exactly what was requested. I will update again when I do the next release!
This fix was included in release v3.2.2!
Thank you @lcoombe! I'll give the latest version a try as soon as I can, hopefully later this month.
Hi,
We've been trying to simulate Zymo mock communities using your "even" pre-trained model. We've consistently found Staphylococcus aureus to be undersimulated, while Cryptococcus neoformans appears to be oversimulated. The former should have a relative abundance of approx. 12, while the latter should have a relative abundance of approx. 2. I've included instructions below to reproduce the problem. Could you please advise as to what we may be doing wrong? Thanks in advance!
Instructions
Clone the NanoSim repository onto your machine:
Download the Zymo reference genomes using this link.
Unzip the ZymoBIOMICS.STD.refseq.v2 directory into a new ref_metagenome directory.
Unzip and untar the metagenome_ERR3152364_Even.tar.gz directory:
In the sample_config_file/metagenome_list_for_training and sample_config_file/metagenome_list_for_simulation files, correct the reference genome directory of Cryptococcus neoformans to ref_metagenome/ZymoBIOMICS.STD.refseq.v2/Genomes/Cryptococcus_neoformans_draft_genome.fasta.
In the same files, correct the reference genome directory of Saccharomyces cerevisiae to ref_metagenome/ZymoBIOMICS.STD.refseq.v2/Genomes/Saccharomyces_cerevisiae_draft_genome.fasta.
Create an environment:
After activating the environment, simulate reads:
Which for example can show: