Closed Adoni5 closed 3 years ago
Dear Rory,
thanks for your interest in CAMISIM; I hope I will be able to help you. I am not entirely sure what you mean by "mixed" or "varied" abundance - just that not all genomes have the same abundance, or do you have a specific distribution in mind? Do you have target genomes available? If so, you will need to create the id_to_genome and metadata files for these genomes (explained here: https://github.com/CAMI-challenge/CAMISIM/wiki/File-Formats), plug them into the config file (together with the read simulator you want to use and how much data you want to produce), and you are ready to go using the metagenomesimulation.py script. The output read files will always be FASTQ files. Please let me know which step is unclear and we will find a solution.
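For reference, the two files might look something like this (a minimal sketch; the genome IDs, OTU numbers, NCBI taxon IDs and FASTA paths are made-up placeholders - the authoritative column description is on the wiki page linked above):

```shell
# Made-up example of the two tab-separated files CAMISIM expects.
# metadata.tsv: genome_ID, OTU, NCBI_ID, novelty_category (per the wiki).
printf 'genome_ID\tOTU\tNCBI_ID\tnovelty_category\n' >  metadata.tsv
printf 'Genome1\t1\t562\tknown_strain\n'             >> metadata.tsv
printf 'Genome2\t2\t1280\tknown_strain\n'            >> metadata.tsv

# genome_to_id.tsv: each genome_ID mapped to the path of its FASTA file.
printf 'Genome1\t/data/genomes/genome1.fasta\n'  >  genome_to_id.tsv
printf 'Genome2\t/data/genomes/genome2.fasta\n'  >> genome_to_id.tsv
```

Both files are then referenced from the community section of the config file.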
Hi @AlphaSquad ,
Thanks for your quick response! I do have a specific distribution in mind, as I need to know the abundance of each genome in the produced FASTQ, and I do have target genomes available. I can see from the example file formats and the wiki which files I'm supposed to make, which is great.
My confusion is how to pass the metadata file, the genome files and the config file into the docker command.
@AlphaSquad Hey, sorry to chase you but it would be pretty handy if I could get this working soon!
Ah I see where the problem is now.
After you have built the Docker container, you just append the command you want to the docker run invocation, i.e. if your image is called camisim, then
docker run "camisim" metagenome_from_profile.py -p mini.biom
should run a small CAMISIM test.
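To get your own files into the container, the missing piece is a Docker bind mount (-v). A sketch of what the call could look like, with placeholder host paths and assuming the image is named camisim (the command is only printed here, not executed):

```shell
# Placeholder paths: /host/path/camisim_data would hold config.ini,
# metadata.tsv, genome_to_id.tsv and the genome FASTAs. The -v flag makes
# that directory visible inside the container as /data, so the config and
# the command line can refer to /data/... paths.
RUN_CMD='docker run -v /host/path/camisim_data:/data camisim metagenomesimulation.py /data/config.ini'
echo "$RUN_CMD"
```

Output written by CAMISIM below the mounted directory then survives on the host after the container exits.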
 2271000 generating entries...
Uploading to /root/.etetoolkit/taxa.sqlite
Inserting synonyms: 190000
Traceback (most recent call last):
  File "metagenome_from_profile.py", line 11, in <module>
    import scripts.get_genomes as GG
  File "/usr/local/bin/scripts/get_genomes.py", line 15, in <module>
    ncbi = NCBITaxa()
  File "/usr/local/lib/python2.7/dist-packages/ete2/ncbi_taxonomy/ncbiquery.py", line 74, in __init__
    self.update_taxonomy_database()
  File "/usr/local/lib/python2.7/dist-packages/ete2/ncbi_taxonomy/ncbiquery.py", line 101, in update_taxonomy_database
    update_db(self.dbfile)
  File "/usr/local/lib/python2.7/dist-packages/ete2/ncbi_taxonomy/ncbiquery.py", line 659, in update_db
    upload_data(dbfile)
  File "/usr/local/lib/python2.7/dist-packages/ete2/ncbi_taxonomy/ncbiquery.py", line 698, in upload_data
    db.execute("INSERT INTO synonym (taxid, spname) VALUES (?, ?);", (taxid, spname))
sqlite3.IntegrityError: UNIQUE constraint failed: synonym.spname, synonym.taxid
Running the above test gives me this error - is the version of SQLite pinned?
Uh oh, I just tried it with the Docker image and got the same error. Unfortunately I am an expert on neither Docker nor SQL. I will try to find out what went wrong and fix this ASAP. If you already have an idea what the problem could be, I am happy to hear any suggestion. Sorry for the inconvenience.
I know this is quite old, but the error you report comes from the ete package in Python and is not connected to Docker or SQL. I encountered this error while updating some scripts to be compatible with Python 3 and have added a fix described in the ete repository. This could possibly also solve your problem - please give it a try on the python3 branch!
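For anyone landing here later, trying that branch would look roughly like the following (repo URL taken from the wiki link above, branch name as stated; the commands are only printed here, since cloning and building require network access and Docker):

```shell
# Printed rather than executed: rebuild the image from the python3 branch.
STEPS='git clone https://github.com/CAMI-challenge/CAMISIM.git
cd CAMISIM
git checkout python3
docker build -t camisim .'
printf '%s\n' "$STEPS"
```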
Hi,
I pulled the latest Docker image, but am struggling with the documentation. What I am trying to do is create a metagenomics sample from multiple genomes with mixed abundances. The genomes themselves aren't as important as the varied abundances, the fact that the data is simulated from real genomes, and that I receive FASTA/FASTQ files at the end.
The docker image is working correctly but I'm not sure how to proceed.
Any help would be appreciated,
Rory