HadrienG / InSilicoSeq

:rocket: A sequencing simulator
https://insilicoseq.readthedocs.io
MIT License

Abundance file shows more genomes than are used #262

Open Naturalist1986 opened 1 month ago

Naturalist1986 commented 1 month ago

Hi,

I'm using this command to generate metagenomes from a set of genome fasta files:

iss generate --draft (list of *.fna files) --output --model nextseq --coverage uniform -p 128 -n 50M

Although the abundance file lists, for example, 10 genomes, when I count the reads in the FASTQ files I only see reads from 5-6 of them.

HadrienG commented 2 weeks ago

Hi!

Are all your contig names unique across draft genomes? InSilicoSeq currently uses the contig names as read headers.
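
A quick way to test this is to collect every record identifier from the draft FASTA files and report any name that occurs more than once. The sketch below is only an illustration and is not part of InSilicoSeq: the function names `contig_names` and `find_shared_contig_names` are hypothetical, and it assumes plain-text (uncompressed) FASTA files whose record identifier is the first whitespace-delimited token after `>`.

```python
from collections import defaultdict


def contig_names(fasta_path):
    """Yield the identifier (first token after '>') of every FASTA record."""
    with open(fasta_path) as handle:
        for line in handle:
            if line.startswith(">"):
                fields = line[1:].split()
                if fields:
                    yield fields[0]


def find_shared_contig_names(fasta_paths):
    """Map each contig name seen more than once to the files it occurs in.

    A name repeated inside a single file is reported too; that file then
    appears several times in the list.
    """
    seen = defaultdict(list)
    for path in fasta_paths:
        for name in contig_names(path):
            seen[name].append(str(path))
    return {name: files for name, files in seen.items() if len(files) > 1}
```

Calling `find_shared_contig_names` with the same list of `*.fna` files that was passed to `--draft` returns an empty dictionary when all contig names are unique.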

Naturalist1986 commented 2 weeks ago

Hi,

Yes, these are downloaded from NCBI RefSeq. Some of the contigs are missing from the generated FASTQ files, even though they are listed in the abundance file.
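
For reference, the comparison described above can be sketched roughly as follows. This is a hedged illustration, not InSilicoSeq code, and it rests on two assumptions that should be checked against the actual output: that read headers have the form `@<contig name>_<read index>_<worker index>/<1 or 2>`, and that the abundance file is tab-separated with the sequence identifier in the first column. All function names are hypothetical, and the FASTQ file is assumed to be uncompressed with four lines per record.

```python
from collections import Counter


def contig_from_header(header):
    """Recover the contig name from a read header.

    ASSUMPTION: headers look like '@<contig>_<read index>_<worker index>/<1|2>'.
    """
    name = header[1:].split()[0]      # drop '@' and any trailing description
    name = name.rsplit("/", 1)[0]     # drop the '/1' or '/2' mate suffix
    return name.rsplit("_", 2)[0]     # drop the two trailing numeric fields


def count_reads_per_contig(fastq_path):
    """Count reads per contig in an uncompressed 4-line-per-record FASTQ."""
    counts = Counter()
    with open(fastq_path) as handle:
        for i, line in enumerate(handle):
            if i % 4 == 0:            # header line of each record
                counts[contig_from_header(line.strip())] += 1
    return counts


def ids_in_abundance_file(path):
    """Return the identifiers in the first tab-separated column (assumed layout)."""
    ids = set()
    with open(path) as handle:
        for line in handle:
            first = line.rstrip("\n").split("\t")[0]
            if first:
                ids.add(first)
    return ids


def missing_ids(abundance_path, fastq_path):
    """Identifiers listed in the abundance file that have no reads in the FASTQ."""
    observed = set(count_reads_per_contig(fastq_path))
    return sorted(ids_in_abundance_file(abundance_path) - observed)
```

Running `missing_ids` on the abundance file and the R1 FASTQ file would list the identifiers that never appear in a read header, which makes it easier to report exactly which contigs are affected.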