HadrienG / InSilicoSeq

:rocket: A sequencing simulator
https://insilicoseq.readthedocs.io
MIT License

new feature for handling draft genomes? #71

Closed: cheberling closed this issue 5 years ago

cheberling commented 6 years ago

Currently, InSilicoSeq treats each contig (FASTA record) of a draft genome assembly as a separate organism. Most bacterial genome assemblies contain more than one contig. As far as I can tell, to use InSilicoSeq today I would have to concatenate all contigs of each genome into one FASTA record per genome, and then put all of those records into a single FASTA file as a 'metagenome'. This requires extra preprocessing on the user's part, and a new 'metagenome' file has to be constructed for each run of the software. For very similar queries (metagenomes containing many of the same genomes), this takes a lot of extra storage space for redundant information.
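For illustration, the concatenation workaround described above could be sketched roughly as follows. This is a minimal standard-library example and not part of InSilicoSeq; the helper names `read_fasta` and `merge_contigs`, the `genome_id` label, and the 60-character line width are assumptions made for the sketch.

```python
def read_fasta(lines):
    """Yield (header, sequence) tuples from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def merge_contigs(lines, genome_id, width=60):
    """Join all contigs of one draft assembly into a single FASTA record.

    genome_id is a hypothetical label used as the new record header.
    """
    sequence = "".join(seq for _, seq in read_fasta(lines))
    wrapped = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return "\n".join([f">{genome_id}"] + wrapped) + "\n"


# Example: two contigs collapse into one record named "genome1".
draft = [">contig_1", "ACGT", "ACGT", ">contig_2", "TTGA"]
print(merge_contigs(draft, "genome1"), end="")
```

Note that joining contigs this way creates artificial junctions, so some simulated reads would span two contigs that are not actually adjacent in the genome.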

Might it be possible to allow a separate command line argument for each genome, so that the original draft genome assemblies can be supplied to InSilicoSeq without having to construct new files? The --genomes option might look like this:

--genomes genome1.fasta [genome2.fasta] [genome3.fasta] ... where the first genome is required and the rest are optional.

HadrienG commented 6 years ago

Hi!

Thanks for filing this issue! It is a good idea, although I think by default --genomes should accept an arbitrary number of genome files while treating each of them as a multi-FASTA. iss could have the behavior you are requesting with a --draft flag replacing the --genomes flag (or complementing it if you have a mix of draft and complete genomes).
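As a rough sketch of the interface discussed here (not InSilicoSeq's actual implementation), both flags could accept one or more files using argparse's `nargs="+"`. The program name `iss-sketch`, the `plan_inputs` helper, and the description strings are hypothetical and exist only for this illustration.

```python
import argparse


def build_parser():
    """Build a hypothetical parser with the two flags from this thread."""
    parser = argparse.ArgumentParser(prog="iss-sketch")
    # nargs="+" means: at least one file is required when the flag is given.
    parser.add_argument("--genomes", nargs="+", default=[], metavar="FASTA",
                        help="complete genomes; each record is one organism")
    parser.add_argument("--draft", nargs="+", default=[], metavar="FASTA",
                        help="draft assemblies; each file is one organism")
    return parser


def plan_inputs(argv):
    """Return (path, interpretation) pairs for the given arguments."""
    parser = build_parser()
    args = parser.parse_args(argv)
    if not args.genomes and not args.draft:
        parser.error("provide --genomes and/or --draft")
    plan = [(path, "one organism per record") for path in args.genomes]
    plan += [(path, "one organism per file") for path in args.draft]
    return plan


# Example: a mix of one complete genome file and two draft assemblies.
for path, how in plan_inputs(["--genomes", "complete.fasta",
                              "--draft", "draft1.fasta", "draft2.fasta"]):
    print(path, "->", how)
```

Keeping the two flags separate lets a single run mix complete and draft genomes without rewriting any input files.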

cheberling commented 6 years ago

Yes, I like your idea too! How difficult would it be to implement something like this?

HadrienG commented 6 years ago

It will require significant changes in how InSilicoSeq handles input genomes, but it is doable. I've started working on it in #72.

HadrienG commented 5 years ago

new --draft option in version 1.3.0 🎉

cheberling commented 5 years ago

Great, thanks! I look forward to trying it out once I get a chance.