Simulating targeted amplicon sequencing

HadrienG / InSilicoSeq

:rocket: A sequencing simulator

https://insilicoseq.readthedocs.io

MIT License


Simulating targeted amplicon sequencing #95

Closed standage closed 5 months ago

standage commented 5 years ago

Greetings! I have a FASTA file containing a few dozen amplicon sequences, and I would like to use InSilicoSeq to simulate targeted sequencing of these amplicons. I originally used --mode kde --model MiSeq, but this resulted in a read length error. I'm currently trying out other configuration options, but in the meantime, do you have any recommendations for simulating amplicon sequencing?

standage commented 5 years ago

In particular, I don't see any way to adjust the lengths of the reads generated.
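
To narrow down which amplicons might be involved, one option is to compare each sequence length against the read length the model produces. The following is only a minimal sketch, assuming the error arises because some amplicons are shorter than the model's fixed read length (the thread does not confirm the cause). The function names, the example records, and the read length of 300 are all hypothetical and are not part of InSilicoSeq.

```python
from typing import Dict, Iterable, List, Tuple


def read_fasta_lengths(lines: Iterable[str]) -> Dict[str, int]:
    """Return {record_id: sequence_length} for FASTA-formatted lines."""
    lengths: Dict[str, int] = {}
    current = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # The record ID is the first whitespace-delimited token of the header.
            tokens = line[1:].split()
            current = tokens[0] if tokens else ""
            lengths[current] = 0
        elif current is not None:
            # Sequence lines may be wrapped, so accumulate across lines.
            lengths[current] += len(line)
    return lengths


def shorter_than(lengths: Dict[str, int], read_length: int) -> List[Tuple[str, int]]:
    """List (record_id, length) pairs for sequences shorter than read_length."""
    return sorted((name, n) for name, n in lengths.items() if n < read_length)


# Hypothetical amplicons: amp1 is 400 bp, amp2 is 240 bp (wrapped over two lines).
example = [">amp1", "ACGT" * 100, ">amp2 some description", "ACGT" * 50, "ACGT" * 10]
example_lengths = read_fasta_lengths(example)

# ASSUMPTION: 300 is a placeholder read length, not a value taken from the thread.
flagged = shorter_than(example_lengths, read_length=300)
print(flagged)
```

For a real file, the same functions can be applied to an open file handle, for example `read_fasta_lengths(open("amplicons.fasta"))`, where the filename is a placeholder.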

HadrienG commented 5 years ago

Hi!

InSilicoSeq is aimed at simulating reads from whole metagenomes. For amplicons you could try biogrinder.

While its quality scores are less realistic, grinder has some nice amplicon features, such as adding chimeras.

Concerning the read length, it would be easy to add an argument, but I'd prefer to have an error profile for MiSeq 250. Do you have a good public dataset in mind? I'm happy to build the profile and add it to the defaults.

standage commented 5 years ago

It makes sense that we would want to train an error model using reads of the desired length. I'll let you know if I find a relevant data set.

HadrienG commented 5 months ago

Implemented in 2.0.0.