rki-mf1 / amplisim

Plain simple amplicon sequence simulator for in-silico genomic sequencing assays
Apache License 2.0

Implement CLI parameters for more precise variant likelihoods of MM/INS/DEL #11

Open Krannich479 opened 1 year ago

Krannich479 commented 1 year ago

Currently, a random variable decides which modification is introduced at a base position, with hardcoded likelihoods of MM=0.8, INS=0.1 and DEL=0.1 (mismatch, insertion, deletion). These likelihoods should be configurable via CLI parameters.
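To make the idea concrete, here is a minimal Python sketch of what user-controlled weights could look like. It is not amplisim's code or its actual CLI: the flag names `--mm`, `--ins` and `--del` and the helpers `parse_weights` and `pick_variant` are hypothetical, and only the defaults (0.8/0.1/0.1) come from the description above. The weights are normalized so that users can pass any non-negative ratio.

```python
import argparse
import random


def parse_weights(argv=None):
    """Parse hypothetical per-variant weights and normalize them to sum to 1."""
    parser = argparse.ArgumentParser(description="Variant-type weights (illustrative only)")
    # Flag names are assumptions; defaults mirror the hardcoded values in this issue.
    parser.add_argument("--mm", type=float, default=0.8, help="weight of mismatches")
    parser.add_argument("--ins", type=float, default=0.1, help="weight of insertions")
    # "del" is a Python keyword, so store the value under a different attribute name.
    parser.add_argument("--del", dest="del_", type=float, default=0.1, help="weight of deletions")
    args = parser.parse_args(argv)

    weights = {"MM": args.mm, "INS": args.ins, "DEL": args.del_}
    if any(w < 0 for w in weights.values()):
        parser.error("weights must be non-negative")
    total = sum(weights.values())
    if total <= 0:
        parser.error("at least one weight must be positive")
    return {kind: w / total for kind, w in weights.items()}


def pick_variant(weights, rng):
    """Draw one variant type according to the normalized weights."""
    kinds = list(weights)
    return rng.choices(kinds, weights=[weights[k] for k in kinds], k=1)[0]


if __name__ == "__main__":
    w = parse_weights()
    print(w, pick_variant(w, random.Random()))
```

Normalizing inside the parser means `--mm 2 --ins 1 --del 1` and `--mm 0.5 --ins 0.25 --del 0.25` behave identically, which avoids rejecting inputs that do not sum to exactly 1.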

Krannich479 commented 12 months ago

At the same time, introduce longer runs of INDELs, e.g. using a negative exponential length distribution.
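One possible way to sample such run lengths, sketched in Python under stated assumptions: flooring a draw from an exponential distribution and adding 1 gives lengths 1, 2, 3, ... whose probabilities decay geometrically, which is the discrete counterpart of a negative exponential. The function name `sample_indel_length`, the `rate` parameter and the `max_len` cap are hypothetical and not taken from amplisim.

```python
import random


def sample_indel_length(rng, rate=1.0, max_len=50):
    """Sample an INDEL run length >= 1 with exponentially decaying probability.

    rate    -- decay rate of the exponential (assumed parameter; larger = shorter runs)
    max_len -- hard cap on the run length (assumed safeguard)
    """
    if rate <= 0:
        raise ValueError("rate must be positive")
    # expovariate(rate) draws from an exponential distribution with mean 1/rate;
    # flooring the draw and adding 1 yields a geometric distribution on 1, 2, 3, ...
    length = 1 + int(rng.expovariate(rate))
    return min(length, max_len)


if __name__ == "__main__":
    rng = random.Random()
    print([sample_indel_length(rng, rate=0.7) for _ in range(10)])
```

A smaller `rate` produces longer runs on average, so a single CLI parameter could control how heavy the tail of the length distribution is.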

Krannich479 commented 11 months ago

Reminder that this should be improved. Currently, the likelihood that an MM/INS/DEL is introduced is coupled to an initial overall likelihood of introducing an error at all. This approach is impractical, slow (it draws unnecessary random floats) and gives a false intuition of the individual mutation likelihoods. The overall likelihood of introducing a mutation should be removed. Also, the option to learn error likelihoods from real data was suggested (thanks to @matthuska).
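A rough Python sketch of the decoupled approach, assuming each variant type gets its own per-base rate: one random float per position is compared against the cumulative rates, so there is no separate overall error likelihood and no second draw. The helper `estimate_rates` illustrates one naive way rates might be learned from real data (event counts divided by aligned bases); its input format, like all names here, is an assumption and not part of amplisim.

```python
import random


def estimate_rates(counts, aligned_bases):
    """Estimate per-base rates from observed event counts (hypothetical input format)."""
    if aligned_bases <= 0:
        raise ValueError("aligned_bases must be positive")
    return {kind: n / aligned_bases for kind, n in counts.items()}


def draw_event(rates, rng):
    """Return 'MM', 'INS', 'DEL' or None for one base, using a single random float."""
    u = rng.random()  # uniform in [0, 1)
    cumulative = 0.0
    for kind in ("MM", "INS", "DEL"):
        cumulative += rates.get(kind, 0.0)
        if u < cumulative:
            return kind
    return None  # no mutation at this position


if __name__ == "__main__":
    # Made-up counts, purely for illustration.
    rates = estimate_rates({"MM": 80, "INS": 10, "DEL": 10}, aligned_bases=10_000)
    rng = random.Random()
    print(rates, [draw_event(rates, rng) for _ in range(20)])
```

With this layout each rate is directly the per-base probability of that mutation type, so the numbers a user passes are the numbers they get.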