Open tanger-code opened 6 months ago
Executing the command below will generate sam and maf files.
pbsim --strategy wgs \
      --method qshmm \
      --qshmm data/QSHMM-RSII.model \
      --depth 20 \
      --genome sample/sample.fasta \
      --pass-num 10
Please check your command again, and look at the output files once it has finished.
PBSIM3 generates sam and maf files for multi-pass sequencing data, so that maf can be used as a truth set for the multi-pass data. However, because HiFi reads are generated afterwards by ccs, that maf cannot be used as a truth set for the HiFi reads.
OK, thank you. Is there also a command to simulate reads with no errors and no variants? Sometimes I just want to get some reads from a fasta genome.
PBSIM3 cannot generate error-free reads.
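For anyone who only needs exact substrings of a reference, a few lines of scripting outside PBSIM3 can cover that case. The following is a minimal sketch, not a PBSIM3 feature: `read_fasta` and `sample_exact_reads` are hypothetical helper names, reads are drawn uniformly from the forward strand only, and every read has the same fixed length.

```python
import random


def read_fasta(path):
    """Return {record_name: sequence} for every record in a FASTA file."""
    seqs, name, chunks = {}, None, []
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                if name is not None:
                    seqs[name] = "".join(chunks)
                # Use the first word of the header as the record name.
                name = (line[1:].split() or ["unnamed"])[0]
                chunks = []
            else:
                chunks.append(line)
    if name is not None:
        seqs[name] = "".join(chunks)
    return seqs


def sample_exact_reads(seq, n_reads, read_len, seed=0):
    """Draw n_reads error-free (start, read) pairs of length read_len from seq."""
    if read_len <= 0 or read_len > len(seq):
        raise ValueError("read_len must be between 1 and len(seq)")
    rng = random.Random(seed)  # fixed seed keeps the sampling reproducible
    reads = []
    for _ in range(n_reads):
        start = rng.randint(0, len(seq) - read_len)  # inclusive on both ends
        reads.append((start, seq[start:start + read_len]))
    return reads


# Toy usage on an in-memory sequence; for a real genome, load it with read_fasta().
toy_genome = "ACGTTGCA" * 25
for start, read in sample_exact_reads(toy_genome, n_reads=3, read_len=20, seed=1):
    print(f">read_start_{start}")
    print(read)
```

Because each read is returned with its start coordinate, the true origin of every read is known without any alignment file.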
When I use the command pbsim --strategy wgs --method qshmm --qshmm model/QSHMM-RSII.model --depth 5 --genome GCA_chr21.fa --length-mean 17000 --length-sd 3000 --length-min 14000 --length-max 20000 --accuracy-mean 0.99 --accuracy-min 0.95 --accuracy-max 1.0
to simulate some reads, the output information is:
What do the insertion rate and deletion rate mean? Do they describe the sequencing error rate, or newly added variants?
Substitution rate, insertion rate, and deletion rate are the proportions of each type of sequencing error in the simulated reads. They are error rates, not variants. For example, if 3 insertions occur when sequencing a 1000 bp template, the insertion rate will be 0.003.
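To make the arithmetic concrete, here is a minimal sketch that counts the three error types in one gapped pairwise alignment and divides each count by the template length, as in the 1000 bp example. The function name `error_rates` is hypothetical, the inputs are assumed to be the two aligned sequence strings of a read (with `-` for gaps, as in a MAF alignment block), and PBSIM3's own accounting may differ in detail.

```python
def error_rates(ref_aln, read_aln):
    """Per-template-base error rates from one gapped pairwise alignment.

    ref_aln and read_aln are equal-length aligned strings where '-' marks a gap.
    """
    if len(ref_aln) != len(read_aln):
        raise ValueError("aligned strings must have the same length")

    substitutions = insertions = deletions = template_len = 0
    for ref_base, read_base in zip(ref_aln.upper(), read_aln.upper()):
        if ref_base == "-" and read_base == "-":
            continue                      # empty column, carries no information
        if ref_base == "-":
            insertions += 1               # base present only in the read
            continue
        template_len += 1                 # every non-gap reference column is a template base
        if read_base == "-":
            deletions += 1                # template base missing from the read
        elif read_base != ref_base:
            substitutions += 1            # template base read as a different base

    if template_len == 0:
        raise ValueError("alignment contains no template bases")
    return {
        "substitution": substitutions / template_len,
        "insertion": insertions / template_len,
        "deletion": deletions / template_len,
    }


# The example from the reply: 3 insertions on a 1000 bp template.
template = "A" * 500 + "---" + "A" * 500
read = "A" * 1003
print(error_rates(template, read)["insertion"])  # 0.003
```

Using the template length as the denominator matches the example above; dividing by read length instead would give slightly different numbers whenever insertions and deletions do not cancel out.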
Hi.
I'm simulating long reads from a genome, but the output is a .maf file. How can I get SAM output? I want to get HiFi reads, so I need the sam file to feed into the ccs software.
And if I want to run a simulation experiment, such as calling SVs from the simulated reads, can I use the maf file as the truth set?
Any advice would be very helpful to me.