Hi @biozzq,
CCS data should have very low error rates (< 1%). You can reproduce this in VISOR LASeR by specifying a suitable value for the --accuracy parameter (something like --accuracy 0.99 should be fine). If you have real CCS data handy, you can also derive a true-to-life substitution:insertion:deletion ratio (for example, by using Alfred - see https://github.com/tobiasrausch/alfred) and pass it to the --ratio parameter of the LASeR module. For what it's worth, we noticed that for real, non-CCS PacBio datasets the substitution:insertion:deletion ratio is approximately 15:50:35. I am not sure whether this ratio is the same for CCS data, though.
Best,
Davide
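To make the relationship between --accuracy and --ratio concrete, here is a minimal Python sketch of the arithmetic. It assumes that the total per-base error rate (1 - accuracy) is split across substitutions, insertions and deletions in proportion to the ratio; that proportional split is an assumption for illustration, not confirmed VISOR or pbsim behaviour, and the function name split_error_rate is hypothetical.

```python
def split_error_rate(accuracy, ratio):
    """Split the total per-base error rate (1 - accuracy) across
    substitution, insertion and deletion according to a 'sub:ins:del' ratio.

    Assumption: the error budget is divided proportionally to the ratio.
    """
    parts = [float(x) for x in ratio.split(":")]
    if len(parts) != 3 or any(p < 0 for p in parts) or sum(parts) == 0:
        raise ValueError("ratio must look like 'sub:ins:del', e.g. '15:50:35'")
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must be between 0 and 1")

    total_error = 1.0 - accuracy
    weight = sum(parts)
    names = ("substitution", "insertion", "deletion")
    return {name: total_error * p / weight for name, p in zip(names, parts)}


# Ratio reported above for real, non-CCS PacBio data, at 99% accuracy.
rates = split_error_rate(0.99, "15:50:35")
for name, rate in rates.items():
    print(f"{name}: {rate:.4%}")
```

Under this assumption, 99% accuracy leaves a 1% error budget, of which 15% would go to substitutions, 50% to insertions and 35% to deletions.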
Dear @davidebolo1993
I found that pbsim can produce CCS reads (namely HiFi reads). Its default substitution:insertion:deletion ratios are 10:60:30 for CLR and 6:21:73 for CCS. It also contains a quality profile for CCS reads (PBSIM-PacBio-Simulator/data/model_qc_ccs). Maybe VISOR can use these to produce CCS reads. Thanks.
Best, Zheng zhuqing
Hi @biozzq,
Thanks for the update. As a temporary workaround, I guess you can replace the model_qc_clr in VISOR/LASeR with the model for CCS (be sure to rename the CCS model to model_qc_clr and re-run setup.py), use the substitution:insertion:deletion ratio you found, and specify --accuracy 0.99 to simulate CCS data. I'll be back at work in a few days and will add this feature to VISOR properly.
Best,
Davide
Hi @biozzq,
Added CCS support in VISOR.
Something like the command below should fit what you need.
VISOR LASeR -g reference.fa -s input.dir -bed simulation.bed -o output.dir -a 0.99 -l 6000 -r 6:21:73 --readstype PB --ccs
Dear @davidebolo1993
That is great. I will try.
Thanks, Zheng zhuqing
Dear @davidebolo1993
The CCS simulation failed with the following error: a temporary file cannot be found.
Traceback (most recent call last):
  File "/ZhengZhuQing/00.script/anaconda3/lib/python3.7/site-packages/VISOR-1.0-py3.7.egg/VISOR/LASeR/LASeR.py", line 279, in run
    m=Simulate(tag,os.path.abspath(args.genome), args.readstype, args.threads, os.path.abspath(fasta), str(entries[0]), int(entries[1]), int(entries[2]), str(counter), model_qc, args.accuracy, (args.coverage / 100 * float(entries[3]))/len(fastas), allelic, args.length, args.ratio, os.path.abspath(args.output + '/h' + str(folder+1)),folder +1, 1,renamer)
  File "/ZhengZhuQing/00.script/anaconda3/lib/python3.7/site-packages/VISOR-1.0-py3.7.egg/VISOR/LASeR/LASeR.py", line 504, in Simulate
    os.remove(os.path.abspath(output + '/simref_0001.ref'))
FileNotFoundError: [Errno 2] No such file or directory: '/ZhengZhuQing/03.SV_calling/00.simulation/02.Pacbio_CCS/02.data_del_5x/h1/simref_0001.ref'
Best, Zheng zhuqing
Hi @biozzq,
Sorry, I forgot to include the model_qc_ccs in MANIFEST.in. The issue should be solved by reinstalling VISOR. By the way, for CCS simulations, do not exceed 2000 bases in length (the -l value). The following command should be appropriate for the CCS setting.
VISOR LASeR -g reference.fa -s input.dir -bed simulation.bed -o output.dir -a 0.99 -l 2000 -r 6:21:73 --readstype PB --ccs
Best,
Davide
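For scripted runs, the command above can be assembled with a small Python helper that also enforces the 2000-base limit mentioned for CCS. This is only a sketch: the helper name build_laser_command and the constant CCS_MAX_LENGTH are hypothetical, and the flags are copied from the command in this thread, so it is worth checking the --help output of your installed VISOR version before relying on them.

```python
import shlex

# Upper bound suggested in this thread for the -l value in CCS simulations.
CCS_MAX_LENGTH = 2000


def build_laser_command(genome, samples_dir, bed, out_dir,
                        accuracy=0.99, length=2000, ratio="6:21:73", ccs=True):
    """Return the VISOR LASeR argument list used in this thread.

    Only flags that appear in the thread are used; nothing here is
    checked against the installed VISOR version.
    """
    if ccs and length > CCS_MAX_LENGTH:
        raise ValueError(
            f"for CCS simulations, length should not exceed {CCS_MAX_LENGTH} bases"
        )
    args = [
        "VISOR", "LASeR",
        "-g", genome,
        "-s", samples_dir,
        "-bed", bed,
        "-o", out_dir,
        "-a", str(accuracy),
        "-l", str(length),
        "-r", ratio,
        "--readstype", "PB",
    ]
    if ccs:
        args.append("--ccs")
    return args


cmd = build_laser_command("reference.fa", "input.dir", "simulation.bed", "output.dir")
print(shlex.join(cmd))  # requires Python 3.8+ for shlex.join
```

The returned list can be passed to subprocess.run(cmd, check=True) once the flags have been confirmed for the installed version.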
Hi,
Is the --ccs flag still supported in v1.1.2? I didn't see that option in the --help output.
Thank you
Dear @davidebolo1993
How can I produce CCS data with VISOR? Thank you.
Best, Zheng zhuqing