fritzsedlazeck / SURVIVOR

Toolset for SV simulation, comparison and filtering
MIT License
335 stars 46 forks

Simulating SV in PacBio reads #184

Open priyambial123 opened 1 year ago

priyambial123 commented 1 year ago

Hello

I need some suggestions on simulating complex structural variants in a HiFi reads file from PacBio: https://downloads.pacbcloud.com/public/dataset/HG002-CpG-methylation-202202/m64011_190830_220126.hifi_reads.bam I understand from the wiki page that SVs can be created in the reference genome. How can I simulate these variants in the reads file that I am interested in?

Thank you

fritzsedlazeck commented 1 year ago

Hi, so just to get this right: you want to simulate SVs and use real reads? This is also handled through the reference-based option. There is an option that selects which way you want to simulate. In this case you would need to map the real reads to the modified reference.

Hope that helps, Fritz

priyambial123 commented 1 year ago

I want to create structural variants in the downloaded PacBio data, run it through an SV detection pipeline, and see whether the variants are detected. Is this possible?

Thank you

fritzsedlazeck commented 1 year ago

Yes, see here: https://github.com/fritzsedlazeck/SURVIVOR/wiki#quick-start and change the options as described in the text.

priyambial123 commented 1 year ago

Thank you. I have to replace reference.fasta with the downloaded fasta file. Is this right?

fritzsedlazeck commented 1 year ago

Please read the instructions. You need to change the one option from 0 to 1, and then remap your reads to the newly generated fasta file.
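
The steps described in this reply could be sketched roughly as follows. This is a minimal sketch, not the author's exact procedure: all file names are placeholders, the simSV argument order is copied from the command style used elsewhere in this thread, and minimap2 with samtools is only one assumed choice of mapper and converter. The script prints the commands by default and runs them only when DRY_RUN=0.

```shell
#!/usr/bin/env bash
# Sketch of the real-read route: simulate SVs, then remap real reads.
# Every file name below is a placeholder, not a path from the thread.
set -euo pipefail

REF="reference.fasta"        # placeholder: unmodified reference
PARAMS="parameter_file"      # placeholder: SURVIVOR parameter file
READS_BAM="hifi_reads.bam"   # placeholder: real HiFi reads (unaligned BAM)
PREFIX="simulated"           # output prefix; yields simulated.fasta

# Print each command; execute it only when DRY_RUN=0.
run() {
  echo "+ $*"
  if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

# 1. Simulate SVs with the fourth argument set to 1 (real reads) instead of 0.
run ./SURVIVOR simSV "$REF" "$PARAMS" 0.1 1 "$PREFIX"

# 2. Convert the unaligned BAM to FASTQ (assumed step; the mapper needs FASTQ).
run samtools fastq -0 reads.fastq "$READS_BAM"

# 3. Remap the real reads to the newly generated fasta, then sort and index.
run minimap2 -a -x map-hifi -o aln.sam "${PREFIX}.fasta" reads.fastq
run samtools sort -o aln.sorted.bam aln.sam
run samtools index aln.sorted.bam
```

The sorted BAM from the last step would then be the input to the SV caller, with the simulated variants expected to show up relative to the modified reference.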

priyambial123 commented 1 year ago

Thank you. So, I did these steps:

Simulated the structural variations in the reference genome using the parameters given in the package:

./SURVIVOR simSV "/SURVIVOR/Debug/human_GRCh38_no_alt_analysis_set.fasta" "/SV_tools/SURVIVOR/Debug/parameter_file" 0.1 0 simulated

Then simulated the reads using SimLoRD:

simlord --read-reference /SV_tools/SURVIVOR/Debug/simulated.fasta -n 10000 myreads

I ran the reads through my SV detection workflow and there were no structural variants in the VCF file. Is this because of the low number of reads generated?

Thank you
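
One way to check the read-count question is a rough depth estimate: coverage is approximately the number of reads times the mean read length, divided by the genome size. The sketch below uses the 10000 reads from the simlord command; the mean read length and genome size are assumed round figures, not values reported in this thread.

```shell
# Rough depth estimate: coverage = reads * mean_read_length / genome_size.
# MEAN_LEN and GENOME_SIZE are assumptions; adjust them to the actual data.
READS=10000              # from the simlord -n option used above
MEAN_LEN=8000            # assumed mean read length in bp
GENOME_SIZE=3100000000   # approximate size of a human reference in bp

COVERAGE=$(awk -v n="$READS" -v l="$MEAN_LEN" -v g="$GENOME_SIZE" \
  'BEGIN { printf "%.4f", n * l / g }')
echo "Estimated coverage: ${COVERAGE}x"
```

Under these assumptions the estimate is well below 1x, so most simulated variant positions would not be covered by even a single read.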