ndierckx / Sim-it

Versatile simulator for structural variance and Nanopore/PacBio sequencing reads
Apache License 2.0

Problem simulating IDUP events #21

Open xiweiwu opened 1 year ago

xiweiwu commented 1 year ago

I tried to simulate 1000 inverted duplication (IDUP) events with PacBio reads using the hg38 genome, but the reads do not seem to contain any such events. I aligned the reads with pbmm2 and can't see any abnormality at the simulated IDUP sites. I also simulated the same number of insertion, deletion, duplication, inversion, and CSUB events, and these other SV events work without any issues. Please help.

ndierckx commented 11 months ago

Hi, sorry for the late response; I have been away and had a lot of work. Can you send the config file you used? Then I will run it myself to see if I have a similar problem. Did the IDUP events show up in the VCF file?

xiweiwu commented 11 months ago

I do see IDUP events in the VCF file. However, the reads do not seem to contain any IDUP after alignment (no soft clips or insertions found in IGV). Here is my config file.

Project:

Project name = hg38_CSV
Reference = /net/nfs-irwrsrchnas01/labs/xwu/genome/hg38/hg38.fa
Replace ambiguous nts(N) =
Max threads = 8

Structural variation:

VCF input =
Foreign sequences =

Deletions = 1000
Length (bp) = 50-2000

Insertions = 1000
Length (bp) = 50-2000

Tandem duplications = 1000
Length (bp) = 50-2000
Copies = 1-5

Inversions = 1000
Length (bp) = 50-2000

Complex substitutions = 1000
Length (bp) = 50-2000

Inverted duplications = 1000
Length (bp) = 50-2000

Heterozygosity = 0.6

Long Read simulation:

Sequencing depth = 10
Median length = 15000
Length range = 2500-35000
Accuracy = 98
Error profile = error_profile_PB_Sequel_CCS_hifi.txt
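
For anyone wanting to repeat the alignment check without relying on IGV alone, here is a minimal Python sketch that scans SAM records for the signals mentioned above: long soft clips, long insertions, or a supplementary alignment (`SA:Z:` tag). It is not part of Sim-it or pbmm2. The function names `sv_signals` and `scan`, and the 50 bp thresholds (chosen to match the minimum event length in the config), are assumptions for illustration. It also assumes an IDUP would surface as one of these three signals, which depends on how the aligner represents it.

```python
import re

# Standard SAM CIGAR operations: a length followed by one operation letter.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def sv_signals(sam_line, min_clip=50, min_ins=50):
    """Return a dict describing SV-like signals in one SAM record, or None.

    min_clip and min_ins are illustrative thresholds (hypothetical defaults).
    """
    fields = sam_line.rstrip("\n").split("\t")
    name, flag, chrom = fields[0], int(fields[1]), fields[2]
    pos, cigar = int(fields[3]), fields[5]
    if flag & 0x4 or cigar == "*":  # unmapped read or no alignment
        return None
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]
    max_clip = max((n for n, op in ops if op == "S"), default=0)
    max_ins = max((n for n, op in ops if op == "I"), default=0)
    # An SA tag means the read also has a supplementary (split) alignment.
    has_sa = any(f.startswith("SA:Z:") for f in fields[11:])
    if max_clip >= min_clip or max_ins >= min_ins or has_sa:
        return {"read": name, "chrom": chrom, "pos": pos,
                "soft_clip": max_clip, "insertion": max_ins, "split": has_sa}
    return None


def scan(lines, **thresholds):
    """Yield signals for every non-header SAM line that shows one."""
    for line in lines:
        if line.startswith("@") or not line.strip():
            continue
        hit = sv_signals(line, **thresholds)
        if hit is not None:
            yield hit


# Made-up example records, not output from the run in this issue.
example = [
    "r1\t0\tchr1\t1000\t60\t5000M\t*\t0\t0\t*\t*",
    "r2\t0\tchr1\t1200\t60\t3000M800I2000M\t*\t0\t0\t*\t*",
    "r3\t0\tchr1\t900\t60\t4000M1200S\t*\t0\t0\t*\t*\tSA:Z:chr1,2100,-,4000S1200M,60,3;",
]
for hit in scan(example):
    print(hit)
```

In practice `scan` could be fed the text output of `samtools view` restricted to the region around one simulated IDUP site taken from the VCF. If no reads are reported at any of the sites, that would support the observation that the events are missing from the reads rather than hidden by the viewer.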