mroosmalen / nanosv

SV caller for nanopore data
MIT License

SV type inference and coverage depth #33

Closed tgong1 closed 6 years ago

tgong1 commented 6 years ago

Hi, I have several questions and would much appreciate any ideas you can give me.

  1. I didn't see inversion as an SVTYPE in the header of the VCF file. How can I infer an inversion event?
  2. I'd like to confirm that if the bed file is not provided and depth_support=False, no deletions or duplications can be inferred in the VCF and the SVTYPE will be BND.
  3. I'd like to modify /nanosv/scripts/create_random_position_bed.py to generate the bed file for my reference genome (hg38). I think I need to modify the chromosome lengths in the genome (1 to 23) and change the simple_repeats_file and gaps_file. Is that right?

Thank you very much!

mroosmalen commented 6 years ago

Hi

Answer1: Inversions are reported as BND events. They are reported on the same chromosome with one of the following ALT notations:

Answer2: You're right. If depth_support=False, deletions will be reported as BND with the following notation:

Answer3: Yes, indeed, you only need to change the chromosome lengths and the two bed files.

tgong1 commented 6 years ago

Hi, thank you for the answers on inferring DEL/DUP/INV events. I still have two questions:

  1. I'd like to confirm that BND events reported on different chromosomes are then inter-chromosomal translocations.
  2. I modified create_random_position_bed.py as follows:

       genome = {
           1: 248956422, 2: 242193529, 3: 198295559, 4: 190214555,
           5: 181538259, 6: 170805979, 7: 159345973, 8: 145138636,
           9: 138394717, 10: 133797422, 11: 135086622, 12: 133275309,
           13: 114364328, 14: 107043718, 15: 101991189, 16: 90338345,
           17: 83257441, 18: 80373285, 19: 58617616, 20: 64444167,
           21: 46709983, 22: 50818468, 23: 156040895, 24: 57227415
       }

     simple_repeats_file is from http://hgdownload.cse.ucsc.edu/goldenPath/hg38/database/simpleRepeat.txt.gz and gaps_file is from http://hgdownload.cse.ucsc.edu/goldenPath/hg38/database/gap.txt.gz

However, I still got the error message: Can't calculate coverage distribution. The bed file may be inappropriate for your bam file.

Is that because I have chrM in the header of my bam file? If so, is there anything I can do without modifying the bam file?

@SQ SN:chr1 LN:248956422
@SQ SN:chr2 LN:242193529
@SQ SN:chr3 LN:198295559
@SQ SN:chr4 LN:190214555
@SQ SN:chr5 LN:181538259
@SQ SN:chr6 LN:170805979
@SQ SN:chr7 LN:159345973
@SQ SN:chr8 LN:145138636
@SQ SN:chr9 LN:138394717
@SQ SN:chr10 LN:133797422
@SQ SN:chr11 LN:135086622
@SQ SN:chr12 LN:133275309
@SQ SN:chr13 LN:114364328
@SQ SN:chr14 LN:107043718
@SQ SN:chr15 LN:101991189
@SQ SN:chr16 LN:90338345
@SQ SN:chr17 LN:83257441
@SQ SN:chr18 LN:80373285
@SQ SN:chr19 LN:58617616
@SQ SN:chr20 LN:64444167
@SQ SN:chr21 LN:46709983
@SQ SN:chr22 LN:50818468
@SQ SN:chrX LN:156040895
@SQ SN:chrY LN:57227415
@SQ SN:chrM LN:16569

mroosmalen commented 6 years ago

Yes, indeed, BND events between two different chromosomes are inter-chromosomal translocations.
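
The BND interpretations discussed in this thread (same-chromosome inversions, deletions reported as BND, and inter-chromosomal translocations) can be sketched as a small classifier. This is a minimal sketch, not NanoSV code: it assumes the ALT field follows the four breakend forms of the VCF specification (`t[p[`, `t]p]`, `]p]t`, `[p[t`), and the function name `classify_bnd` and the "-like" labels are hypothetical. Without depth support, a same-chromosome call can only be described as deletion-like or duplication-like from its orientation; it cannot be confirmed.

```python
import re

# The four VCF breakend ALT forms: t[p[, t]p], ]p]t, [p[t
# (t = base(s) at the record position, p = mate position "chrom:pos").
_BND_RE = re.compile(
    r"^(?P<pre>[^\[\]]*)(?P<b1>[\[\]])"
    r"(?P<chrom>[^\[\]]+):(?P<pos>\d+)"
    r"(?P<b2>[\[\]])(?P<post>[^\[\]]*)$"
)


def classify_bnd(chrom, pos, alt):
    """Guess the event type of one BND record from its ALT orientation.

    Hypothetical helper: the labels are heuristics based on the VCF
    breakend conventions, not NanoSV output.
    """
    m = _BND_RE.match(alt)
    if m is None or m["b1"] != m["b2"] or bool(m["pre"]) == bool(m["post"]):
        raise ValueError("not a breakend ALT: %r" % alt)

    mate_chrom, mate_pos = m["chrom"], int(m["pos"])
    if mate_chrom != chrom:
        return "inter-chromosomal translocation"

    base_first = bool(m["pre"])   # True for t[p[ and t]p]
    bracket = m["b1"]

    # t]p] and [p[t join two ends of the same orientation: inversion-like.
    if (base_first and bracket == "]") or (not base_first and bracket == "["):
        return "inversion-like"

    # Remaining forms are t[p[ and ]p]t; mate direction separates the cases.
    if base_first:                # t[p[
        return "deletion-like" if mate_pos > pos else "duplication-like"
    return "deletion-like" if mate_pos < pos else "duplication-like"  # ]p]t
```

For example, `classify_bnd("1", 1000, "N]1:5000]")` falls in the inversion-like branch, while the same call with a mate on chromosome 2 is treated as a translocation.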

Your bam file has the chr prefix in front of the chromosome names, and I think your bed file doesn't have this notation. If you create the bed file with the chr notation, then it should work.
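
One way to apply the fix suggested above is to rewrite the first column of the bed file and then check it against the `@SQ` names in the bam header. The sketch below is not part of NanoSV: the function names are hypothetical, it assumes a tab-separated bed file and tab-separated header lines (for example, as printed by `samtools view -H`), and the optional `rename` mapping for numeric sex-chromosome keys such as 23 and 24 is an assumption about how the bed file was generated.

```python
def add_chr_prefix(bed_lines, rename=None):
    """Return bed lines whose contig names carry the 'chr' prefix.

    rename is an optional, hypothetical mapping applied first,
    e.g. {"23": "X", "24": "Y"} if the bed uses numeric keys.
    """
    rename = rename or {}
    out = []
    for line in bed_lines:
        stripped = line.rstrip("\n")
        if not stripped or stripped.startswith("#"):
            out.append(stripped)          # keep blank/comment lines as-is
            continue
        fields = stripped.split("\t")
        name = rename.get(fields[0], fields[0])
        if not name.startswith("chr"):
            name = "chr" + name
        fields[0] = name
        out.append("\t".join(fields))
    return out


def contigs_missing_from_header(bed_lines, sam_header_lines):
    """List bed contig names that have no matching @SQ SN: entry."""
    header_names = set()
    for line in sam_header_lines:
        if line.startswith("@SQ"):
            for field in line.rstrip("\n").split("\t")[1:]:
                if field.startswith("SN:"):
                    header_names.add(field[3:])
    bed_names = {
        line.split("\t")[0]
        for line in bed_lines
        if line.strip() and not line.startswith("#")
    }
    return sorted(bed_names - header_names)
```

If `contigs_missing_from_header` returns an empty list for the rewritten bed file, the contig names agree with the bam header; a non-empty list names the contigs that still do not match. Note that chrM in the header does not need a bed entry for this check to pass.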

tgong1 commented 6 years ago

Got it! Thank you very much for the help.