tjiangHIT / cuteSV

Long read based human genomic structural variation detection with cuteSV
MIT License
253 stars 36 forks source link

No SV found when aligning pacbio CLR reads to reference #38

Closed lhui2010 closed 3 years ago

lhui2010 commented 3 years ago

Hi,

I was running cuteSV on pacbio CLR reads on maize, but the output VCF file contains only header. When I checked the output directory by adding --retain_work_dir, there are only several empty sigs files. In the signatures folder, there are many bed files which contained lines starting with "DEL/DUP/INS/INV/TRA". The are no warnings or errors for the whole process. The accesion we sequenced is an inbred line so there should be no heterozygous problem. Is there something wrong so that those information was not passed to the final VCF file? Or is it because that there are many repeat in maize genomes or just cuteSV won't work with plant? I would appreciate if you can give me some suggestions. Many thanks!

Best, Hui


I installed cuteSV via conda. And mapped CLR reads to genome and call snps with command as follows:

minimap2 -t${THREADS} -ax map-pb input/${REF} input/${READS} | samtools sort -@${THREADS} - > output/${SORTEDBAM}
cuteSV --threads ${THREADS} --max_cluster_bias_INS  100 \
    --retain_work_dir \
    --diff_ratio_merging_INS    0.3 \
    --max_cluster_bias_DEL    200 \
    --diff_ratio_merging_DEL    0.5 \
    output/${SORTEDBAM} input/${REF} output/${VCF} output/${VCFDIR}

The output VCF table was like the following:

DEL     chr1   670     30      m54065_170710_194541/48432050/101_9842
DEL     chr1    9756    30      m54065_170714_031530/22086563/0_31613
DEL     chr1    16544   71      m54167_170709_181157/53215835/44719_58683
DEL     chr1    14267   153     m54065_170713_170545/37618114/0_13180

The log file:

2021-05-19 14:03:46,328 [INFO] Running cuteSV --threads 20 --max_cluster_bias_INS 100 --retain_work_dir --diff_ratio_merging_INS 0.3 --max_cluster_bias_DEL 200 --diff_ratio_merging_DEL 0.5 output/AA.pacbio_clr.vs.BB.genome.bam input/BB.genome output/AA.pacbio_clr.vs.BB.genome.bam.vcf output/AA.pacbio_clr.vs.BB.genome.bam.cuteSV
2021-05-19 14:03:46,722 [INFO] The total number of chromsomes: 265
2021-05-19 14:04:12,055 [INFO] Finished 1:180000000-190000000.
2021-05-19 14:04:13,565 [INFO] Finished 1:110000000-120000000.
2021-05-19 14:04:14,647 [INFO] Finished 1:40000000-50000000.
2021-05-19 14:04:15,176 [INFO] Finished 1:20000000-30000000.
2021-05-19 14:04:15,968 [INFO] Finished 1:160000000-170000000.
2021-05-19 14:04:16,573 [INFO] Finished 1:60000000-70000000.
...
2021-05-19 14:10:09,849 [INFO] Rebuilding signatures of structural variants.
2021-05-19 14:10:13,971 [INFO] Clustering structural variants.
2021-05-19 14:10:14,137 [INFO] Writing to your output file.
2021-05-19 14:10:14,138 [INFO] Loading reference genome...
2021-05-19 14:10:27,169 [INFO] Finished in 400.84 seconds.
Meltpinkg commented 3 years ago

Hello!

There may be something wrong with it. May I know the version of cuteSV you use? If you use cuteSV v1.0.11 in your work, you can try to install cuteSV v1.0.10 to see whether it successfully outputs bed files in signatures folder and sigs files. If there is still something wrong, please feel free to tell us.

Hope it will help!

lhui2010 commented 3 years ago

Hi,

Thanks for the reply! I'm using cuteSV 1.0.10 right now. Should I try a different version or different installation method?

Best, Hui

Meltpinkg commented 3 years ago

Hello, @lhui2010

Thanks for your trial. The cuteSV v1.0.10 installed via conda is workable. To solve the problem, may I know the sequencing coverage of your input pacbio CLR reads? And when running cuteSV, did you set any parameter for -s or just use the default?

Thanks a lot for your reply!

lhui2010 commented 3 years ago

Hi,

Thanks for the reply. I installed cuteSV v1.0.11 with conda and it's workable now.

##fileformat=VCFv4.2
##source=cuteSV-1.0.11
##fileDate=2021-05-21 11:10:50 5-CST
...
1       483749  cuteSV.BND.114  N       ]3:121588307]N  .       PASS    PRECISE;SVTYPE=BND;RE=5;RNAMES=NULL     GT:DR:DV:PL:GQ  ./.:.:5:.,.,.:.
1       483757  cuteSV.BND.115  N       ]3:121581369]N  .       PASS    PRECISE;SVTYPE=BND;RE=5;RNAMES=NULL     GT:DR:DV:PL:GQ  ./.:.:5:.,.,.:.
1       488964  cuteSV.BND.116  N       ]3:199372523]N  .       PASS    PRECISE;SVTYPE=BND;RE=13;RNAMES=NULL    GT:DR:DV:PL:GQ  ./.:.:13:.,.,.:.
1       494588  cuteSV.DEL.36   tattgttggggacttgttctcaagtgctatgagttaagaacaaggcaacacagagaatgttaaacgataaagtccttcgtcctctgaagcattatttcccttaggatataacgatctt
1       503080  cuteSV.INS.57   T       TTGATGGGCCCTGGTCGTGTGTATTCAACCCATGGGGATTGAGAGGGATTGAATCCCCCCTACAATCAAAATCCCCCTCAAATCACTCAATCCCCCTCACTCCCATGGAT
1       503552  cuteSV.BND.117  N       [3:215855326[N  .       PASS    PRECISE;SVTYPE=BND;RE=31;RNAMES=NULL    GT:DR:DV:PL:GQ  ./.:.:31:.,.,.:.
1       503556  cuteSV.BND.118  N       N]3:215864607]  .       PASS    PRECISE;SVTYPE=BND;RE=50;RNAMES=NULL    GT:DR:DV:PL:GQ  ./.:.:50:.,.,.:.
1       508415  cuteSV.BND.119  N       N[4:42754802[   .       PASS    PRECISE;SVTYPE=BND;RE=14;RNAMES=NULL    GT:DR:DV:PL:GQ  ./.:.:14:.,.,.:.
1       508416  cuteSV.INS.58   g       gAAAGCTTGTTGGGGCCATTGTTCTCACATGCTTCGACTTAAGAACAAGGCAACATAACACGTGTTAAGGCCGTTCAAGTCCTTCGTCCTTCGAAGCATGATGTCCTTCG
1       509719  cuteSV.BND.120  N       ]2:86898861]N   .       PASS    PRECISE;SVTYPE=BND;RE=9;RNAMES=NULL     GT:DR:DV:PL:GQ  ./.:.:9:.,.,.:.
1       509720  cuteSV.INS.59   a       aGATCCTACACCAAATCATACACTTCCGAGGCTTTATAACTCCTACAAAACTTCCTACCAGGCTTTTATAAAACAAACCGTAGGAAATAATAACTAACTTCCTAGAGGCA
1       509721  cuteSV.BND.121  N       [3:182931436[N  .       PASS    PRECISE;SVTYPE=BND;RE=9;RNAMES=NULL     GT:DR:DV:PL:GQ  ./.:.:9:.,.,.:.
1       511661  cuteSV.BND.122  N       [4:95921360[N   .       PASS    PRECISE;SVTYPE=BND;RE=50;RNAMES=NULL    GT:DR:DV:PL:GQ  ./.:.:50:.,.,.:.
1       511664  cuteSV.BND.123  N       N]4:95928694]   .       PASS    PRECISE;SVTYPE=BND;RE=50;RNAMES=NULL    GT:DR:DV:PL:GQ  ./.:.:50:.,.,.:.

The coverage for CLR reads was ~60×. I just use the default parameter for CLR reads and did not specify -s.

Thanks for the help! Hui