GooglingTheCancerGenome / sv-channels

Deep learning-based structural variant filtering method
https://research-software.nl/software/sv-channels
Apache License 2.0

Window labeling fails on CTX SV type #77

Open arnikz opened 3 years ago

arnikz commented 3 years ago

https://travis-ci.org/github/GooglingTheCancerGenome/sv-channels/jobs/745625564#L5094

It seems to relate to the TRA/CTX mismatch below:

https://github.com/GooglingTheCancerGenome/sv-channels/blob/6174d094d0e2064a02aaea35c6b22515052a325f/scripts/genome_wide/label_windows.py#L66

$ cut -f 7 data/test.bedpe|sort|uniq -c
    982 DEL
   1000 DUP
   1018 INS
   1000 INV
   2000 TRA
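
The same tally can be sketched in Python, together with the renaming that the mismatch suggests. This is only an illustration: the `SVTYPE_ALIASES` mapping and the `count_sv_types` helper are hypothetical names, and the assumption that the labeling script expects `CTX` where the BEDPE file says `TRA` comes from the observation above, not from the script itself. The sample rows are made up.

```python
from collections import Counter

# Hypothetical alias table: assumes the labeling step expects "CTX"
# where the BEDPE file reports "TRA".
SVTYPE_ALIASES = {"TRA": "CTX"}


def count_sv_types(lines, column=6, aliases=SVTYPE_ALIASES):
    """Count SV types in tab-separated BEDPE lines.

    `column` is 0-based, so 6 matches `cut -f 7`.
    Types found in `aliases` are counted under their mapped name.
    """
    counts = Counter()
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        if len(fields) <= column:
            continue  # skip rows without an SV type column
        svtype = fields[column]
        counts[aliases.get(svtype, svtype)] += 1
    return counts


# Made-up rows for illustration; they are not taken from data/test.bedpe.
sample = [
    "chr1\t10\t11\tchr2\t20\t21\tTRA\n",
    "chr1\t10\t11\tchr1\t50\t51\tDEL\n",
    "chr3\t70\t71\tchr4\t90\t91\tTRA\n",
]
sv_counts = count_sv_types(sample)
```

Passing an open file handle instead of `sample` (for example `open("data/test.bedpe")`) would tally a real file, with every `TRA` row reported under `CTX`.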
arnikz commented 3 years ago

Surprisingly, when executing per SV type using

lsantuari commented 3 years ago

The VCF file htz-sv.vcf generated by SURVIVOR simSV in sv-gen is malformed. I will open an issue in sv-gen.

arnikz commented 3 years ago

@lsantuari: Which of these TRA lines are you referring to? https://github.com/GooglingTheCancerGenome/sv-channels/blob/27a24d578e34469af092d17f334ea70cdb79a548/data/htz-sv.vcf#L4020-L6019

related to #76

lsantuari commented 3 years ago

> @lsantuari: Which of these TRA lines are you referring to? https://github.com/GooglingTheCancerGenome/sv-channels/blob/27a24d578e34469af092d17f334ea70cdb79a548/data/htz-sv.vcf#L4020-L6019
>
> related to #76

All of them. Here is how the SVs should be represented in VCF format according to the v4.3 specification.
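
For reference, the VCF specification writes inter-chromosomal rearrangements as breakend (BND) records whose ALT field takes one of four shapes: `t[p[`, `t]p]`, `]p]t`, or `[p[t`, where `t` is the replacement sequence and `p` is the mate position as `chrom:pos`. The sketch below is not the example from this thread. It is a minimal check, under the simplifying assumption that contig names contain no colons or square brackets, of whether an ALT string follows that notation. The helper name `parse_bnd_alt` is hypothetical, and the ALT strings are illustrative values in the style of the specification, not records from htz-sv.vcf.

```python
import re

# The four breakend ALT shapes: t[p[  t]p]  ]p]t  [p[t
# Simplifying assumption: contig names contain no ':' '[' or ']'.
_BND_ALT = re.compile(
    r"^(?:"
    r"(?P<t1>[ACGTNacgtn.]+)(?P<b1>[\[\]])(?P<chrom1>[^:\[\]]+):(?P<pos1>\d+)(?P=b1)"
    r"|"
    r"(?P<b2>[\[\]])(?P<chrom2>[^:\[\]]+):(?P<pos2>\d+)(?P=b2)(?P<t2>[ACGTNacgtn.]+)"
    r")$"
)


def parse_bnd_alt(alt):
    """Return (mate_chrom, mate_pos) if `alt` is in breakend notation, else None."""
    match = _BND_ALT.match(alt)
    if match is None:
        return None
    chrom = match.group("chrom1") or match.group("chrom2")
    pos = int(match.group("pos1") or match.group("pos2"))
    return chrom, pos


# Illustrative ALT values; "<TRA>" stands in for a symbolic allele
# that is not written in breakend notation.
for alt in ["G]17:198982]", "]13:123456]T", "<TRA>"]:
    print(alt, parse_bnd_alt(alt))
```

A record whose ALT makes `parse_bnd_alt` return `None` would need converting before a strict v4.3 parser could read it as a breakend.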

arnikz commented 3 years ago
lsantuari commented 3 years ago

It is working for CTX (which is the transformed TRA), so it should be enough in our case.

arnikz commented 3 years ago

OK, I'll remove this line. https://github.com/GooglingTheCancerGenome/sv-channels/blob/02e22f060cb4d146d32b61a7e7a29e04f4b2cd39/.travis.yml#L20

arnikz commented 3 years ago

See dev-merge branch (2e2d085696fb17bddfdaa29e9f216a183c7f31ff).