Closed alexander-e-f-smith closed 4 years ago
Hi Alexander! Can you provide the BAM file with these reads? There have been a lot of changes since 1.5.7, especially in realignment, so it would be very helpful!
Hi Polina, yes I can. Do you have an FTP site to deposit these, or would you prefer a Dropbox link, for example? I will also need to send the target and amplicon BED files (we run in amplicon mode, but I have also tested without). I will also send VCFs with and without realignment to show the example variant and position, and how we run this (options). FYI, I can get the variant caller to find the variant at a different position (presumably an alternative realignment match), but only for about 20% of the variant-supporting (VD) reads (you'll see this in the VCF). Also, I can get the variant caller to see slightly more of the variant-supporting reads if I artificially use incorrect amplicon BED coordinates (increasing the region size), although in doing this I lose most of the reads across that region (increasing the -a overlap range in this case brings all the reads back but then loses the variant again).
Dr Alexander Smith Dept Haematology King's College NHS trust Denmark Hill London SE5 9RS
From: Polina Bevad notifications@github.com Sent: 29 April 2020 08:19 To: AstraZeneca-NGS/VarDictJava VarDictJava@noreply.github.com Cc: Smith, Alexander alexander.e.smith@kcl.ac.uk; Author author@noreply.github.com Subject: Re: [AstraZeneca-NGS/VarDictJava] realignment incorrectly removing validated tandem duplication insertion (#288)
Great, you can add a Dropbox link here, thanks! You only need to add slices of the data that cover the problematic region; there is no need to add full-size files. I plan to spend some time fixing the latest VarDictJava issues next week, so it will be great to look at this one too.
Hi again. Here are the linked files and running parameters (no realignment in this case). The files are not very big, so I have sent the complete BAM. The position we are looking around is chr9:5070017 (33 bp insertion).
https://www.dropbox.com/s/qm5n376smmr7tw7/mpn_amplicon.bed?dl=0
https://www.dropbox.com/s/jw9bf3yyp8dcqi2/mpn_target.bed?dl=0
https://www.dropbox.com/s/djo9tssg0gyx8zj/not_realigned.vcf?dl=0
https://www.dropbox.com/s/8rlh2tb82wai72c/realigned.vcf?dl=0
https://www.dropbox.com/s/gq3mmlq0aaefosl/sample_indel_jak2_exon12.bam?dl=0
/usr/local/pipeline/vardictjava/bin/VarDict -G /home/vagrant/work/genome/human_g1k_v37_decoy.fasta -f 0.01 -N 94152 -b /home/vagrant/work/analysis/MPN_94152/94152/default/align/4f2fbe9b.sorted.bam -a 25:0.80 -W 20 -w 150 -I 100 -P 0 -k 0 -z -c 1 -S 2 -E 3 -g 4 -VS STRICT -th 8 /home/vagrant/snappy/snappy/tas/mpn1/primer_genomic_coordinate_amplicons.bed | /usr/local/pipeline/vardictjava/VarDict/teststrandbias.R | /usr/local/pipeline/vardictjava/VarDict/var2vcf_valid.pl -N 94152 -E -A -f 0.01 -P 0 -p 1 > /home/vagrant/work/analysis/MPN_94152/94152/default/variants/4f2fbe9b.VD.vcf
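For anyone comparing the two linked VCFs (with and without realignment), a small script can confirm whether an insertion of the expected size is reported near the position of interest. The following is only a minimal sketch: the function name `insertions_near`, the window size, and the minimum-length parameter are hypothetical choices for illustration, not part of VarDict. It also assumes plain tab-separated VCF records, and it strips a leading "chr" so that "chr9" matches the "9" contig naming used by b37-style references.

```python
from typing import Iterable, List, Tuple


def _norm(chrom: str) -> str:
    """Drop a leading 'chr' so 'chr9' and '9' compare equal."""
    return chrom[3:] if chrom.startswith("chr") else chrom


def insertions_near(vcf_lines: Iterable[str], chrom: str, pos: int,
                    window: int = 50, min_len: int = 1) -> List[Tuple[int, str, str]]:
    """Return (POS, REF, ALT) for insertions within `window` bp of `pos`.

    An ALT allele counts as an insertion when it is a plain sequence at
    least `min_len` bases longer than REF. Symbolic alleles such as <DUP>
    are ignored in this sketch.
    """
    hits = []
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:
            continue  # not a complete VCF record
        rec_chrom, rec_pos, ref = fields[0], int(fields[1]), fields[3]
        if _norm(rec_chrom) != _norm(chrom) or abs(rec_pos - pos) > window:
            continue
        for alt in fields[4].split(","):
            if alt.startswith("<") or len(alt) - len(ref) < min_len:
                continue
            hits.append((rec_pos, ref, alt))
    return hits
```

With the files above, one might call `insertions_near(open("realigned.vcf"), "chr9", 5070017, min_len=30)` and the same for `not_realigned.vcf`, then compare the two results. If the behaviour described in this issue holds, the realigned output would be expected to lack the 33 bp insertion.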
Hi Alexander!
I found where the issue was: it is in the method that adjusts reference counts and qualities during realignment; mapping quality was decreased too much for this insertion. I've opened a PR to fix this! You can try it on your data by building VarDictJava from the PR branch, and we'll also include the fix in the next release.
Thank you for finding this!
Hi Polina. Brilliant, thank you so much, that is great. I'll give it a go in the next few days. When do you think the next tagged version will be released? Best, Alex
Hi again Polina. I can't seem to see a commit on GitHub corresponding to this change yet. Could you point me in the right direction, please? Best, Alex
Hi Alexander, the PR is linked to my branch: https://github.com/PolinaBevad/VarDictJava/commits/issue_285_and_288 You can clone it and try! I guess we will create a release in May; I need to check one more issue (with hard-clips) before that happens.
Hi again Polina
I have tested with the issue_285_and_288 branch and the indel is now seen, so that's great, thanks. I did notice that this version now seems to be outputting VCF 4.1 again (version 1.7 and later output 4.3). Does that mean this branch is actually derived from before v6 of VarDict (at which point the indel was called anyway)? It's hard to tell, as this is your personal git repo rather than the AstraZeneca one.
Thanks again
Alex
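The VCF version mentioned above can be read from the `##fileformat` meta-information line at the top of a VCF. Here is a minimal sketch of that check, assuming an uncompressed text VCF; the helper name `vcf_fileformat` is hypothetical and introduced only for illustration.

```python
from typing import Iterable, Optional


def vcf_fileformat(lines: Iterable[str]) -> Optional[str]:
    """Return the ##fileformat value (e.g. 'VCFv4.1'), or None if absent.

    Only the leading block of '##' meta-information lines is scanned,
    since the fileformat line belongs at the top of the file.
    """
    for line in lines:
        if not line.startswith("##"):
            break  # reached the column header or the first record
        if line.startswith("##fileformat="):
            return line.strip().split("=", 1)[1]
    return None
```

For example, `vcf_fileformat(open("realigned.vcf"))` returns the declared version string, which makes it easy to see which set of VCF conversion scripts produced a given file.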
Hi Alexander,
Yes, sorry, I typically do not update my branch with the VarDict Perl changes, as that is part of the release process. We create the VCF files from the raw VarDict results using the Perl scripts from the VarDict Perl repo, so my branch is still linked to an old version of them.
You can update your VarDict perl files directly from master, if needed: https://github.com/AstraZeneca-NGS/VarDict (we use files var2vcf_valid.pl, var2vcf_paired.pl, teststrandbias.R and testsomatic.R from here).
Anyway I'll update my branch tomorrow, so it will link to master of VarDict perl and then you can pull it!
Hi Alexander, I've updated VarDict perl version in my branch to the latest!
Hi Alexander, I'll close this issue as the problem was resolved! Thank you!
Hi Polina Thanks for all your help. Would you be able to say when a tagged release of vardict may be available with these latest changes/bugfixes included? Best Alex
I have tested the master Java version and tagged version 1.7.0, and both remove a seemingly valid 33 bp tandem duplication insertion variant. This can be rectified by disabling realignment (-k 0). The older VarDict version (1.5.7) calls this variant without disabling realignment. The attached image shows a grep of the FASTQ in question to demonstrate the validity of the variant. The variant is also seen in IGV for the BWA-aligned BAM.
The position is at chr9:5070017... jak2_realigment_vardict_issue.pdf