Open pontikos opened 9 years ago
I've committed a couple of scripts in annotation to prepare the ExAC custom annotation for VEP: a668095f80ac1556e4deddf9b6c5d5e1aac0f0e4 and 65c5d977c5f407d63c37e9979c3accb35fd9dc8a
I've written a generic liftover script to prepare the custom annotation for VEP. The data needs to be split by chromosome in order for it to work. Note that it doesn't deal with positions moved to a different chromosome or with alternative sequences. These just get dropped by filtering the bed file on the chromosome name.
defa3a540c6238ddf9ca7c0dc366111e2957a811
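For illustration, the chromosome-name filter described above could look roughly like the sketch below. This is not the committed script: the function name `filter_bed_by_chrom` and the example records are hypothetical, and it assumes tab-separated BED lines with the chromosome in the first column.

```python
def filter_bed_by_chrom(lines, chrom):
    """Keep only lifted-over BED records whose chromosome is still `chrom`.

    Records that the liftover moved to another chromosome or to an
    alternative/random sequence are silently dropped.
    """
    kept = []
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines and comment/header lines.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if fields[0] == chrom:
            kept.append(line)
    return kept


# Made-up lifted-over records for a per-chromosome (chr1) input file.
lifted = [
    "chr1\t100\t101\tvarA",
    "chr1_KI270706v1_random\t200\t201\tvarB",  # landed on an alternative sequence
    "chr2\t300\t301\tvarC",                    # moved to a different chromosome
    "chr1\t400\t401\tvarD",
]
kept_chr1 = filter_bed_by_chrom(lifted, "chr1")
```

An exact match on the first column (rather than a prefix match) is what keeps the `chr1_..._random` record out of the chr1 output.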
Sounds good.
Adam
Thanks.
I'm moving the annotations to /cluster/project8/IBDAJE/VEP_custom_annotations. Initially I had moved them to /goon2/scratch2/vyp-scratch2/annotation, but I found writing to that location to be quite unreliable when using SGE (sometimes the output files were empty).
I have updated the custom annotations in run_VEP.sh (2d5003481b5ca58dabad50a4693fcb80be770f7f) to read from a genome-build-dependent location.
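As a rough sketch of what a genome-build-dependent location could mean, assuming one subdirectory per build under the annotation root. The layout, the build labels `GRCh37`/`GRCh38`, the file name and the helper `custom_annotation_path` are all assumptions for illustration; the actual layout used by run_VEP.sh may differ.

```python
import posixpath

# Annotation root taken from the comment above; the per-build subdirectory
# layout underneath it is an assumption.
ANNOTATION_ROOT = "/cluster/project8/IBDAJE/VEP_custom_annotations"

# Hypothetical build labels.
KNOWN_BUILDS = ("GRCh37", "GRCh38")


def custom_annotation_path(build, filename, root=ANNOTATION_ROOT):
    """Return the path of a custom annotation file for a given genome build."""
    if build not in KNOWN_BUILDS:
        raise ValueError("unknown genome build: %r" % (build,))
    return posixpath.join(root, build, filename)


# Hypothetical file name, for illustration only.
exac_b38 = custom_annotation_path("GRCh38", "ExAC.vcf.gz")
```

Failing loudly on an unknown build avoids silently annotating against files from the wrong assembly.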
Excellent. So now the post-VEP script just has to be finished. Let me know if you want to discuss what this is going to do.
Adam
Will do. I'm noticing some strange things occurring on the filesystem though: sometimes the output of bgzip is empty. I don't think it's due to a bug in my script but instead probably because of some NFS lag. I'm looking into it.
OK, I think I know what it is: I had the glob chr${ch}*.vcf.gz instead of chr${ch}_*.vcf.gz, so there must have been some concurrency issues because the same file would be matched twice. Correcting that and running again.
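The glob mix-up can be reproduced with Python's `fnmatch` module, which follows shell-style wildcard rules. The file names below are hypothetical; the assumption is that the per-chromosome files are named `chr<N>_<something>.vcf.gz`.

```python
from fnmatch import fnmatchcase

# Hypothetical per-chromosome file names.
files = [
    "chr1_ExAC.vcf.gz",
    "chr10_ExAC.vcf.gz",
    "chr11_ExAC.vcf.gz",
    "chr2_ExAC.vcf.gz",
]


def match(pattern, names):
    """Return the names matching a shell-style wildcard pattern."""
    return [n for n in names if fnmatchcase(n, pattern)]


ch = "1"
loose = match(f"chr{ch}*.vcf.gz", files)    # pattern without the underscore
strict = match(f"chr{ch}_*.vcf.gz", files)  # pattern with the underscore
```

Under these assumptions the loose pattern for chromosome 1 also picks up the chr10 and chr11 files, so two jobs can end up working on the same file, while the strict pattern matches only the chr1 file.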
I've committed the fix 60aae875aa06d62b4f94d45a34da184111fffe3c
See #25: the liftover of the annotations from 37 to 38 is not trivial because certain regions (especially around the centromeres) have changed significantly.
I wonder how the ESP annotation in b38 works with the VEP?
Adam
Good point. I am not sure how the built-in ESP annotation works compared to the custom one that we lift over. I will check.
Use build 38.
Add ExAC annotation allele frequencies from /cluster/project8/vyp/AdamLevine/annotations/ExAC/0.3/ExAC.r0.3.sites.vep.vcf.gz; see /cluster/project8/vyp/AdamLevine/annotations/esp/prepare_esp.sh for a script that prepares allele frequencies from INFO.
Add 1kg allele freq.
Add CADD scores
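For the ExAC item, a minimal sketch of pulling per-allele frequencies out of the VCF INFO column might look like the following. It is not a port of prepare_esp.sh (whose contents are not shown here); it assumes the frequencies live under an `AF` key with one comma-separated value per ALT allele, and the example record is made up.

```python
def parse_info(info):
    """Parse a VCF INFO string into a dict; flag-style keys map to True."""
    out = {}
    for item in info.split(";"):
        if not item:
            continue
        key, sep, value = item.partition("=")
        out[key] = value if sep else True
    return out


def allele_frequencies(vcf_line, key="AF"):
    """Return one (chrom, pos, ref, alt, freq) tuple per ALT allele."""
    fields = vcf_line.rstrip("\n").split("\t")
    chrom, pos, _id, ref, alt, _qual, _filter, info = fields[:8]
    raw = parse_info(info).get(key)
    # Missing key, or key present only as a flag: nothing to report.
    if raw is None or raw is True:
        return []
    freqs = [float(x) for x in raw.split(",")]
    return [(chrom, int(pos), ref, a, f) for a, f in zip(alt.split(","), freqs)]


# Made-up multi-allelic site, for illustration only.
line = "1\t13372\t.\tG\tC,T\t608.91\tPASS\tAC=2,1;AF=0.25,0.125;AN=8;DB"
rows = allele_frequencies(line)
```

Splitting multi-allelic sites into one row per ALT allele keeps each frequency tied to the allele it describes.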