sigven / pcgr

Personal Cancer Genome Reporter (PCGR)
https://sigven.github.io/pcgr
MIT License
251 stars, 47 forks

Unable to find some VEP generated predictions #204

Closed dattatraymongad closed 1 year ago

dattatraymongad commented 1 year ago

Additional context: I am running PCGR through Docker. In addition to the PCGR results, I also want some of the annotations generated by VEP, such as Biotype, Impact, SIFT, and PolyPhen, but I cannot find them in either the JSON or the TSV output.

Does the VEP integrated in PCGR support these predictions? Or is there a way to add them to the pipeline?

sigven commented 1 year ago

Hi,

Please take a look at the intermediate annotated VCF file that is generated by the workflow. This file is also converted to TSV format, and it contains an extensive set of variant/gene annotations, including, I believe, those you refer to (harvested from VEP).
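To check what VEP actually wrote into an annotated VCF, it can help to read the `CSQ` declaration in the header. The sketch below assumes the file follows the usual VEP convention of listing the pipe-separated sub-field names after `Format:` in the `##INFO=<ID=CSQ,...>` header line. The function names are hypothetical and not part of PCGR.

```python
import gzip
import re
import sys


def csq_fields(lines):
    """Return the CSQ sub-field names declared in VCF header lines.

    `lines` is any iterable of text lines (an open file handle works).
    Returns an empty list if no CSQ declaration is found in the header.
    """
    for line in lines:
        if not line.startswith("##"):
            break  # meta-information header is over
        if line.startswith("##INFO=<ID=CSQ,"):
            match = re.search(r'Format: ([^"]+)"', line)
            if match:
                return match.group(1).strip().split("|")
    return []


def csq_fields_from_file(vcf_path):
    """Open a plain or gzip/bgzip-compressed VCF and list its CSQ fields."""
    opener = gzip.open if str(vcf_path).endswith(".gz") else open
    with opener(vcf_path, "rt") as handle:
        return csq_fields(handle)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python csq_fields.py <annotated.vcf.gz>
    wanted = {"BIOTYPE", "IMPACT", "SIFT", "PolyPhen"}
    found = set(csq_fields_from_file(sys.argv[1]))
    print("present:", sorted(wanted & found))
    print("missing:", sorted(wanted - found))
```

If a name such as `SIFT` is reported as missing, the corresponding option was not passed to VEP for that run, so the value has to come from another annotation source.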

best, Sigve

dattatraymongad commented 1 year ago

As you suggested, I re-ran PCGR with the --debug option (which kept the vep_ready files in VCF format), but I was unable to find the information in the generated files.

When I checked main.py for the VEP command, the --sift and --polyphen options were missing. I added these options in main.py and re-ran PCGR with the --debug option.

But the VEP command printed in debug mode does not include the --sift and --polyphen options.

Do I need to edit another Python script to change the VEP command?

pdiakumis commented 1 year ago

Hello @dattatraymongad, since you mention you're using Docker, you'll probably need to rebuild the image locally with your changes, which is a bit tricky because we build a conda env in there based on the conda-lock files available in the PCGR repo. It will be a bit easier to run PCGR in development mode under conda: with the pcgr conda env activated, run pip install -e . at the top level of the cloned git repo, and your changes will then take effect. Hope that helps!

-- Edit: for the Docker case, if you are familiar with Docker, you can copy the pcgr code into the container and then try adding a RUN pip install -e pcgr step so that you emulate the conda solution from above.
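Edits to main.py only matter if Python imports the edited copy. A quick way to check, sketched here under the assumption that the package is importable as `pcgr`, is to ask the import system where it resolves the package from. After an editable install the path should point into the cloned repo rather than into `site-packages`.

```python
import importlib.util


def module_origin(name):
    """Return the file path Python would import `name` from, or None."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


if __name__ == "__main__":
    # 'pcgr' as the import name is an assumption based on the repo name.
    print(module_origin("pcgr"))
```

Running this inside the container (or inside the activated conda env) shows whether the code being executed is the copy that was modified.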

sigven commented 1 year ago

@dattatraymongad:

As an added comment to Peter's suggestion:

Note that PCGR gathers its variant predictions from the dbNSFP resource, which is why PCGR does not use the --sift and --polyphen options when calling VEP. Predictions provided by dbNSFP include SIFT, but not PolyPhen, as the PCGR data bundle is populated solely with data tracks that are freely available to any party without licensing requirements (i.e. commercial or academic).

As general advice, if you want to configure VEP with options other than the ones used in PCGR (e.g. SIFT and PolyPhen), I think running VEP as a stand-alone tool would be a better option.
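A stand-alone VEP run with these predictions enabled could be assembled roughly as follows. This is a minimal sketch, not the command PCGR uses: it assumes `vep` is on the PATH with an offline cache already installed, and the file names and the `build_vep_command` helper are hypothetical. The flags themselves (`--sift`, `--polyphen`, `--biotype`, and so on) are standard VEP options, where `b` requests both the prediction term and the score.

```python
import shlex


def build_vep_command(input_vcf, output_vcf, assembly="GRCh37", cache_dir=None):
    """Build an argument list for a stand-alone VEP run with SIFT/PolyPhen."""
    cmd = [
        "vep",
        "--input_file", input_vcf,
        "--output_file", output_vcf,
        "--vcf",                # write annotations into a CSQ INFO field
        "--assembly", assembly,
        "--cache", "--offline",  # use a locally installed cache
        "--biotype",            # add transcript biotype
        "--sift", "b",          # prediction term and score
        "--polyphen", "b",      # prediction term and score
        "--force_overwrite",
    ]
    if cache_dir:
        cmd += ["--dir_cache", cache_dir]
    return cmd


if __name__ == "__main__":
    # Hypothetical file names; print the command instead of running it.
    print(shlex.join(build_vep_command("sample.vcf.gz", "sample.vep.vcf")))
```

The printed command can be pasted into a shell or passed to `subprocess.run` once the paths match the local setup. IMPACT is part of VEP's default consequence output, so it needs no extra flag.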

I also attach an example of the output file that I was referring to above, which contains many annotations from VEP and other sources. The columns should be documented on the PCGR documentation site.
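To see which columns of such a TSV relate to a given annotation, the header can be searched by keyword. The sketch below assumes the first line that does not start with `#` holds the tab-separated column names. The helper names are hypothetical, and the actual column names should be taken from the file and the PCGR documentation rather than from this example.

```python
import gzip
import sys


def read_header(tsv_path):
    """Return column names from the first non-comment line of a TSV file."""
    opener = gzip.open if str(tsv_path).endswith(".gz") else open
    with opener(tsv_path, "rt") as handle:
        for line in handle:
            if line.startswith("#"):  # assumption: '#' marks comment lines
                continue
            return line.rstrip("\n").split("\t")
    return []


def matching_columns(columns, keywords):
    """Return columns whose name contains any keyword (case-insensitive)."""
    lowered = [k.lower() for k in keywords]
    return [c for c in columns if any(k in c.lower() for k in lowered)]


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python find_columns.py <file.tsv.gz>
    cols = read_header(sys.argv[1])
    print(matching_columns(cols, ["biotype", "impact", "sift", "polyphen"]))
```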

best, Sigve

TCGA-55-8507-01A_DP_AF.pcgr_acmg.grch37.pass.tsv.gz