diskin-lab-chop / AutoGVP


modify loading of annotation files #136

Closed rjcorb closed 1 year ago

rjcorb commented 1 year ago

Purpose/implementation Section

What feature is being added or bug is being addressed?

Closes #131. This PR updates the code that loads and modifies the annotation files (variant summary, multianno, intervar, autopvs1) to reduce compute time and memory usage.

What was your approach?

What GitHub issue does your pull request address?

#130

Directions for reviewers. Tell potential reviewers what kind of feedback you are soliciting.

Which areas should receive a particularly close look?

Please review the updated code logic and run AutoGVP on the test_pbta and custom test files:

bash run_autogvp.sh --workflow="cavatica" \
--vcf=input/test_pbta.single.vqsr.filtered.vep_105.vcf \
--filter_criteria='INFO/AF>=0.2 INFO/DP>=15 (gnomad_3_1_1_AF_non_cancer<0.01|gnomad_3_1_1_AF_non_cancer=".")' \
--intervar=input/test_pbta.hg38_multianno.txt.intervar \
--multianno=input/test_pbta.hg38_multianno.txt \
--autopvs1=input/test_pbta.autopvs1.tsv \
--outdir=../results \
--out="test_pbta"

bash run_autogvp.sh --workflow="custom" \
--vcf=input/test_VEP.vcf \
--clinvar=input/clinvar.vcf.gz \
--intervar=input/test_VEP.hg38_multianno.txt.intervar \
--multianno=input/test_VEP.vcf.hg38_multianno.txt \
--autopvs1=input/test_autopvs1.txt \
--outdir=../results \
--out="test_custom"

Is there anything that you want to discuss further?

These changes assume that the multianno and intervar files are sorted in the same order before AutoGVP is run. Can we state this requirement explicitly in the README?
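One way to check the sorting assumption before a run is sketched below. This is not part of AutoGVP: the names `KEY_COLUMNS`, `variant_keys`, and `first_mismatch` are hypothetical. The sketch assumes that both files are tab-separated, start with a single header line, and identify each variant by their first five columns (Chr, Start, End, Ref, Alt). Consecutive rows with the same key are collapsed, so a variant listed more than once in a row in one file does not count as a mismatch.

```python
from itertools import groupby, zip_longest

# Assumption: the first five tab-separated columns (Chr, Start, End, Ref, Alt)
# identify a variant in both the multianno and the intervar file.
KEY_COLUMNS = 5


def variant_keys(path, n_key_cols=KEY_COLUMNS):
    """Yield the variant key of each data row, collapsing consecutive repeats."""
    with open(path) as handle:
        next(handle, None)  # skip the single header line (assumed)
        keys = (
            tuple(line.rstrip("\n").split("\t")[:n_key_cols])
            for line in handle
            if line.strip()  # ignore blank lines
        )
        for key, _ in groupby(keys):
            yield key


def first_mismatch(multianno_path, intervar_path):
    """Return (index, multianno_key, intervar_key) at the first difference.

    Returns None when both files list the same variants in the same order.
    A key of None means that file ran out of rows first.
    """
    pairs = zip_longest(variant_keys(multianno_path), variant_keys(intervar_path))
    for index, (left, right) in enumerate(pairs):
        if left != right:
            return index, left, right
    return None
```

For the test_pbta inputs above, `first_mismatch("input/test_pbta.hg38_multianno.txt", "input/test_pbta.hg38_multianno.txt.intervar")` would return `None` if the two files are aligned under these assumptions. Both files are streamed line by line, so the check itself does not add much to memory use.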

Documentation Checklist