ccmbioinfo / crg2

Research pipeline for exploring clinically relevant genomic variants
Apache License 2.0

Incorporate LINSIGHT into annotation pipeline #203

Open Madelinehazel opened 6 months ago

Madelinehazel commented 6 months ago

We would like to add LINSIGHT to the annotation pipeline for the hg38 version of crg2 so that we can predict the fitness consequences of non-coding mutations.

GRCh38 LINSIGHT scores can be downloaded from this link.

You will need to:

  1. Download the BED file of scores to /hpf/largeprojects/ccmbio/nhanafi/c4r/downloads/databases/.
  2. Examine the fields in the file. The score is the fourth column.
  3. Check out the hg38 branch of crg2: inside your local crg2 directory, run git checkout crg2-hg38
  4. Modify the vcfanno config so that vcfanno adds the LINSIGHT score (see the other score annotations, e.g. CADD, for an example of how to do this).
  5. Set up a test run of the pipeline using the NA12878 BAM specified in the default units.tsv:
    • set the target in dnaseq_slurm_hpf.sh to annotated/coding/vcfanno/NA12878.coding.vep.vcfanno.vcf
    • submit the pipeline job
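
For steps 2 and 4, here is a rough Python sketch of how one might check the score column and draft a vcfanno stanza. It is only an illustration under several assumptions: the helper names, the sample rows, and the file name LINSIGHT.hg38.bed.gz are made up, and the "mean" op and the "LINSIGHT" field name are guesses, so match whatever the existing CADD entry in the crg2 vcfanno config does. vcfanno also expects the annotation file to be bgzip-compressed and tabix-indexed.

```python
import io


def read_bed_scores(handle, score_col=4):
    """Yield (chrom, start, end, score) from a BED stream; score_col is 1-based."""
    for line in handle:
        # Skip blank lines and optional BED header lines.
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        fields = line.rstrip("\n").split("\t")
        yield fields[0], int(fields[1]), int(fields[2]), float(fields[score_col - 1])


def vcfanno_bed_stanza(path, column=4, name="LINSIGHT", op="mean"):
    """Render a vcfanno [[annotation]] stanza for a BED file.

    The name and op defaults are assumptions, not the crg2 convention.
    """
    return (
        "[[annotation]]\n"
        f'file="{path}"\n'
        f"columns=[{column}]\n"
        f'names=["{name}"]\n'
        f'ops=["{op}"]\n'
    )


# Made-up rows for illustration only; replace with the downloaded file.
sample = io.StringIO("chr1\t10000\t10010\t0.05\nchr1\t10010\t10025\t0.25\n")
rows = list(read_bed_scores(sample))
print(rows)

# Hypothetical file name; use the real path under the databases directory.
stanza = vcfanno_bed_stanza("LINSIGHT.hg38.bed.gz")
print(stanza)
```

To run this against the real download, open the file with gzip.open(path, "rt") if it is compressed and pass the handle to read_bed_scores in place of the sample rows.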

suzanahmad commented 5 months ago

Pull Request https://github.com/ccmbioinfo/crg2/pull/209