FINNGEN / autoreporting

Go through annotations and fix them #184

Closed Lipastomies closed 1 year ago

Lipastomies commented 3 years ago

In #183 Mitja said

Agh, that's bad... We should be getting our functional annotation from each FinnGen release annotation file and not from gnomAD, as we will be missing variants that can't be lifted over to build 38 (and maybe also those in that file where the strand was changed).

Everything else except enrichment in autoreporting should be based on FinnGen annotations.

and these should be coding variants:

transcript_ablation, splice_donor_variant, stop_gained, splice_acceptor_variant, frameshift_variant, stop_lost, start_lost, inframe_insertion, inframe_deletion, missense_variant, protein_altering_variant

Here, I'll list where the different annotation columns come from, and then let's see which should be replaced with what.

Lipastomies commented 3 years ago

Previous release annotations

| Columns in report | Columns in top report |
| --- | --- |
| beta_previous_release | lead_beta_previous_release |
| pval_previous_release | lead_pval_previous_release |

gnomad functional variants (fin_enriched_genomes_select_columns.gz)

| Columns in report | Columns in top report |
| --- | --- |
| functional_category | gnomAD_functional_category, functional_variants_strict, functional_variants_relaxed |
| enrichment_nfsee | gnomAD_enrichment_nfsee |
| fin.AF | gnomAD_fin.AF |
| fin.AC | gnomAD_fin.AC |
| fin.AN | gnomAD_fin.AN |
| fin.homozygote_count | gnomAD_fin.homozygote_count |
| fet_nfsee.odds_ratio | gnomAD_fet_nfsee.odds_ratio |
| fet_nfsee.p_value | gnomAD_fet_nfsee.p_value |
| nfsee.AC | gnomAD_nfsee.AC |
| nfsee.AN | gnomAD_nfsee.AN |
| nfsee.AF | gnomAD_nfsee.AF |
| nfsee.homozygote_count | gnomAD_nfsee.homozygote_count |

Finngen annotation file

Columns marked with * also bring in their batchwise counterparts. Those could be dropped: they bloat the reports considerably and are easy to look up in the annotation file if someone really needs them. They are already dropped from the PheWeb-imported data.

| Columns in report | Columns in top report |
| --- | --- |
| most_severe_consequence | lead_most_severe_consequence |
| most_severe_gene | lead_most_severe_gene |
| INFO* | - |
| IMP* | - |
| AF* | - |
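
As a rough illustration of dropping the batchwise columns, here is a minimal Python sketch. It assumes the batchwise columns are named with the base name plus an underscore and a batch label (for example `INFO_batch1`); that naming, the `#variant` column and the helper name `keep_non_batch_columns` are assumptions made for illustration and are not taken from the annotation file or from autoreporting.

```python
# Assumed prefixes of the batchwise columns (the INFO*, IMP* and AF* rows above).
# The "<base>_<batch>" naming scheme is an assumption, not confirmed by the issue.
BATCH_PREFIXES = ("INFO_", "IMP_", "AF_")


def keep_non_batch_columns(header):
    """Return the column names that do not look like batchwise columns."""
    # str.startswith accepts a tuple and matches any of the prefixes.
    return [col for col in header if not col.startswith(BATCH_PREFIXES)]


# Hypothetical header, for illustration only.
header = [
    "#variant", "most_severe_consequence", "most_severe_gene",
    "INFO", "INFO_batch1", "IMP", "IMP_batch1", "AF", "AF_batch1",
]
kept = keep_non_batch_columns(header)
print(kept)
```

The overall `INFO`, `IMP` and `AF` columns survive because they carry no batch suffix; only the per-batch copies are filtered out.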

gnomAD genome data annotation

| Columns in report | Columns in top report |
| --- | --- |
| GENOME_AF_fin | - |
| GENOME_AF_nfe | - |
| GENOME_AF_nfe_est | - |
| GENOME_AF_nfe_nwe | - |
| GENOME_AF_nfe_onf | - |
| GENOME_AF_nfe_seu | - |
| GENOME_FI_enrichment_nfe | - |
| GENOME_FI_enrichment_nfe_est | lead_enrichment |

gnomAD exome data annotation

| Columns in report | Columns in top report |
| --- | --- |
| AF_nfe_bgr | - |
| AF_fin | - |
| AF_nfe | - |
| AF_nfe_est | - |
| AF_nfe_swe | - |
| AF_nfe_nwe | - |
| AF_nfe_onf | - |
| AF_nfe_seu | - |
| FI_enrichment_nfe | - |
| FI_enrichment_nfe_est | - |
| FI_enrichment_nfe_swe | - |
| FI_enrichment_nfe_est_swe | - |

Fedja commented 3 years ago

Everything else here is fine for now, except that functional_variants_strict and functional_variants_relaxed need to use the list of coding consequences in this issue and be derived from the FinnGen most_severe_consequence and most_severe_gene columns.
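
One way this could look, as a minimal Python sketch: filter the variants of a group by the consequence list from this issue and summarise the matches. The row layout (dicts with a `variant` key), the `variant|consequence|gene` output format and the function name `functional_variants` are assumptions for illustration, not autoreporting's actual implementation, and the sketch does not model how the strict and relaxed groups differ.

```python
# Consequence terms listed earlier in this issue as coding variants.
CODING_CONSEQUENCES = frozenset({
    "transcript_ablation", "splice_donor_variant", "stop_gained",
    "splice_acceptor_variant", "frameshift_variant", "stop_lost",
    "start_lost", "inframe_insertion", "inframe_deletion",
    "missense_variant", "protein_altering_variant",
})


def functional_variants(rows):
    """Return 'variant|consequence|gene' entries for the coding variants in rows.

    rows is an iterable of dicts. The "variant" key is an assumption;
    most_severe_consequence and most_severe_gene follow the FinnGen
    annotation column names mentioned in this issue.
    """
    return [
        f'{r["variant"]}|{r["most_severe_consequence"]}|{r["most_severe_gene"]}'
        for r in rows
        if r["most_severe_consequence"] in CODING_CONSEQUENCES
    ]


# Hypothetical variants in one group, for illustration only.
group = [
    {"variant": "chr1_1000_A_G", "most_severe_consequence": "missense_variant",
     "most_severe_gene": "GENE1"},
    {"variant": "chr1_2000_C_T", "most_severe_consequence": "intron_variant",
     "most_severe_gene": "GENE1"},
    {"variant": "chr1_3000_G_GA", "most_severe_consequence": "frameshift_variant",
     "most_severe_gene": "GENE2"},
]
summary = ";".join(functional_variants(group))
print(summary)
```

Calling the same function once per group (strict and relaxed) would fill both columns from FinnGen annotations alone, without touching the gnomAD functional category.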

Lipastomies commented 3 years ago

I was thinking about the following:

Lipastomies commented 1 year ago

Done at some point