hakyimlab / summary-gwas-imputation

harmonization, liftover, and imputation of summary statistics from GWAS
MIT License

liftover from hg19 -> hg38 #18

Closed soren-rand closed 1 year ago

soren-rand commented 2 years ago

Hi guys

Big fan of your work; it has made it super easy to liftover coordinates (38 -> 19). However, I am struggling to do it the other way around. I of course changed the chain file in the script, so it looks like the following:

module load tools anaconda3/4.4.0

python3 /path/1_liftover/src/gwas_parsing.py \
-gwas_file /path/liftover/liftover_b37.txt \
-liftover /path/hg19ToHg38.over.chain.gz \
-output_column_map POS position \
-output_column_map A0 non_effect_allele \
-output_column_map A1 effect_allele \
-output_column_map EAF frequency \
-output_column_map BETA effect_size \
-output_column_map PVAL pvalue \
-output_column_map CHR chromosome \
-output_column_map SNP variant_id \
-output_column_map SE standard_error \
-output_column_map N sample_size \
-output_order variant_id chromosome position effect_allele non_effect_allele frequency effect_size standard_error pvalue sample_size ID \
-output /path/liftover/liftover_b38.txt

The script is able to run and finishes in around 2 minutes. Logs look like this:

INFO - Parsing input GWAS
INFO - loaded 10101487 variants
INFO - Performing liftover
INFO - 10101487 variants after liftover
INFO - Saving...
INFO - Finished converting GWAS in 116.5172207057476 seconds

But when I read the output, there are no coordinates for chr or pos. Just some good ol' NAs. Does the software support conversion from hg19 to hg38? If yes, can you figure out where I am the fool?

Cheers :)

Soren

hakyim commented 2 years ago

Forwarding to the mailing list for others to comment.

soren-rand commented 2 years ago

Thanks Hae!

Nobuyuki-Enzan commented 2 years ago

Dear Ms. Im,

I have the same problem. I would appreciate it if you could tell me how to solve it. Alternatively, could I see the mailing list?

Best regards,

Nobuyuki

Fnyasimi commented 2 years ago

Hi @Nobuyuki-Enzan, can you share a sample dataset of your GWAS?

Nobuyuki-Enzan commented 2 years ago

Hi Fnyasimi, Thank you for your quick reply. Attached is the sample GWAS file. gwas_sample.txt

Here is my code.

python3 $REPO/gwas_parsing.py \
 -gwas_file ./gwas_file.txt.gz \
 -output_column_map variant_id variant_id \
 -output_column_map non_effect_allele non_effect_allele \
 -output_column_map effect_allele effect_allele \
 -output_column_map effect_size effect_size \
 -output_column_map pvalue pvalue \
 -output_column_map standard_error standard_error \
 -output_column_map chromosome chromosome \
 -output_column_map position position \
 -output_column_map frequency frequency \
 -output_column_map sample_size sample_size \
 -output_column_map z_score z_score \
 -output_order variant_id panel_variant_id chromosome position effect_allele non_effect_allele frequency pvalue zscore effect_size standard_error sample_size \
 -liftover  ./hg19ToHg38.over.chain.gz \
 -output ../out/gwas_parsed.txt

The script runs and finishes without errors, but the chromosome, position, and panel_variant_id columns contain only "NA".

Best,

Nobuyuki

Fnyasimi commented 2 years ago

@Nobuyuki-Enzan I have done some quick troubleshooting. The liftover fails because the values in your chromosome column do not have the chr prefix. Update your chromosome column from 1 to chr1 (and likewise for the other chromosomes). This will resolve the liftover.
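For anyone hitting the same thing, here is a minimal sketch of that fix as a preprocessing step, run before gwas_parsing.py. It is not part of this repository. The helper names (`add_chr_prefix`, `prefix_chromosome_column`), the tab-separated layout, and the treatment of empty strings and "NA" as missing values are all assumptions made for illustration. It only prepends chr; it does not translate numeric codes such as 23 to X.

```python
import csv
import io

# Assumption: these strings mark a missing chromosome value and are left untouched.
MISSING = {"", "NA"}


def add_chr_prefix(chromosome):
    """Return the label with a 'chr' prefix, e.g. '1' -> 'chr1'.

    Values that already start with 'chr' and missing values are returned unchanged.
    """
    value = chromosome.strip()
    if value in MISSING or value.startswith("chr"):
        return value
    return "chr" + value


def prefix_chromosome_column(src, dst, column="chromosome"):
    """Copy a tab-separated table from src to dst, prefixing one column.

    src and dst are open text file objects; `column` is the header name of the
    chromosome column (hypothetical default, adjust to your file).
    """
    reader = csv.DictReader(src, delimiter="\t")
    writer = csv.DictWriter(
        dst, fieldnames=reader.fieldnames, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        row[column] = add_chr_prefix(row[column])
        writer.writerow(row)


if __name__ == "__main__":
    # Tiny in-memory demo; a real run would open the GWAS file instead.
    demo = io.StringIO(
        "variant_id\tchromosome\tposition\nrs1\t1\t1000\nrs2\tchrX\t2000\n"
    )
    out = io.StringIO()
    prefix_chromosome_column(demo, out)
    print(out.getvalue(), end="")
```

For a gzipped input, the same function should work with `gzip.open(path, "rt")` as `src`. The rewritten file can then be passed to -gwas_file as before.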

You won't be able to get the panel_variant_id because you didn't provide --snp_reference_metadata; more info here. Maybe in the future we can bypass this and generate the panel_variant_id from the GWAS data itself.

Nobuyuki-Enzan commented 2 years ago

Hi Fnyasimi, Thank you for your detailed explanation!