hakyimlab / MetaXcan

MetaXcan software and manuscript

0 % of model's snps used #176

Open CarlosAmadeo7 opened 1 year ago

CarlosAmadeo7 commented 1 year ago

Hello MetaXcan team, I am trying to run SPrediXcan.py with the mashr Brain_Cortex model db file, the covariance file (txt.gz), and GWAS summary data (all chromosomes compiled into a single text file). However, it always reports "0 % of model's snps used". I have tried many things but have not been able to fix it. I have checked the format of the GWAS summary file, and it is almost exactly the same as the provided example; the only difference is that my SNP column uses a different format, like "chr1:598941:G:A". I am not sure whether this is the source of the problem. Could you help me figure out how to fix it? Thank you very much.

I attached the command I used:

./SPrediXcan.py \
  --model_db_path eqtl/mashr/mashr_Brain_Cortex.db \
  --covariance eqtl/mashr/mashr_Brain_Cortex.txt.gz \
  --gwas_folder GWAS \
  --gwas_file_pattern ".*txt" \
  --snp_column SNP \
  --effect_allele_column Allele1 \
  --non_effect_allele_column Allele2 \
  --beta_column BETA \
  --pvalue_column PVAL \
  --output_file results/test.csv

This is the output:

INFO - Processing GWAS command line parameters
INFO - Building beta for manhattan_plot_input_sorted_model1_tarcchable.txt and eqtl/mashr/mashr_Brain_Cortex.db
INFO - Reading input gwas with special handling: GWAS/manhattan_plot_input_sorted_model1_tarcchable.txt
INFO - Processing input gwas
INFO - Aligning GWAS to models
INFO - Trimming output
INFO - Successfully parsed input gwas in 25.22468380700002 seconds
INFO - Started metaxcan process
INFO - Loading model from: eqtl/mashr/mashr_Brain_Cortex.db
INFO - Loading covariance data from: eqtl/mashr/mashr_Brain_Cortex.txt.gz
INFO - Processing loaded gwas
INFO - Started metaxcan association
INFO - 0 % of model's snps used
INFO - Sucessfully processed metaxcan association in 2.3573985470000025 seconds
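
On the ID-format question raised above, here is a minimal sketch of rewriting colon-separated IDs. It assumes the mashr models key variants by GTEx v8-style varID strings such as chr1_598941_G_A_b38, which is worth verifying against the weights table of the db in use. The function name `to_gtex_varid` and the `b38` default are illustrative assumptions, and the sketch does not check that the ref/alt order in the GWAS ID matches the model.

```python
def to_gtex_varid(snp_id: str, build: str = "b38") -> str:
    """Rewrite 'chr1:598941:G:A' as 'chr1_598941_G_A_b38'.

    Assumption: the input is chrom:pos:ref:alt, in the same allele
    order and genome build that the model db uses.
    """
    parts = snp_id.strip().split(":")
    if len(parts) != 4:
        raise ValueError(f"expected chrom:pos:ref:alt, got {snp_id!r}")
    chrom, pos, ref, alt = parts
    # Add the 'chr' prefix if the GWAS file omits it.
    if not chrom.startswith("chr"):
        chrom = "chr" + chrom
    return "_".join([chrom, pos, ref.upper(), alt.upper(), build])


if __name__ == "__main__":
    print(to_gtex_varid("chr1:598941:G:A"))
```

Converted IDs only help if SPrediXcan is also told to keep non-rsID variants, which is the point made at the end of this thread.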

wei110110 commented 1 year ago

I ran into the same problem. Have you resolved it?

CarlosAmadeo7 commented 1 year ago

Not yet

Fnyasimi commented 1 year ago

The SNP IDs in the GWAS summary stats should match the varID values in the db.
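
One way to test this directly is to count how many model variant IDs also appear in the GWAS file. The sketch below assumes the db follows the usual PredictDB layout, with a `weights` table that has `rsid` and `varID` columns; the function name `model_overlap` is hypothetical, and the column names should be checked against the actual db before relying on the result.

```python
import sqlite3


def model_overlap(db_path, gwas_ids, id_column="varID"):
    """Return (shared, model_total, fraction) for the model's variant IDs.

    Assumption: the db has a 'weights' table with 'rsid' and 'varID' columns.
    """
    if id_column not in ("varID", "rsid"):
        raise ValueError("id_column must be 'varID' or 'rsid'")
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            f"SELECT DISTINCT {id_column} FROM weights"
        ).fetchall()
    finally:
        conn.close()
    model_ids = {row[0] for row in rows}
    shared = model_ids & set(gwas_ids)
    fraction = len(shared) / len(model_ids) if model_ids else 0.0
    return len(shared), len(model_ids), fraction
```

If the fraction is well above zero and SPrediXcan still reports 0 %, the mismatch is probably in how the tool filters or keys the IDs rather than in the IDs themselves.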

DonaldSandoz2000 commented 1 year ago

I've converted the IDs and it's still 0%. Does this mean that these SNPs are not suitable for TWAS? I'm still getting results that seem reasonable, though.

Fnyasimi commented 1 year ago

Can you confirm that the genome build of your summary stats is the same as the one in the db?

NancyZhong1126 commented 1 year ago

I also ran into the same problem. I converted the format and checked the genome build, but it still reports "0% of model's snps used".

pablobio commented 10 months ago

Hi all. I ran into the same problem. I have already tried using both the rsID and the varID in the GWAS data, but both result in the same message: "0 % of model's snps used". I double-checked the assembly used to map the markers in the RNA-seq and the GWAS data, and both are based on the same assembly. Additionally, during the analysis, I get the following warning messages:

INFO - Started metaxcan association
WARNING - Issues processing gene ABTB2, skipped
WARNING - Issues processing gene ALDH1L2, skipped
WARNING - Issues processing gene ANKS1A, skipped
WARNING - Issues processing gene ATXN7, skipped
WARNING - Issues processing gene BOLA1, skipped
WARNING - Issues processing gene C19H3orf18, skipped

Has anyone found a solution for this issue?

Thank you very much.

Fnyasimi commented 10 months ago

Could you show what your input files look like and which command you ran?

AndreaG5 commented 4 months ago

Hi there, same error. The genome build is the same, the column specification is correct, and the variant IDs are in the same format (I double-checked with a merge() in R: the SNPs in my GWAS are in the mashr file, and the two data frames merge correctly), but I still get 0% of SNPs used. I tried all tissues and even a more significant GWAS (with many more SNPs identified) but still get the same output. Has anyone been able to solve it?

hakyim commented 4 months ago

Can you share an example of the summary stats and the command you are using?


Fnyasimi commented 4 months ago

You need to add the --keep_non_rsid argument when using varID, so that SNP IDs that are not rsIDs (as in this case) are kept and used.
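
As an illustration of this suggestion, the sketch below rebuilds the command from the original post and appends `--keep_non_rsid`. The paths and column names are copied from that post; the helper name `spredixcan_command` is hypothetical, and the snippet only assembles and prints the argument list without running anything.

```python
import shlex


def spredixcan_command(keep_non_rsid=True):
    """Assemble the SPrediXcan call from the first post as an argv list."""
    args = [
        "./SPrediXcan.py",
        "--model_db_path", "eqtl/mashr/mashr_Brain_Cortex.db",
        "--covariance", "eqtl/mashr/mashr_Brain_Cortex.txt.gz",
        "--gwas_folder", "GWAS",
        "--gwas_file_pattern", ".*txt",
        "--snp_column", "SNP",
        "--effect_allele_column", "Allele1",
        "--non_effect_allele_column", "Allele2",
        "--beta_column", "BETA",
        "--pvalue_column", "PVAL",
        "--output_file", "results/test.csv",
    ]
    if keep_non_rsid:
        # Flag suggested in this thread for GWAS files keyed by varID.
        args.append("--keep_non_rsid")
    return args


if __name__ == "__main__":
    # Print a shell-quoted command line; pass the list to subprocess.run to execute.
    print(shlex.join(spredixcan_command()))
```

The MetaXcan GTEx v8 tutorial, as far as I recall, also passes `--model_db_snp_key varID` alongside this flag. Whether that is needed here is not confirmed in this thread, so it is left out of the sketch.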