ramiromagno / gwasrapidd

gwasrapidd: an R package to query, download and wrangle GWAS Catalog data
https://rmagno.eu/gwasrapidd/

Response code was 500. #36

Closed james-cranley closed 1 year ago

james-cranley commented 1 year ago

Hi, thank you for the useful package. When I run get_associations() for some SNPs (submitting one SNP at a time, but many in a row, in a for loop), I get a 500 error. It seems to happen when the request is too large. Do you have any suggestions? Is it possible to return an error file giving me the variant_id that produced the 500 response? Thank you

library('gwasrapidd')
library('tidyverse')
library('glue')

# Manually selected list of EFO terms
efo_traits=list('aortic valve disease', 'heart valve disease','coronary artery disease')

# Import reference, in order to extract EFO_ids
efo_codes<-read_csv('notebooks/suspension/scanpy_clustering/EBI_codes.csv')

# Make a table of SNPs for each trait
list_of_tibbles<-list()
for (i in seq_along(efo_traits)) {
  efo_code<-efo_codes$EFO_ids[efo_codes$`Disease trait`==as.character(efo_traits[i])]
  variants<-get_variants(efo_trait=as.character(efo_traits[i])) #gets variants for the trait
  variants_table<-variants@variants
  variants_table$efo_term<-as.character(efo_traits[i])
  for (j in seq_along(variants_table$variant_id)) { # for each variant, get the association info (pval, beta etc.) and add it to variants_table
    associations<-get_associations(variant_id=variants_table$variant_id[j])
    associations_table<-associations@associations
    variants_table$pvalue[j]<-associations_table$pvalue[1]
    variants_table$pvalue_description[j]<-associations_table$pvalue_description[1]
    variants_table$beta_number[j]<-associations_table$beta_number[1]
    variants_table$beta_unit[j]<-associations_table$beta_unit[1]
    variants_table$beta_direction[j]<-associations_table$beta_direction[1]
    if (length(associations@associations$association_id) > 1) { # marks if a variant has more than one association
      variants_table$multiple_associations[j]<-"yes"
      studies<-get_studies(variant_id=variants_table$variant_id[j])
      variants_table$publications_number[j]<-length(studies@publications$study_id)
    } else {
      variants_table$multiple_associations[j]<-"no"
      variants_table$publications_number[j]<-1
    }
  }
  # store and write the table once per trait, after all its variants have been processed
  list_of_tibbles[[i]]<-variants_table
  write_csv(list_of_tibbles[[i]],glue('/nfs/team205/heart/EBI_SNP_enrichment/traits/{unique(efo_codes$EFO_ids[efo_codes$`EFO term`==as.character(efo_traits[i])])}_{as.character(efo_traits[i])}_EBI_GWAS_SNPs_with_positions.csv'))
}

ramiromagno commented 1 year ago

Hi James,

Thanks for reporting this issue.

If you set warnings = TRUE and verbose = TRUE in your get_associations() calls, you should be able to spot the problematic SNP.

I can look more closely into your code but I will need a reproducible example. Could you share the contents of the EBI_codes.csv file?
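
For the request to get a file listing the variant_id values that failed, one option is to wrap each call and record the warnings or errors it raises. The following is only a minimal sketch: log_problem_ids() is a hypothetical helper, not part of gwasrapidd, and it assumes a failed request surfaces as an R warning or error when warnings = TRUE. The output file name in the usage comment is also made up.

```r
# Hypothetical helper (not part of gwasrapidd): call `fetch` on each ID and
# record any warnings or errors next to the ID that triggered them.
log_problem_ids <- function(ids, fetch) {
  rows <- lapply(ids, function(id) {
    msgs <- character(0)
    withCallingHandlers(
      tryCatch(
        fetch(id),
        error = function(e) {
          # An error stops this call only; the loop moves on to the next ID
          msgs <<- c(msgs, paste("error:", conditionMessage(e)))
          NULL
        }
      ),
      warning = function(w) {
        # Keep the warning text, then silence it and let the call continue
        msgs <<- c(msgs, paste("warning:", conditionMessage(w)))
        invokeRestart("muffleWarning")
      }
    )
    data.frame(
      variant_id = id,
      ok = length(msgs) == 0,
      message = paste(msgs, collapse = " | "),
      stringsAsFactors = FALSE
    )
  })
  do.call(rbind, rows)
}

# Assumed usage with gwasrapidd (needs network access, so left commented out):
# problems <- log_problem_ids(
#   variants_table$variant_id,
#   function(id) gwasrapidd::get_associations(variant_id = id, warnings = TRUE)
# )
# write.csv(problems[!problems$ok, ], "failed_variants.csv", row.names = FALSE)
```

Passing the query as a function keeps the logging separate from the request itself, so the same wrapper could be reused for get_studies() or get_variants().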

ramiromagno commented 1 year ago

Not sure if this is what you're trying to achieve, but here it goes:

library('gwasrapidd')
library('tidyverse')

efo_traits <- c('aortic valve disease', 'heart valve disease','coronary artery disease')
efos <- get_traits(efo_trait = efo_traits)

my_assoc <- get_associations(efo_id = efos@traits$efo_id)
my_assoc@risk_alleles
# A tibble: 2,788 × 7
   association_id locus_id variant_id  risk_allele risk_frequency genome_wide limited_list
   <chr>             <int> <chr>       <chr>                <dbl> <lgl>       <lgl>       
 1 93480070              1 rs10024267  C                       NA FALSE       FALSE       
 2 93480075              1 rs200854727 C                       NA FALSE       FALSE       
 3 6074                  1 rs1333049   C                       NA NA          NA          
 4 6076                  1 rs688034    T                       NA NA          NA          
 5 6077                  1 rs8055236   G                       NA NA          NA          
 6 6075                  1 rs17672135  T                       NA NA          NA          
 7 23207                 1 rs9349379   G                       NA NA          NA          
 8 23208                 1 rs12524865  C                       NA NA          NA          
 9 22536                 1 rs2123536   T                       NA NA          NA          
10 22537                 1 rs1842896   T                       NA NA          NA          
# … with 2,778 more rows
# ℹ Use `print(n = ...)` to see more rows
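
To get the p-value and beta columns from the original loop next to each variant_id, the associations and risk_alleles tables of the result can be joined on association_id. This is a rough sketch in base R: variant_stats() is a hypothetical helper, and it assumes the associations table carries the columns used in the original script (pvalue, pvalue_description, beta_number, beta_unit, beta_direction). Unlike the per-variant loop, it only sees associations returned for the queried EFO traits.

```r
# Hypothetical helper: attach association statistics to variant IDs by
# joining the two tables on association_id.
variant_stats <- function(associations, risk_alleles) {
  stats_cols <- c("association_id", "pvalue", "pvalue_description",
                  "beta_number", "beta_unit", "beta_direction")
  merged <- merge(
    risk_alleles[, c("association_id", "variant_id")],
    associations[, stats_cols],
    by = "association_id"
  )
  # Number of rows (associations) each variant has in this result
  merged$n_associations <- ave(
    seq_along(merged$variant_id), merged$variant_id, FUN = length
  )
  merged
}

# Assumed usage with the object from the example above:
# stats <- variant_stats(my_assoc@associations, my_assoc@risk_alleles)
```

Because everything comes from a single get_associations() call, there is no need for one request per SNP.
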
james-cranley commented 1 year ago

Hi Ramiro, thanks for the speedy reply. With warnings on, it told me which SNPs were causing the problem; it was only a small number of them. Thank you, the package is very helpful.