Closed mightyphil2000 closed 3 years ago
Hi @mightyphil2000,
Thank you for your question.
Indeed, the REST API service does not provide an endpoint that can directly retrieve associations by reported trait. The only endpoints that allow searching directly by the authors' reported trait are the studies-related endpoints, so we first need to get the studies associated with your reported trait of interest, and then search for associations by those studies (i.e., by their study IDs).
So here is an example of how to do it for the trait 'Blood metabolite levels':
library(gwasrapidd)
reported_trait_of_interest <- 'Blood metabolite levels'
studies_of_interest <- get_studies(reported_trait = reported_trait_of_interest)
assoc_of_interest <- get_associations(study_id = studies_of_interest@studies$study_id)
assoc_of_interest
#> An object of class "associations"
#> Slot "associations":
#> # A tibble: 279 x 17
#> association_id pvalue pvalue_description pvalue_mantissa pvalue_exponent
#> <chr> <dbl> <chr> <int> <int>
#> 1 42551 1 e- 19 (X-12244--N-acetylcar… 1 -19
#> 2 42552 7 e- 87 (X-08402) 7 -87
#> 3 42554 3 e- 13 (X-13671) 3 -13
#> 4 42555 1 e- 11 (asparagine) 1 -11
#> 5 42556 2 e- 35 (isovalerylcarnitine) 2 -35
#> 6 42531 6.e-315 (X-11529) 6 -315
#> 7 42532 1 e- 89 (X-11538) 1 -89
#> 8 42558 1 e- 18 (1-palmitoylglyceroph… 1 -18
#> 9 42559 8 e- 12 (1-stearoylglyceropho… 8 -12
#> 10 42560 2 e- 88 (succinylcarnitine) 2 -88
#> # … with 269 more rows, and 12 more variables: multiple_snp_haplotype <lgl>,
#> # snp_interaction <lgl>, snp_type <chr>, standard_error <dbl>, range <chr>,
#> # or_per_copy_number <dbl>, beta_number <dbl>, beta_unit <chr>,
#> # beta_direction <chr>, beta_description <chr>, last_mapping_date <dttm>,
#> # last_update_date <dttm>
#>
#> Slot "loci":
#> # A tibble: 279 x 4
#> association_id locus_id haplotype_snp_count description
#> <chr> <int> <int> <chr>
#> 1 42551 1 NA Single variant
#> 2 42552 1 NA Single variant
#> 3 42554 1 NA Single variant
#> 4 42555 1 NA Single variant
#> 5 42556 1 NA Single variant
#> 6 42531 1 NA Single variant
#> 7 42532 1 NA Single variant
#> 8 42558 1 NA Single variant
#> 9 42559 1 NA Single variant
#> 10 42560 1 NA Single variant
#> # … with 269 more rows
#>
#> Slot "risk_alleles":
#> # A tibble: 279 x 7
#> association_id locus_id variant_id risk_allele risk_frequency genome_wide
#> <chr> <int> <chr> <chr> <dbl> <lgl>
#> 1 42551 1 rs9302065 A NA NA
#> 2 42552 1 rs7157785 T NA NA
#> 3 42554 1 rs2041073 T NA NA
#> 4 42555 1 rs4144027 T NA NA
#> 5 42556 1 rs9635324 A NA NA
#> 6 42531 1 rs4149056 T NA NA
#> 7 42532 1 rs1871395 A NA NA
#> 8 42558 1 rs2070895 A NA NA
#> 9 42559 1 rs588136 T NA NA
#> 10 42560 1 rs1472631 A NA NA
#> # … with 269 more rows, and 1 more variable: limited_list <lgl>
#>
#> Slot "genes":
#> # A tibble: 425 x 3
#> association_id locus_id gene_name
#> <chr> <int> <chr>
#> 1 42551 1 ABCC4
#> 2 42552 1 SGPP1
#> 3 42554 1 HEATR4
#> 4 42555 1 ASPG
#> 5 42556 1 IVD
#> 6 42531 1 SLCO1B1
#> 7 42532 1 SLCO1B1
#> 8 42558 1 LIPC
#> 9 42559 1 LIPC
#> 10 42560 1 LACTB
#> # … with 415 more rows
#>
#> Slot "ensembl_ids":
#> # A tibble: 446 x 4
#> association_id locus_id gene_name ensembl_id
#> <chr> <int> <chr> <chr>
#> 1 42551 1 ABCC4 ENSG00000125257
#> 2 42552 1 SGPP1 ENSG00000126821
#> 3 42552 1 SGPP1 ENSG00000285281
#> 4 42554 1 HEATR4 ENSG00000187105
#> 5 42555 1 ASPG ENSG00000166183
#> 6 42556 1 IVD ENSG00000128928
#> 7 42531 1 SLCO1B1 ENSG00000134538
#> 8 42532 1 SLCO1B1 ENSG00000134538
#> 9 42558 1 LIPC ENSG00000166035
#> 10 42559 1 LIPC ENSG00000166035
#> # … with 436 more rows
#>
#> Slot "entrez_ids":
#> # A tibble: 425 x 4
#> association_id locus_id gene_name entrez_id
#> <chr> <int> <chr> <chr>
#> 1 42551 1 ABCC4 10257
#> 2 42552 1 SGPP1 81537
#> 3 42554 1 HEATR4 399671
#> 4 42555 1 ASPG 374569
#> 5 42556 1 IVD 3712
#> 6 42531 1 SLCO1B1 10599
#> 7 42532 1 SLCO1B1 10599
#> 8 42558 1 LIPC 3990
#> 9 42559 1 LIPC 3990
#> 10 42560 1 LACTB 114294
#> # … with 415 more rows
Let me know if this solves your problem, or if you need further help.
Thanks Ramiro,
I think that solution works if the reported trait is the same across all associations within a study. A problem arises if the reported trait is not consistent within a study. For example, say I'm interested in trait X. I identify Study X by searching on trait X. But imagine Study X also investigated trait Y. Then get_associations on the study ID will retrieve associations for both trait X and trait Y, but I am not interested in trait Y. I only want associations for trait X. What do you think?
I just checked and actually I think your solution does work because it seems like reported trait is invariant within study ID. I made the mistake of thinking of study as study publication but actually we are referring to the GWAS catalog study IDs (starting with "GC..."), which vary within study publication. Can you confirm that reported trait is invariant within GWAS catalog study ID?
Indeed, each GWAS Catalog study should be associated with only one reported trait. So, if the same publication investigated several traits, then you will likely have several study IDs in the catalog originating from that same publication.
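If you want to double-check this on your own results, here is a minimal sketch. It assumes a studies table with the columns study_id and reported_trait, as found in studies_of_interest@studies. Note that count_reported_traits() is a hypothetical helper written for this example, not a gwasrapidd function, and the toy study IDs and traits are made up so that the example runs without querying the REST API.

```r
# Hypothetical helper (not part of gwasrapidd): count the number of distinct
# reported traits for each study ID in a studies table.
count_reported_traits <- function(studies_tbl) {
  tapply(
    studies_tbl$reported_trait,
    studies_tbl$study_id,
    function(x) length(unique(x))
  )
}

# Toy data standing in for studies_of_interest@studies (IDs are placeholders).
toy_studies <- data.frame(
  study_id = c("GCST_X1", "GCST_X2"),
  reported_trait = c("Trait X", "Trait Y"),
  stringsAsFactors = FALSE
)

counts <- count_reported_traits(toy_studies)

# TRUE when every study ID maps to exactly one reported trait.
all(counts == 1)
```

With real results you would call count_reported_traits(studies_of_interest@studies) instead of using the toy table.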
I don't know what you are using these searches for, but make sure that you really do prefer the reported_trait over the EFO trait. As you might know, the reported trait is a trait description in the original authors' own terms, whereas EFO traits come from a controlled vocabulary defined by the Experimental Factor Ontology and are assigned by the GWAS Catalog team.
Hello Philip:
May I close this issue?
Yes thanks!
Thanks!
Is it possible to search for associations using the reported trait? I checked and it does not seem possible. The get_associations() function only allows one to search on efo_trait and efo_id but not reported_trait.