Closed jiyue1214 closed 2 years ago
Hi @jiyue1214,
Not quite addressing your question directly, but it might be useful for you to see how I am doing it from R with the https://github.com/ramiromagno/gwasrapidd package.
Essentially, I get the child terms for "EFO_0004327" with get_child_efo(), which uses the https://www.ebi.ac.uk/ols/api/ontologies/efo API, and then I search for GWA studies with get_studies().
You will get some warnings about missing terms. This just means that those terms have never shown up in the GWAS Catalog, even though they are child terms of "electrocardiography".
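library(gwasrapidd)
# Parent EFO term ("electrocardiography")
efo_trait_of_interest <- 'EFO_0004327'
# Look up the child terms of the parent term (queried from OLS)
efo_children <- get_child_efo(efo_id = efo_trait_of_interest)
# Combine the parent term and its child terms into one vector of EFO IDs
efo_children <- c(efo_trait_of_interest, efo_children[[efo_trait_of_interest]])
# Retrieve the studies for these terms; terms absent from the Catalog give 404 warnings
eletrocard_studies <- get_studies(efo_id = efo_children)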
#> Warning: The request for https://www.ebi.ac.uk/gwas/rest/api/efoTraits/
#> EFO_0600085/studies failed: response code was 404.
#> Warning in gc_request_all(resource_url = resource_url, base_url = base_url, :
#> The request for https://www.ebi.ac.uk/gwas/rest/api/efoTraits/EFO_0600085/
#> studies failed: response code was 404.
#> Warning: The request for https://www.ebi.ac.uk/gwas/rest/api/efoTraits/
#> EFO_0020929/studies failed: response code was 404.
#> Warning in gc_request_all(resource_url = resource_url, base_url = base_url, :
#> The request for https://www.ebi.ac.uk/gwas/rest/api/efoTraits/EFO_0020929/
#> studies failed: response code was 404.
eletrocard_studies
#> An object of class "studies"
#> Slot "studies":
#> # A tibble: 99 × 13
#> study_id reported_trait initial_sample_s… replication_samp… gxe gxg
#> <chr> <chr> <chr> <chr> <lgl> <lgl>
#> 1 GCST000564 Electrocardiogr… 6,543 Indian Asi… 6,243 Indian Asi… FALSE FALSE
#> 2 GCST000344 Electrocardiogr… 1,262 Kosraen in… <NA> FALSE FALSE
#> 3 GCST000111 Electrocardiogr… 1,951 European a… <NA> FALSE FALSE
#> 4 GCST000561 Electrocardiogr… Up to 12,670 Eur… Up to 10,352 Eur… FALSE FALSE
#> 5 GCST002542 Electrocardiogr… 2,994 Japanese a… 6,805 Korean anc… FALSE FALSE
#> 6 GCST90044268 Electrocardiogr… 272 European anc… <NA> FALSE FALSE
#> 7 GCST90044267 Electrocardiogr… 680 European anc… <NA> FALSE FALSE
#> 8 GCST90044269 Electrocardiogr… 3,616 European a… <NA> FALSE FALSE
#> 9 GCST005905 Global electric… 3,057 Black indi… <NA> FALSE FALSE
#> 10 GCST90044266 Electrocardiogr… 62,388 European … <NA> FALSE FALSE
#> # … with 89 more rows, and 7 more variables: snp_count <int>, qualifier <chr>,
#> # imputed <lgl>, pooled <lgl>, study_design_comment <chr>,
#> # full_pvalue_set <lgl>, user_requested <lgl>
#>
#> Slot "genotyping_techs":
#> # A tibble: 100 × 2
#> study_id genotyping_technology
#> <chr> <chr>
#> 1 GCST000564 Genome-wide genotyping array
#> 2 GCST000344 Genome-wide genotyping array
#> 3 GCST000111 Genome-wide genotyping array
#> 4 GCST000561 Genome-wide genotyping array
#> 5 GCST002542 Genome-wide genotyping array
#> 6 GCST90044268 Genome-wide genotyping array
#> 7 GCST90044267 Genome-wide genotyping array
#> 8 GCST90044269 Genome-wide genotyping array
#> 9 GCST005905 Genome-wide genotyping array
#> 10 GCST90044266 Genome-wide genotyping array
#> # … with 90 more rows
#>
#> Slot "platforms":
#> # A tibble: 134 × 2
#> study_id manufacturer
#> <chr> <chr>
#> 1 GCST000564 Illumina
#> 2 GCST000344 Affymetrix
#> 3 GCST000111 Affymetrix
#> 4 GCST000561 Illumina
#> 5 GCST002542 Illumina
#> 6 GCST005905 Affymetrix
#> 7 GCST005905 Illumina
#> 8 GCST010796 Affymetrix
#> 9 GCST011010 Illumina
#> 10 GCST011010 Affymetrix
#> # … with 124 more rows
#>
#> Slot "ancestries":
#> # A tibble: 219 × 4
#> study_id ancestry_id type number_of_individuals
#> <chr> <int> <chr> <int>
#> 1 GCST000564 1 initial 6543
#> 2 GCST000564 2 replication 6243
#> 3 GCST000564 3 replication 5370
#> 4 GCST000344 1 initial 1262
#> 5 GCST000111 1 initial 1951
#> 6 GCST000561 1 initial 12670
#> 7 GCST000561 2 replication 10352
#> 8 GCST002542 1 initial 2994
#> 9 GCST002542 2 replication 6805
#> 10 GCST90044268 1 initial 67136
#> # … with 209 more rows
#>
#> Slot "ancestral_groups":
#> # A tibble: 223 × 3
#> study_id ancestry_id ancestral_group
#> <chr> <int> <chr>
#> 1 GCST000564 1 South Asian
#> 2 GCST000564 2 South Asian
#> 3 GCST000564 3 European
#> 4 GCST000344 1 Oceanian
#> 5 GCST000111 1 European
#> 6 GCST000561 1 European
#> 7 GCST000561 2 European
#> 8 GCST002542 1 East Asian
#> 9 GCST002542 2 East Asian
#> 10 GCST90044268 1 European
#> # … with 213 more rows
#>
#> Slot "countries_of_origin":
#> # A tibble: 166 × 5
#> study_id ancestry_id country_name major_area region
#> <chr> <int> <chr> <chr> <chr>
#> 1 GCST005905 1 <NA> <NA> <NA>
#> 2 GCST005905 2 <NA> <NA> <NA>
#> 3 GCST010796 1 <NA> <NA> <NA>
#> 4 GCST011010 1 <NA> <NA> <NA>
#> 5 GCST011010 2 <NA> <NA> <NA>
#> 6 GCST011010 3 <NA> <NA> <NA>
#> 7 GCST011010 4 <NA> <NA> <NA>
#> 8 GCST003870 1 <NA> <NA> <NA>
#> 9 GCST003870 2 <NA> <NA> <NA>
#> 10 GCST003870 3 <NA> <NA> <NA>
#> # … with 156 more rows
#>
#> Slot "countries_of_recruitment":
#> # A tibble: 378 × 5
#> study_id ancestry_id country_name major_area region
#> <chr> <int> <chr> <chr> <chr>
#> 1 GCST000564 1 U.K. Europe Norther…
#> 2 GCST000564 2 U.K. Europe Norther…
#> 3 GCST000564 3 U.K. Europe Norther…
#> 4 GCST000344 1 Micronesia (Federated States of) Oceania Microne…
#> 5 GCST000561 1 Iceland Europe Norther…
#> 6 GCST000561 2 Iceland Europe Norther…
#> 7 GCST002542 1 Japan Asia Eastern…
#> 8 GCST002542 2 Republic of Korea Asia Eastern…
#> 9 GCST90044268 1 U.K. Europe Norther…
#> 10 GCST90044267 1 U.K. Europe Norther…
#> # … with 368 more rows
#>
#> Slot "publications":
#> # A tibble: 99 × 7
#> study_id pubmed_id publication_date publication title author_fullname
#> <chr> <int> <date> <chr> <chr> <chr>
#> 1 GCST000564 20062061 2010-01-10 Nat Genet Genetic… Chambers JC
#> 2 GCST000344 19389651 2009-02-15 Heart Rhythm Genome-… Smith JG
#> 3 GCST000111 17903306 2007-09-19 BMC Med Gen… Genome-… Newton-Cheh C
#> 4 GCST000561 20062063 2010-01-10 Nat Genet Several… Holm H
#> 5 GCST002542 25055868 2014-07-23 Hum Mol Gen… Genome-… Sano M
#> 6 GCST90044268 34737426 2021-11-04 Nat Genet A gener… Jiang L
#> 7 GCST90044267 34737426 2021-11-04 Nat Genet A gener… Jiang L
#> 8 GCST90044269 34737426 2021-11-04 Nat Genet A gener… Jiang L
#> 9 GCST005905 29622589 2018-04-05 J Am Heart … Genome-… Tereshchenko LG
#> 10 GCST90044266 34737426 2021-11-04 Nat Genet A gener… Jiang L
#> # … with 89 more rows, and 1 more variable: author_orcid <chr>
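If it helps to know exactly which child terms are absent from the Catalog, the 404 warnings shown above can be collected instead of printed. The following is only a sketch in base R: `collect_warnings()`, `missing_efo_ids()` and `fake_get_studies()` are hypothetical helpers written for this illustration, not part of gwasrapidd, and the stand-in function simply re-emits two of the warning messages from the output above so that the example runs offline.

```r
# Evaluate `expr`, muffle any warnings, and return their messages with the value.
collect_warnings <- function(expr) {
  messages <- character(0)
  value <- withCallingHandlers(
    expr,
    warning = function(w) {
      messages <<- c(messages, conditionMessage(w))
      invokeRestart("muffleWarning")
    }
  )
  list(value = value, warnings = messages)
}

# Extract the EFO identifiers mentioned in warnings that report a 404.
missing_efo_ids <- function(warning_messages) {
  not_found <- grep("response code was 404", warning_messages,
                    fixed = TRUE, value = TRUE)
  ids <- unlist(regmatches(not_found, gregexpr("EFO_[0-9]+", not_found)))
  unique(as.character(ids))
}

# Hypothetical stand-in for get_studies(): it only emits warnings with the
# same wording as in the output above, then returns a placeholder value.
fake_get_studies <- function() {
  warning("The request for https://www.ebi.ac.uk/gwas/rest/api/efoTraits/EFO_0600085/studies failed: response code was 404.")
  warning("The request for https://www.ebi.ac.uk/gwas/rest/api/efoTraits/EFO_0020929/studies failed: response code was 404.")
  "studies"
}

res <- collect_warnings(fake_get_studies())
missing_terms <- missing_efo_ids(res$warnings)
print(missing_terms)
```

With the real package, the same wrapper would presumably be used as `collect_warnings(get_studies(efo_id = efo_children))`, assuming the warning text keeps the format shown in the output above.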
Hi, ramiromagno
It is super helpful and perfectly solves my problem. Thank you for the example script and explanations!
Yue
Hi @jiyue1214,
The Catalog REST API does not include this feature at the moment: there is no endpoint for retrieving child traits. It is planned to be part of version 2 of the API, which should be released sometime in 2023. For now, the only way to do it is to query the child terms from OLS, as @ramiromagno shared. Thanks, @ramiromagno, for helping with your solution.
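For anyone who wants to query OLS without gwasrapidd, here is a rough sketch of how the request URL for the child terms could be built in base R. It rests on assumptions that are not confirmed in this thread: that OLS exposes a `/terms/<IRI>/descendants` path under the ontology URL mentioned above, that the term IRI must be URL-encoded twice, and that the term lives in the `http://www.ebi.ac.uk/efo/` namespace (terms imported from other ontologies use different IRIs). `ols_descendants_url()` is a hypothetical helper name.

```r
# Ontology base URL as quoted earlier in this thread.
ols_base <- "https://www.ebi.ac.uk/ols/api/ontologies/efo"

# Build the (assumed) OLS URL listing the descendants of an EFO term.
ols_descendants_url <- function(efo_id, base = ols_base) {
  # Assumption: native EFO terms use this IRI namespace.
  iri <- paste0("http://www.ebi.ac.uk/efo/", efo_id)
  # Encode twice; `repeated = TRUE` forces the "%" signs produced by the
  # first pass to be encoded again in the second pass.
  once <- utils::URLencode(iri, reserved = TRUE)
  twice <- utils::URLencode(once, reserved = TRUE, repeated = TRUE)
  paste0(base, "/terms/", twice, "/descendants")
}

url <- ols_descendants_url("EFO_0004327")
print(url)
```

The resulting URL can then be fetched with any HTTP client. The response is likely paginated, so more than one request may be needed to collect all child terms.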
Best Regards
Yomi
I am looking forward to the release of version 2 of the API, and thank you for all your hard work on it. What @ramiromagno shared solves my current problem perfectly. Thanks again.
Hi, could I ask for suggestions on how to include child trait data in the search results for an EFO term via the API?
Here is an example of how I retrieve studies of an EFO term via API:
The API search returns 16 studies. However, the GWAS Catalog UI search for the same EFO term returns 99 studies, because the UI results include child trait data.
Could I ask for help on how to make the API's search results consistent with the UI's?
Cheers, Yue