moldach closed this issue 3 years ago
Thanks for the bump, will try to get this done
I think this would only work for async when we can construct URLs ahead of time, because the whole point of async requests is to send off a bunch of requests at the same time. Thus, it can't work if you have to do request A to get the information to do request B. It should work if, e.g., you know you want 1000 results and you know the pagination query param names.
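To illustrate the "URLs known ahead of time" case, here is a minimal Python sketch (not crul code). The base URL and the `page` / `per_page` parameter names are hypothetical placeholders; a real API will have its own names.

```python
from math import ceil
from urllib.parse import urlencode


def prebuilt_urls(base_url, total, per_page, page_param="page", size_param="per_page"):
    """Build every page URL up front, given the total number of results wanted.

    The parameter names are assumptions for illustration only.
    """
    n_pages = ceil(total / per_page)
    return [
        f"{base_url}?{urlencode({size_param: per_page, page_param: page})}"
        for page in range(1, n_pages + 1)
    ]


# 1000 results at 100 per page gives 10 URLs that can all be queued at once.
urls = prebuilt_urls("https://example.org/api/items", total=1000, per_page=100)
```

Because no URL depends on the response to another, the whole list can be handed to an async client in one go.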
Yeah it looks like it would need to be a two-step process then since we don't know the range of values to give for page=<n>
where n is unknown:
https://api.targetsafety.info/api/target/alerts/param?
{
  "page": 1,
  "numberPages": 11,
  "targets": [
    {
      "target_id": 158,
      "target_name": "SGLT2",
      "actions": [
        {
          "action_id": 2,
          "action_name": "Activators",
          "alerts": [
            {
              "affected_system_id": 10015919,
              "affected_system": "eye disorders",
              "adverse_event_id": 10015916,
              "adverse_event": "eye disorder",
              "ref_id": 62872,
              "ref_source_type": "Journal",
              "ref_title": "Leveraging Human Genetics to Identify Safety Signals Prior to Drug Marketing Approval and Clinical Use",
              "ref_citation": "Drug Saf 2020 Feb 28",
              "ref_pubmed_id": "32112228",
              "ref_link": null,
              "ref_date": "2020-02-28",
              "alert_detail_id": 662867,
              "alert_title": "Phenome-wide association study identifying human gene mutations that could be used for in silico prediction of potential adverse drug effects. Results revealed 8 positive associations correlating gene mutation phenotypes with known safety signals from drugs targeting the protein. These associations were PCSK9 (spina bifida), TNF-alpha (cellulitis and leg abscess), PPARgamma (obesity), estrogen receptor-alpha (hemorrhages), ACE (congenital urinary anomalies), phospholipase A2 (primary hypercoagulable state), GluN2B (symbolic dysfunction) and GluN2A (paroxysmal tachycardia, pulmonary heart disease and sleep disorders). Other safety issues are listed.",
              "alert_date": "2020-03-11",
              "alert_genetic_study_variant": "gain-of-function mutation",
              "alert_type": "Class Alert",
              "alert_phase": "Target Discovery",
              "alert_onoff_target": "On-Target",
              "alert_level_evidence": "Suspected",
              "alert_severity": "no",
              "alert_species": "human",
              "drugs": []
            }
          ]
        }
      ]
    }
  ]
}
Only upon a successful API call (Success 200) would we get n from numberPages.
So with a bit more effort we could grep numberPages from each successful API call and then construct these URLs ahead of time.
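A rough Python sketch of that two-step idea (not crul code): fetch page 1, read numberPages from the response, then request the remaining pages concurrently. The `fetch` argument is a hypothetical async callable (URL in, response body text out) that you would supply with your own HTTP client; the helper names are made up for illustration.

```python
import asyncio
import json
from urllib.parse import urlencode

BASE_URL = "https://api.targetsafety.info/api/target/alerts/param"


def page_url(params, page, base_url=BASE_URL):
    """Return the URL for one page; `page` must always be given explicitly."""
    return f"{base_url}?{urlencode({**params, 'page': page})}"


async def crawl_all_pages(params, fetch):
    """Step 1: request page 1 to learn numberPages.
    Step 2: request pages 2..numberPages concurrently.

    `fetch` is an assumed async callable: url -> response body (str).
    """
    first = json.loads(await fetch(page_url(params, 1)))
    rest_urls = [page_url(params, p) for p in range(2, first["numberPages"] + 1)]
    rest_bodies = await asyncio.gather(*(fetch(u) for u in rest_urls))
    return [first] + [json.loads(body) for body in rest_bodies]
```

Only the first request per target is sequential; everything after it is known ahead of time, so it fits the async model. With many targets, the page-1 requests for all targets could themselves be sent as one batch before the second batch is built.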
Closing this issue since I asked them to provide us with a bulk data download instead... 💁🏼
Hoping you can help me try to troubleshoot an error I'm running into.
When I make a standard curl request for the API from the command line I see that there are many pages for TNF-alpha:
curl -X GET "https://api.targetsafety.info/api/target/alerts/param?uniprotid=P01375&page=1&token=[MyPrivateKey]" > tnf_alpha.json
We see that the created .json file shows that there are 17 pages (and that this request is only showing page 1/17):
Since I'm making hundreds of thousands of API calls with AsyncQueue(), each of them will have a different number of pages.
How is it possible to crawl each of these pages using AsyncQueue()?
Currently only the 1st of x pages is being shown (note: dropping &page=1 from the url results in a broken API call - the exact page number must be specified).
However, currently I see the following note in the documents about Paginator:
If this isn't supported yet it would be greatly welcomed in the near future 😁