I would like to know the best way to reshape the output of https://clinicaltrials.gov/api/v2 for use with Trial2Vec.
I wrote a quick-and-dirty function (see code below) that produces a result matching the demo data, but I am not sure the logic is 100% correct.
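import requests
import pandas as pd


def getClinicalTrialStudy(nct_id: str) -> dict:
    """Fetch a single study record from the ClinicalTrials.gov v2 API as a dict."""
    url = f"https://clinicaltrials.gov/api/v2/studies/{nct_id}"
    # The body is parsed as JSON below, so request JSON rather than CSV
    headers = {"accept": "application/json"}
    response = requests.get(url, headers=headers, timeout=30)
    # Raise on a non-200 response instead of printing and silently returning None
    response.raise_for_status()
    return response.json()

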
def ct_dict2pd(study: dict) -> pd.Series:
    """Reformat a ClinicalTrials.gov API study into the format expected by Trial2Vec.

    Parameters
    ----------
    study : dict
        Clinical trial obtained from https://clinicaltrials.gov/api/v2/

    Returns
    -------
    pd.Series
        Study fields in the demo-data format
    """
    protocol = study['protocolSection']
    nct_id = protocol['identificationModule']['nctId']
    description = protocol['descriptionModule']['briefSummary']
    title = protocol['identificationModule']['officialTitle']
    # sorted(set(...)) removes duplicates and keeps the output order deterministic
    intervention_name = ', '.join(sorted(set(
        name
        for arm in protocol['armsInterventionsModule']['armGroups']
        for name in arm.get('interventionNames', [])  # an arm may omit this key
    )))
    disease = ', '.join(sorted(protocol['conditionsModule']['conditions']))
    # Keywords are optional, so fall back to an empty list
    keyword = ', '.join(sorted(protocol['conditionsModule'].get('keywords', [])))
    outcome_measure = ', '.join(sorted(set(
        outcome['measure'] for outcome in protocol['outcomesModule']['primaryOutcomes']
    )))
    # Join the criteria lines with "~", dropping the leading "* " bullet markers
    criteria = (protocol['eligibilityModule']['eligibilityCriteria']
                .replace("\n* ", "~").replace("\n", "~").replace("~~", "~"))
    # Takes the text between the first and second period of each citation,
    # which is assumed to be the title; this fails on citations without a period
    reference = ', '.join(sorted(set(
        ref['citation'].split(".")[1].lstrip(" ")
        for ref in protocol.get('referencesModule', {}).get('references', [])
    )))
    overall_status = protocol['statusModule']['overallStatus']
    return pd.Series({
        'nct_id': nct_id,
        'description': description,
        'title': title,
        'intervention_name': intervention_name,
        'disease': disease,
        'keyword': keyword,
        'outcome_measure': outcome_measure,
        'criteria': criteria,
        'reference': reference,
        'overall_status': overall_status,
    })


study_dict = getClinicalTrialStudy('NCT03760770')
study_pd = ct_dict2pd(study_dict).to_frame().transpose()
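
One part I am unsure about is what happens when a study lacks some modules or keys, and when the criteria text contains runs of blank lines. Below is a minimal sketch of two helpers that might make the reshaping more robust. The names `dig` and `normalize_criteria` are hypothetical (they are not part of Trial2Vec or the ClinicalTrials.gov API), and the `sample` dict is hand-written for illustration, not a real API response.

```python
import re
from typing import Any


def dig(data: dict, *keys: str, default: Any = None) -> Any:
    """Walk nested dicts key by key; return `default` if any level is missing."""
    current: Any = data
    for key in keys:
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


def normalize_criteria(text: str) -> str:
    """Collapse each run of line breaks (plus an optional '* ' bullet) into one '~'."""
    return re.sub(r"\s*\n+\s*(?:\*\s+)?", "~", text.strip())


# Hand-written sample for illustration only, not a real API response.
sample = {
    "protocolSection": {
        "identificationModule": {"nctId": "NCT00000000"},
        "eligibilityModule": {
            "eligibilityCriteria": (
                "Inclusion Criteria:\n\n* Adults\n* Consent given\n\n\n"
                "Exclusion Criteria:\n\n* Pregnancy"
            )
        },
    }
}

# conditionsModule is absent in the sample, so the default is returned.
keywords = dig(sample, "protocolSection", "conditionsModule", "keywords", default=[])
criteria = normalize_criteria(
    dig(sample, "protocolSection", "eligibilityModule", "eligibilityCriteria", default="")
)
print(keywords)
print(criteria)
```

With helpers like these, each field lookup in `ct_dict2pd` could fall back to an empty string or list instead of raising a `KeyError`. Whether an empty value is acceptable input for Trial2Vec is something I have not verified.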