Closed by rfhb 1 month ago
This enhancement has been implemented in branch add_historic_versions, which has now been merged into master.
Walkthrough example:
Retrieval of historic versions can be requested for CTGOV2 by specifying ctgov2history = <...> when using ctrLoadQueryIntoDb(); this functionality was added in ctrdata version 1.18.0. Historic versions are automatically retrieved for CTIS. The versions include all trial data available at the date of the respective version.
For CTGOV2 records, the historic versions are added as follows into the ctrdata data model of a trial record, where the ellipsis ... represents all trial data fields:
{"_id":"NCT01234567", "title": "Current title", ..., "history": [{"history_version": {"version_number": 1, "version_date": "2020-21-22 10:11:12"}, "title": "Original title", ...}, {"history_version": {"version_number": 2, "version_date": "2021-22-23 11:13:13"}, "title": "Later title", ...}]}
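To make the nesting concrete, here is a minimal base R sketch that mirrors the record above as a nested list and flattens its history array into one row per version. The record contents, including the dates, are made-up illustration values, and the flattening is only one possible approach, not how ctrdata itself handles these records.

```r
# Hypothetical record mirroring the data model above (all values are made up)
record <- list(
  `_id` = "NCT01234567",
  title = "Current title",
  history = list(
    list(
      history_version = list(
        version_number = 1L,
        version_date = "2020-01-22 10:11:12"
      ),
      title = "Original title"
    ),
    list(
      history_version = list(
        version_number = 2L,
        version_date = "2021-02-23 11:13:13"
      ),
      title = "Later title"
    )
  )
)

# Flatten the "history" array into one row per historic version
versions <- do.call(rbind, lapply(record$history, function(v) {
  data.frame(
    `_id` = record[["_id"]],
    version_number = v$history_version$version_number,
    # keep only the date part of the timestamp
    version_date = as.Date(substr(v$history_version$version_date, 1, 10)),
    title = v$title,
    check.names = FALSE,
    stringsAsFactors = FALSE
  )
}))

print(versions)
```

Each row then carries the trial identifier, the version metadata generated by ctrdata, and the value of the field of interest (here, title) as it stood in that version.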
The example shows how planned or realised number of participants (sample size) changed over time for individual trials, using data from both registers.
# install ctrdata from development branch
remotes::install_github("rfhb/ctrdata")
# load package
library(ctrdata)
# open database
db <- nodbi::src_sqlite(collection = "my_collection")
# read documentation of new
# parameter "ctgov2history"
help("ctrLoadQueryIntoDb")
# load some trials from CTGOV2 specifying that
# for each trial, 10 versions should be retrieved
ctrLoadQueryIntoDb(
  queryterm = "https://clinicaltrials.gov/search?cond=neuroblastoma&aggFilters=phase:3,status:com",
  con = db,
  ctgov2history = 10
)
# * Appears specific for CTGOV REST API 2.0
# * Found search query from CTGOV2: cond=neuroblastoma&aggFilters=phase:3,status:com
# * Checking trials using CTGOV API 2.0, found 24 trials
# (1/3) Downloading in 1 batch(es) (max. 1000 trials each; estimate: 2.4 MB total)
# (2/3) Converting to NDJSON...
# (3/3) Importing records into database...
# JSON file #: 1 / 1
# * Checking historic versions of trial records...
# - Merging trial versions . . . . . . . . . . . . . . . . . . . . . . . .
# - Updating trial records . . . . . . . . . . . . . . . . . . . . . . . .
# Updated 24 trial(s) with historic versions
# = Imported or updated 24 trial(s)
# Updated history ("meta-info" in "my_collection_name")
# load some trials from CTIS; historic versions
# are retrieved automatically for this register
ctrLoadQueryIntoDb(
  queryterm = "https://euclinicaltrials.eu/app/#/search?basicSearchInputAND=cancer&ageGroupCode=2",
  con = db
)
# get sample size and version date fields from both registers
result <- dbGetFieldsIntoDf(
  fields = c(
    # CTGOV2
    "history.protocolSection.designModule.enrollmentInfo.count",
    "history.history_version",
    # CTIS
    "applications.submissionDate",
    "applications.partI.rowSubjectCount"
  ),
  con = db
)
# helpers
library(dplyr)
library(tidyr)
library(ggplot2)
# mangle and plot
result %>%
  unnest(cols = starts_with("history.")) %>%
  unnest(cols = starts_with("applications.")) %>%
  mutate(version_date = as.Date(version_date)) %>%
  mutate(count = dfMergeVariablesRelevel(., colnames = c(
    "history.protocolSection.designModule.enrollmentInfo.count",
    "applications.partI.rowSubjectCount"))) %>%
  mutate(date = dfMergeVariablesRelevel(., colnames = c(
    "applications.submissionDate", "version_date"))) %>%
  select(`_id`, count, date) %>%
  arrange(`_id`, date) %>%
  group_by(`_id`) %>%
  ggplot(
    mapping = aes(
      x = date,
      y = count,
      colour = `_id`)
  ) +
  geom_step() +
  geom_point() +
  theme_light() +
  guides(colour = "none")
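As a complement to the plot, the same long-format data (one row per trial and version) can be summarised numerically. The following base R sketch uses made-up trial identifiers, dates, and counts as stand-ins for the `_id`, date, and count columns produced by the pipeline above; it is an illustration, not part of ctrdata.

```r
# Hypothetical long-format data: one row per trial version (values are made up)
versions <- data.frame(
  id = c("NCT00000001", "NCT00000001", "NCT00000001",
         "NCT00000002", "NCT00000002"),
  date = as.Date(c("2019-03-01", "2020-06-15", "2022-01-10",
                   "2021-02-01", "2021-09-30")),
  count = c(100, 120, 95, 40, 40),
  stringsAsFactors = FALSE
)

# Order versions chronologically within each trial
versions <- versions[order(versions$id, versions$date), ]

# Per trial: sample size in the first and in the last version,
# and whether the two differ
changes <- do.call(rbind, lapply(split(versions, versions$id), function(d) {
  data.frame(
    id = d$id[1],
    n_versions = nrow(d),
    first_count = d$count[1],
    last_count = d$count[nrow(d)],
    changed = d$count[1] != d$count[nrow(d)],
    stringsAsFactors = FALSE
  )
}))
rownames(changes) <- NULL

print(changes)
```

Comparing only the first and last version hides intermediate changes, which is why the step plot above remains the more informative view.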
Issue closed with 7bc46f983ce50f839b367cf4b122cf53e766d297.
Observation
ctrLoadQueryIntoDb() from package ctrdata for this trial.
Analysis
ctrdata accesses the endpoint /studies of the API specified at https://www.clinicaltrials.gov/data-api/api. This endpoint provides (only) the latest version of trial data that are available. For trials that started or completed enrollment (recruitment), these API endpoint data include only the actual number enrolled.
Solution
ctrdata function ctrLoadQueryIntoDb() to obtain an additional parameter, e.g. ctgov2history = {1,-1,n,n:m,TRUE}, which triggers the additional retrieval of a specific (first, last-but-one), a certain number, a range, or all historic versions for trials that are retrieved.
CTGOV2 because no corresponding endpoint is available for other registers
ctrdata for a given trial could have an additional object history, e.g.
Fields generated by ctrdata ("_id", "record_last_import", "register", "history", "history_version", "version_number", "version_date") follow snake_case formatting; other field names are as retrieved from the respective register.
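To illustrate how the proposed values of ctgov2history could map onto version numbers, here is a small base R sketch. The helper select_versions() is hypothetical and is not ctrdata's implementation; in particular, spreading n versions evenly across all available versions is an assumption made for this example.

```r
# Hypothetical helper (not part of ctrdata): translate a ctgov2history-style
# specification into the version numbers to retrieve, given that a trial
# has n_total historic versions
select_versions <- function(spec, n_total) {
  all_versions <- seq_len(n_total)
  # TRUE: all versions
  if (isTRUE(spec)) return(all_versions)
  # "n:m": the nth to the mth version
  if (is.character(spec) && grepl("^[0-9]+:[0-9]+$", spec)) {
    bounds <- as.integer(strsplit(spec, ":", fixed = TRUE)[[1]])
    return(all_versions[all_versions >= bounds[1] & all_versions <= bounds[2]])
  }
  if (is.numeric(spec) && length(spec) == 1L) {
    # 1: the first version
    if (spec == 1) return(1L)
    # -1: the last-but-one version, if there is one
    if (spec == -1) {
      if (n_total >= 2) return(as.integer(n_total - 1)) else return(integer(0))
    }
    # n: n versions; ASSUMPTION: spread evenly over all available versions
    if (spec > 1) {
      picks <- round(seq(1, n_total, length.out = min(spec, n_total)))
      return(unique(as.integer(picks)))
    }
  }
  stop("unsupported specification")
}

select_versions(TRUE, 4)
select_versions("2:3", 5)
select_versions(-1, 5)
select_versions(3, 9)
```

Whatever the selection rule, each retrieved version would then be stored as one element of the history array, together with its history_version metadata.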