rfhb / ctrdata

Aggregate and analyse information on clinical trials from public registers
https://rfhb.github.io/ctrdata/

Manual input of trial IDs instead of search functions #24

Closed machado-t closed 1 year ago

machado-t commented 1 year ago

I have a specific requirement for my workflow: I need to enter the trial IDs manually to populate the database, instead of using the package's built-in functions to search the registers' websites. I couldn't find how to do this in the available documentation. Is this possible with the package?

rfhb commented 1 year ago

This is entirely possible:

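# Connect to an SQLite database; trials are stored in the collection "myname"
dbc <- nodbi::src_sqlite(collection = "myname")

# Trial identifiers to load (here: NCT numbers from ClinicalTrials.gov)
ctIds <- c("NCT00001209", "NCT00001436", "NCT00187109", "NCT01516567", "NCT01471782")

# Join the identifiers with "+OR+" into a single query term and load the trials
ctrdata::ctrLoadQueryIntoDb(
    queryterm = paste0(ctIds, collapse = "+OR+"),
    register = "CTGOV",
    con = dbc
)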

There is a limit to the number of IDs that can be put into queryterm and processed by the registers, so I recommend iterating in batches (batch size to be determined).

I may update the documentation for such a case; you can also find a corresponding example amongst the test cases here: https://github.com/rfhb/ctrdata/blob/d658dbd19f2285d34f1968d2fb7e00f6856e3f49/inst/tinytest/ctrdata_euctr.R#L444C6-L444C6

machado-t commented 1 year ago

Great! Thanks for the quick response!

> There is a limit to the number of IDs that can be put into queryterm and processed by the registers, so I recommend iterating in batches (batch size to be determined).

Did you run into any problems with batches in the hundreds or low thousands of IDs?

rfhb commented 1 year ago

As mentioned, explore how many IDs are reliably imported, then iterate over batches of IDs.
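
Iterating over batches could look like the following minimal sketch. The helper `make_query_terms` and the batch size of 2 are illustrative assumptions and not part of ctrdata; a workable batch size has to be found by trial. The loading loop reuses only the `ctrLoadQueryIntoDb()` call shown earlier in this thread and is not run here, since it needs network access and the ctrdata and nodbi packages.

```r
# Hypothetical helper (not part of ctrdata): split trial ids into batches
# and build one "+OR+"-joined query term per batch.
make_query_terms <- function(ids, batch_size = 50L) {
  stopifnot(is.character(ids), batch_size >= 1L)
  ids <- unique(ids)
  # Batch index for every id: 1, 1, ..., 2, 2, ...
  batch_index <- ceiling(seq_along(ids) / batch_size)
  batches <- split(ids, batch_index)
  # Collapse each batch into a single query term
  unname(vapply(batches, paste0, character(1L), collapse = "+OR+"))
}

ctIds <- c("NCT00001209", "NCT00001436", "NCT00187109", "NCT01516567", "NCT01471782")

# batch_size = 2 is only for illustration
queryTerms <- make_query_terms(ctIds, batch_size = 2L)

# Shown but not executed: load each batch into the database in turn
if (FALSE) {
  dbc <- nodbi::src_sqlite(collection = "myname")
  for (qt in queryTerms) {
    ctrdata::ctrLoadQueryIntoDb(queryterm = qt, register = "CTGOV", con = dbc)
  }
}
```

If a batch fails to import reliably, reducing `batch_size` and re-running that batch is the obvious first thing to try.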

rfhb commented 1 year ago

See the new, expanded example in the vignette here: https://rfhb.github.io/ctrdata/articles/ctrdata_retrieve.html#add-information-using-trial-identifiers

machado-t commented 1 year ago

This is very useful indeed. Thank you very much!