Status: Closed (closed by ablack3 2 years ago)
@ablack3 Yes, I think this would be much nicer than how we currently do this (both here when referencing the vocabulary tables and elsewhere where we reference cdm tables with patient data). Because it seems like similar functionality would be nice across various analytic packages, maybe your suggested approach could be in its own package? What do you think?
@ablack3 as discussed, let's try to incorporate this into a separate package that can become a dependency of CodelistGenerator. When you have a GitHub repo set up for that, please transfer this issue there.
Closing as we now have the CDMConnector package
`getCandidateCodes` can take either a database connection or a directory path that has vocab tables in it. I would like to propose the introduction of a "vocabulary reference" object. This object is a list of Arrow Tables or dplyr table references pointing to a remote database. This object would then work with dplyr verbs, and `getCandidateCodes` would accept a single "data" argument that is a vocabulary reference.
This also means that assertions for vocabulary table validation can be done only once, when the vocabulary reference object is created, instead of each time `getCandidateCodes` runs.
A draft implementation for creating a vocabulary reference object could look like this.
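The draft code itself is not preserved in this thread. As an illustration only, and not the author's original draft, a constructor along these lines might be sketched as follows. The function name `vocabReference`, the `tables` argument, and the default table list are all assumptions made for the example. It covers only the database case (Arrow Tables are left out), and it assumes the DBI, dplyr, dbplyr, and RSQLite packages are installed.

```r
library(DBI)
library(dplyr)  # dplyr::tbl() on a DBI connection also requires dbplyr

# Hypothetical constructor: builds a "vocabulary reference", i.e. a named list
# of lazy dplyr table references, and validates the tables once at creation.
# The default table names are an assumption about which OMOP vocabulary
# tables a caller would need.
vocabReference <- function(con,
                           tables = c("concept", "concept_relationship",
                                      "concept_ancestor", "concept_synonym",
                                      "vocabulary")) {
  stopifnot(DBI::dbIsValid(con))

  # One-time validation: every requested table must exist in the database
  absent <- setdiff(tables, DBI::dbListTables(con))
  if (length(absent) > 0) {
    stop("Missing vocabulary tables: ", paste(absent, collapse = ", "))
  }

  # Lazy references: no data is pulled until collect() is called
  ref <- lapply(tables, function(nm) dplyr::tbl(con, nm))
  names(ref) <- tables
  class(ref) <- "vocabReference"
  ref
}

# Demo against an in-memory SQLite database with a toy concept table
con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(con, "concept",
             data.frame(concept_id = c(1L, 2L),
                        concept_name = c("Asthma", "Diabetes mellitus")))

vocab <- vocabReference(con, tables = "concept")

# Each element works with ordinary dplyr verbs
vocab$concept %>%
  filter(concept_id == 1L) %>%
  collect()
```

Because the existence check runs inside the constructor, functions that receive the object can rely on the tables being present instead of re-checking on every call.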
The getCandidateCodes interface would look like
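The interface example is likewise not preserved here. A rough sketch of the idea, a search function that takes a single `data` argument holding the vocabulary reference, might look like the following. The name `getCandidateCodesSketch`, the `keywords` argument, and the matching logic are assumptions for illustration and not the actual CodelistGenerator signature. Local tibbles stand in for remote tables so the example runs without a database; whether `grepl()` translates to SQL depends on the backend.

```r
library(dplyr)

# Hypothetical sketch of the proposed interface: `data` is a vocabulary
# reference, i.e. a named list of tables that respond to dplyr verbs.
getCandidateCodesSketch <- function(keywords, data) {
  stopifnot(is.character(keywords), length(keywords) > 0)
  stopifnot(is.list(data), "concept" %in% names(data))

  # Combine the keywords into one case-insensitive regular expression
  # (keywords are treated as regex fragments in this sketch)
  pattern <- paste(tolower(keywords), collapse = "|")

  data$concept %>%
    filter(grepl(pattern, tolower(concept_name))) %>%
    select(concept_id, concept_name) %>%
    collect()
}

# Stand-in vocabulary reference built from a local tibble
vocab <- list(
  concept = tibble(
    concept_id   = c(1L, 2L, 3L),
    concept_name = c("Asthma", "Childhood asthma", "Diabetes mellitus")
  )
)

res <- getCandidateCodesSketch("asthma", data = vocab)
res
```

The caller no longer passes a connection or a directory path; the same call works whether the list elements are local tables, Arrow Tables, or database references, provided the verbs used are supported by that backend.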
If this seems like a good idea, what should we call such objects? Maybe vocref, vocab, vocabReference, or something else?
I'd also like to explore extending this to the CDM, so that we could have CDM reference objects: lists of table references to a CDM.
@edward-burn