Closed zsenousy closed 2 months ago
@razekmh @AngharadGreen
Mapping Details:
PERSON: The patients table in RAMSES corresponds directly with the OMOP PERSON table. Here, the person_id
from OMOP is mapped to patient_id
in RAMSES. Additional patient demographics such as gender_concept_id
(OMOP) align with sex
field in RAMSES.
OBSERVATION_PERIOD: The OMOP OBSERVATION_PERIOD table captures the time span during which the patient is observed (e.g., start and end dates). While RAMSES does not have a direct equivalent table, this can be inferred using the ward_start
and ward_end
fields in inpatient_ward_movements and/or the admission_date
and discharge_date
in inpatient_episodes. Any other thoughts?
DEATH: In OMOP, death_date
is recorded in the DEATH table. This information is captured in RAMSES under date_of_death
in the patients table.
Mapping Details:
CONDITION_OCCURRENCE: The inpatient_diagnoses table in RAMSES captures inpatient diagnoses and can be mapped to OMOP’s CONDITION_OCCURRENCE table. Here, condition_concept_id
from OMOP corresponds to icd_code
in RAMSES, with fields like diagnosis_position
offering additional context.
DRUG_EXPOSURE: This OMOP table records drug exposure events and corresponds to drug_prescriptions and drug_therapy_episodes in RAMSES. Fields such as drug_concept_id
(OMOP) map to ATC_code
and prescription_text
in RAMSES. Any thoughts?
PROCEDURE_OCCURRENCE: Procedures performed on patients in OMOP can be mapped to inpatient_episodes in RAMSES. While OMOP uses procedure_concept_id
, in RAMSES, this information might be inferred using fields related to specific care episodes, such as episode_number
and main_specialty_code
.
DEVICE_EXPOSURE: OMOP’s DEVICE_EXPOSURE table captures data on devices used on patients. RAMSES does not have a direct equivalent for device data.
OBSERVATION: General observations in OMOP, not covered by other tables, can be mapped to inpatient_investigations in RAMSES, which records various types of observations conducted during inpatient care. Not sure about this, any thoughts?
MEASUREMENT: OMOP’s MEASUREMENT table captures laboratory results and vital signs. In RAMSES, this can be split between:
specimen_type_code
, corresponding to lab samples.Mapping Details:
MEASUREMENT: As explained above, this is split between:
SPECIMEN: This table in OMOP directly corresponds to microbiology_specimens in RAMSES, mapping fields like specimen_concept_id
to specimen_type_code
.
Mapping Details:
VISIT_OCCURRENCE: This OMOP table captures each healthcare encounter, and it can be mapped to inpatient_ward_movements in RAMSES. Fields like visit_start_date
and visit_end_date
in OMOP align with ward_start
and ward_end
in RAMSES.
CARE_SITE and PROVIDER: RAMSES does not seem to have direct equivalents for these tables.
Gaps in RAMSES Database:
Thank you for this @zsenousy. I'm creating a visualisation of the OMOP and RAMSES mappings as well
I am using omock to match the OMOP data to Ramses. My target in the mapping is to match the structure of the dummy data Ramses is using in the validate article. I am starting the matching with the two table drug_prescriptions
and drug_administrations
. There are more tables in the dummy data but I aiming for these two in the first instance.
> str(drug_prescriptions)
'data.frame': 367 obs. of 12 variables:
$ patient_id : chr "5124578766" "4874231672" "6292626973" "6292626973" ...
$ prescription_id : chr "66cac1c5eab88d72c8b7687966357f5b" "8ccd67f4730b62ceafb8bcb27996c10c" "72cf4b592b0f4e2143b4bb9d7c569c97" "806c86b55cf50505a20b722f081c4075" ...
$ rxsummary : chr "Piperacillin / Tazobactam IVI 4.5 g TDS" "Ciprofloxacin ORAL 500 mg BD" "Flucloxacillin ORAL 500 mg 6H" "Metronidazole ORAL 400 mg TDS" ...
$ authoring_date : POSIXct, format: "2015-08-04 13:07:16" "2017-07-06 09:49:31" "2017-01-13 17:34:41" "2017-11-12 09:09:13" ...
$ prescription_start: POSIXct, format: "2015-08-04 14:45:16" "2017-07-06 10:26:31" "2017-01-13 18:48:41" "2017-11-12 09:44:13" ...
$ prescription_end : POSIXct, format: "2015-08-07 14:45:16" "2017-07-07 22:26:31" "2017-01-17 18:48:41" "2017-11-14 09:44:13" ...
$ tr_DESC : chr "Piperacillin / Tazobactam" "Ciprofloxacin" "Flucloxacillin" "Metronidazole" ...
$ route : chr "IV" "ORAL" "ORAL" "ORAL" ...
$ dose : num 4.5 500 500 400 4.5 500 4.5 4.5 600 2 ...
$ units : chr "g" "mg" "mg" "mg" ...
$ frequency : chr "TDS" "BD" "6H" "TDS" ...
$ duration : num 3 1.5 4 2 4 3 3 2 1 2 ...
> str(drug_administrations)
'data.frame': 2818 obs. of 7 variables:
$ patient_id : chr "5124578766" "5124578766" "5124578766" "5124578766" ...
$ prescription_id : chr "66cac1c5eab88d72c8b7687966357f5b" "66cac1c5eab88d72c8b7687966357f5b" "66cac1c5eab88d72c8b7687966357f5b" "66cac1c5eab88d72c8b7687966357f5b" ...
$ tr_DESC : chr "Piperacillin / Tazobactam" "Piperacillin / Tazobactam" "Piperacillin / Tazobactam" "Piperacillin / Tazobactam" ...
$ route : chr "IV" "IV" "IV" "IV" ...
$ dose : num 4.5 4.5 4.5 4.5 4.5 4.5 4.5 4.5 4.5 500 ...
$ units : chr "g" "g" "g" "g" ...
$ administration_date: POSIXct, format: "2015-08-04 14:45:16" "2015-08-04 23:45:16" "2015-08-05 08:45:16" "2015-08-05 17:45:16" ...
I am creating a dummy dataset using omock by passing in the following code. The variables are set to default examples from the omock code and comments. I am also creating a ramses dataset to match against
library(Ramses)
library(omock)
library(dplyr)
ramses_db <- connect_local_database("ramses-db.duckdb")
cdm <- mockCdmReference() |>
mockPerson() |>
mockObservationPeriod() |>
mockConditionOccurrence(recordPerson = 10) |>
mockDrugExposure(recordPerson = 40) |>
mockMeasurement(recordPerson = 5) |>
mockDeath(recordPerson = 1)
Tackling the drug_prescription
table starts from the drug_exposure
table from omock.
> cdm$drug_exposure
# A tibble: 180 × 6
drug_concept_id person_id drug_exposure_start_date drug_exposure_end_date drug_exposure_id drug_type_concept_id
* <dbl> <int> <date> <date> <int> <dbl>
1 10 9 1991-07-23 1995-03-08 1 1
2 10 4 1994-10-23 2003-10-24 2 1
3 10 7 2014-03-15 2014-03-22 3 1
4 10 1 2002-04-13 2004-12-14 4 1
5 10 2 1999-12-08 2001-02-27 5 1
6 10 7 2014-03-13 2014-03-26 6 1
7 10 2 1999-04-22 2002-11-26 7 1
8 10 3 2015-03-01 2015-05-28 8 1
9 10 1 2002-08-13 2005-10-05 9 1
10 10 5 2012-05-16 2012-09-02 10 1
# ℹ 170 more rows
# ℹ Use `print(n = ...)` to see more rows
We can see that drug_concept_id
is the attribute is present which is representing
A foreign key to the predefined Concept identifier in the Standardized Vocabularies reflecting the type of Drug Exposure recorded. It indicates how the Drug Exposure was represented in the source data.
However the attributes effective_drug_dose
and dose_unit_concept_id
are missing. These two attributes are not required in the OMOP standard so it is acceptable they are missing in the mock data.
Tracing the drug_concept_id
to its primary table leads to the concept
table.
> cdm$concept
# A tibble: 3,245 × 10
concept_id concept_name domain_id vocabulary_id standard_concept concept_class_id concept_code valid_start_date
* <dbl> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
1 1 Musculoskele… Condition SNOMED S Clinical Finding 1234 NA
2 2 Osteoarthros… Condition SNOMED S Clinical Finding 1234 NA
3 3 Arthritis Condition SNOMED S Clinical Finding 1234 NA
4 4 Osteoarthrit… Condition SNOMED S Clinical Finding 1234 NA
5 5 Osteoarthrit… Condition SNOMED S Clinical Finding 1234 NA
6 6 Osteonecrosis Condition SNOMED S Clinical Finding 1234 NA
7 7 Degenerative… Condition Read NA Diagnosis 1234 NA
8 8 Knee osteoar… Condition Read NA Diagnosis 1234 NA
9 9 H/O osteoart… Observat… LOINC S Observation 1234 NA
10 10 Adalimumab Drug RxNorm S Ingredient 1234 NA
# ℹ 3,235 more rows
# ℹ 2 more variables: valid_end_date <chr>, invalid_reason <chr>
# ℹ Use `print(n = ...)` to see more rows
The concept
table contains more than just the drugs reference. We can inspect the list of the concept domain id
> unique(cdm$concept$domain_id)
[1] "Condition" "Observation" "Drug" "Ethnicity" "Gender" "Race" "Unit" "Visit"
[9] "Measurement"
We can filter the concept
table to show only the drug
concepts
> cdm$concept |> filter(domain_id == 'Drug')
# A tibble: 6 × 10
concept_id concept_name domain_id vocabulary_id standard_concept concept_class_id concept_code valid_start_date
<dbl> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
1 10 Adalimumab Drug RxNorm S Ingredient 1234 NA
2 11 Injection Drug OMOP NA Dose Form 1234 NA
3 12 ALIMENTARY TR… Drug ATC NA ATC 1st 1234 NA
4 13 Descendant dr… Drug RxNorm S Drug 1234 NA
5 14 Injectable Drug OMOP NA Dose Form 1234 NA
6 19 Other ingredi… Drug RxNorm S Ingredient 1234 NA
# ℹ 2 more variables: valid_end_date <chr>, invalid_reason <chr>
Let's the save this table to prepare it for more processing
> drug_concepts <- cdm$concept |> filter(domain_id == 'Drug')
One of the steps of the validation process is to match the drug codes using the AMR
package. As follows.
# attempting to map drug name using AMR package
drug_prescriptions$drug_code <- AMR::as.ab(drug_prescriptions$tr_DESC)
We can try the same with the list of drugs that were found in the concept
table
> drug_concepts$drug_code <- drug_concepts$concept_name |> AMR::as.ab()
Warning message:
in as.ab(): these values could not be coerced to a valid antimicrobial ID: "Adalimumab", "Injectable",
"Injection", and "Other ingredient".
We get a partial match as seen in the table
> drug_concepts |> select(concept_id,concept_name, drug_code)
# A tibble: 6 × 3
concept_id concept_name drug_code
<dbl> <chr> <ab>
1 10 Adalimumab NA
2 11 Injection NA
3 12 ALIMENTARY TRACT AND METABOLISM PPA
4 13 Descendant drug SLF9
5 14 Injectable NA
6 19 Other ingredient NA
Maybe manual editing is required for this step? as in the example from validate article
# editing drug names
drug_prescriptions$drug_code <- gsub("Vancomycin protocol",
"Vancomycin",
drug_prescriptions$tr_DESC)
# mapping drug name using AMR package
drug_prescriptions$drug_code <- AMR::as.ab(drug_prescriptions$drug_code)
drug_prescriptions$drug_name <- AMR::ab_name(drug_prescriptions$drug_code)
I suspect the actual data will have more drugs to match.
Going back to the drug_exposure
table, we could now extract the rows which matches the concept_id
from our newly created drug_concepts
table.
> cdm$drug_exposure |> filter(drug_concept_id %in% drug_concepts$concept_id)
# A tibble: 2,400 × 6
drug_concept_id person_id drug_exposure_start_date drug_exposure_end_date drug_exposure_id drug_type_concept_id
<dbl> <int> <date> <date> <int> <dbl>
1 10 9 1987-03-04 2004-10-31 1 1
2 10 4 1997-04-16 2005-03-29 2 1
3 10 7 2014-03-09 2014-03-24 3 1
4 10 1 2004-05-16 2008-08-06 4 1
5 10 2 1999-11-17 2001-07-19 5 1
6 10 7 2014-03-13 2014-03-22 6 1
7 10 2 1999-10-06 2001-08-11 7 1
8 10 3 2015-01-26 2015-05-26 8 1
9 10 1 2002-04-04 2007-06-28 9 1
10 10 5 2012-05-13 2012-06-28 10 1
# ℹ 2,390 more rows
# ℹ Use `print(n = ...)` to see more rows
2,400
are all the rows of drug_exposure
> cdm$drug_exposure
# A tibble: 2,400 × 6
Inspecting the distribution of the drug codes in the drug_exposure
we find uniform distribution
> cdm$drug_exposure |>
+ filter(drug_concept_id %in% drug_concepts$concept_id) |>
+ group_by(drug_concept_id) |>
+ summarise(count = n())
# A tibble: 6 × 2
drug_concept_id count
<dbl> <int>
1 10 400
2 11 400
3 12 400
4 13 400
5 14 400
6 19 400
@AngharadGreen @zsenousy
I believe the next steps are to explore the data in one of the example datasets and try to use it to replicate the steps in the article on Validate and load electronic health records. Some patterns and problems are expected to arise from this process. Since we do not have access to the actual data yet, we are not aiming for a full solution yet. However a general understanding of the feasibility of matching the two data models is what we aim for now.
As you might see above I started working with two data frames provided by RAMSES as examples for what the EHR data would look like. I think it is worth it to try to match the data structure of both data frames from OMOP data from either example dataset we have access to.
Working with omock
so far has shown that we could get some insights into how the data is organised in OMOP. For example we know that the drug names are standardised using the concept
table. Also we can see that the vocabulary_id
is either RxNorm
, ATC
, or OMOP
which seems to follow different standards by itself. Maybe we could dig a little bit into this, keeping in mind that the hospital might be using one specific standard or a mix (we do not know, but we could try and ask)
> cdm$concept |> filter(domain_id == 'Drug')
# A tibble: 6 × 10
concept_id concept_name domain_id vocabulary_id standard_concept concept_class_id concept_code valid_start_date
<dbl> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
1 10 Adalimumab Drug RxNorm S Ingredient 1234 NA
2 11 Injection Drug OMOP NA Dose Form 1234 NA
3 12 ALIMENTARY TR… Drug ATC NA ATC 1st 1234 NA
4 13 Descendant dr… Drug RxNorm S Drug 1234 NA
5 14 Injectable Drug OMOP NA Dose Form 1234 NA
6 19 Other ingredi… Drug RxNorm S Ingredient 1234 NA
# ℹ 2 more variables: valid_end_date <chr>, invalid_reason <chr>
Also we can see that the step of matching the drug names using the AMR
package, we got only a partial match
> drug_concepts |> select(concept_id,concept_name, drug_code)
# A tibble: 6 × 3
concept_id concept_name drug_code
<dbl> <chr> <ab>
1 10 Adalimumab NA
2 11 Injection NA
3 12 ALIMENTARY TRACT AND METABOLISM PPA
4 13 Descendant drug SLF9
5 14 Injectable NA
6 19 Other ingredient NA
I am not sure if this step will be absolutely necessary but based on the note in the article is seems to be essential to extract the DDD.
Using the AMR package, it is possible to infer important properties of antibacterial/antifungal drugs:
- ATC classes and groups, which can be useful for instance in order to identify antifungals (group ‘Antimycotics for systemic use’)
- Defined Daily Doses (DDDs), which are commonly used to measure drug consumption.
I would suggest we try the same steps with the data from the big data open dataset. The concept table has 3,902,588 entries which holds much more potential than the omock dataset. @AngharadGreen Please refer to #15 on accessing the data. Please comment here if you face any issues with accessing the data
@razekmh @AngharadGreen
Initial mapping RAMSES - OMOP
1. Patient Information
OMOP CDM Tables:
- PERSON
- OBSERVATION_PERIOD
- DEATH
RAMSES DB Equivalent:
- patients
Mapping Details:
- PERSON: The patients table in RAMSES corresponds directly with the OMOP PERSON table. Here, the
person_id
from OMOP is mapped topatient_id
in RAMSES. Additional patient demographics such asgender_concept_id
(OMOP) align withsex
field in RAMSES.- OBSERVATION_PERIOD: The OMOP OBSERVATION_PERIOD table captures the time span during which the patient is observed (e.g., start and end dates). While RAMSES does not have a direct equivalent table, this can be inferred using the
ward_start
andward_end
fields in inpatient_ward_movements and/or theadmission_date
anddischarge_date
in inpatient_episodes. Any other thoughts?- DEATH: In OMOP,
death_date
is recorded in the DEATH table. This information is captured in RAMSES underdate_of_death
in the patients table.2. Clinical Events
OMOP CDM Tables:
- CONDITION_OCCURRENCE
- DRUG_EXPOSURE
- PROCEDURE_OCCURRENCE
- DEVICE_EXPOSURE
- OBSERVATION
- MEASUREMENT
RAMSES DB Equivalent:
- inpatient_diagnoses
- drug_prescriptions
- drug_therapy_episodes
- inpatient_episodes
- microbiology_specimens
- microbiology_isolates
- inpatient_investigations
Mapping Details:
- CONDITION_OCCURRENCE: The inpatient_diagnoses table in RAMSES captures inpatient diagnoses and can be mapped to OMOP’s CONDITION_OCCURRENCE table. Here,
condition_concept_id
from OMOP corresponds toicd_code
in RAMSES, with fields likediagnosis_position
offering additional context.- DRUG_EXPOSURE: This OMOP table records drug exposure events and corresponds to drug_prescriptions and drug_therapy_episodes in RAMSES. Fields such as
drug_concept_id
(OMOP) map toATC_code
andprescription_text
in RAMSES. Any thoughts?- PROCEDURE_OCCURRENCE: Procedures performed on patients in OMOP can be mapped to inpatient_episodes in RAMSES. While OMOP uses
procedure_concept_id
, in RAMSES, this information might be inferred using fields related to specific care episodes, such asepisode_number
andmain_specialty_code
.- DEVICE_EXPOSURE: OMOP’s DEVICE_EXPOSURE table captures data on devices used on patients. RAMSES does not have a direct equivalent for device data.
- OBSERVATION: General observations in OMOP, not covered by other tables, can be mapped to inpatient_investigations in RAMSES, which records various types of observations conducted during inpatient care. Not sure about this, any thoughts?
MEASUREMENT: OMOP’s MEASUREMENT table captures laboratory results and vital signs. In RAMSES, this can be split between:
- microbiology_specimens: Captures the specimen details such as
specimen_type_code
, corresponding to lab samples.- microbiology_isolates: Records specific organism-related findings, which is crucial for microbiological data.
3. Microbiology and Laboratory Data
OMOP CDM Tables:
- MEASUREMENT
- SPECIMEN
RAMSES DB Equivalent:
- microbiology_isolates
- microbiology_specimens
Mapping Details:
MEASUREMENT: As explained above, this is split between:
- microbiology_specimens: To capture details about the specimen collected.
- microbiology_isolates: To record specific lab results related to microbiology, such as organism codes and susceptibility results.
- SPECIMEN: This table in OMOP directly corresponds to microbiology_specimens in RAMSES, mapping fields like
specimen_concept_id
tospecimen_type_code
.4. Healthcare Encounters
OMOP CDM Tables:
- VISIT_OCCURRENCE
- CARE_SITE
- PROVIDER
RAMSES DB Equivalent:
- inpatient_ward_movements
Mapping Details:
- VISIT_OCCURRENCE: This OMOP table captures each healthcare encounter, and it can be mapped to inpatient_ward_movements in RAMSES. Fields like
visit_start_date
andvisit_end_date
in OMOP align withward_start
andward_end
in RAMSES.- CARE_SITE and PROVIDER: RAMSES does not seem to have direct equivalents for these tables.
5. Missing or Unmapped Items
Gaps in RAMSES Database:
- DEVICE_EXPOSURE: No direct equivalent table for device data, indicating a missing component.
- OBSERVATION_PERIOD: Needs to be inferred as there is no direct mapping.
- CARE_SITE and PROVIDER: No tables capturing this specific information. I believe this can be ignored but we need confirmation from medical professionals (Use cases), if this required.
I have put together this visualisation to map the OMOP tables to Ramses tables: OMOK vs RAMSES.drawio.pdf
I have found this website is very useful for understanding the OMOP CDM - https://ohdsi.github.io/CommonDataModel/cdm54.html
I would suggest we try the same steps with the data from the big data open dataset. The concept table has 3,902,588 entries which holds much more potential than the omock dataset. @AngharadGreen Please refer to #15 on accessing the data. Please comment here if you face any issues with accessing the data
@razekmh I have struggled trying to access the data from BigQuery public datasets but have found this https://github.com/OHDSI/ETL-CMS.git and I am working through the instructions to download the dataset as it's the same one on the BigQuery public datasets
@razekmh @AngharadGreen
library(Ramses)
library(omock)
library(dplyr)
library(AMR) # For drug name mapping
ramses_db <- connect_local_database("ramses-db.duckdb")
cdm <- mockCdmReference() |>
mockPerson() |>
mockObservationPeriod() |>
mockConditionOccurrence(recordPerson = 10) |>
mockDrugExposure(recordPerson = 40) |>
mockMeasurement(recordPerson = 5) |>
mockDeath(recordPerson = 1)
drug_prescriptions
tabledrug_prescriptions <- data.frame(
patient_id = c("5124578766", "4874231672", "6292626973", "6292626973"),
prescription_id = c("66cac1c5eab88d72c8b7687966357f5b", "8ccd67f4730b62ceafb8bcb27996c10c", "72cf4b592b0f4e2143b4bb9d7c569c97", "806c86b55cf50505a20b722f081c4075"),
rxsummary = c("Piperacillin / Tazobactam IVI 4.5 g TDS", "Ciprofloxacin ORAL 500 mg BD", "Flucloxacillin ORAL 500 mg 6H", "Metronidazole ORAL 400 mg TDS"),
authoring_date = as.POSIXct(c("2015-08-04 13:07:16", "2017-07-06 09:49:31", "2017-01-13 17:34:41", "2017-11-12 09:09:13")),
prescription_start = as.POSIXct(c("2015-08-04 14:45:16", "2017-07-06 10:26:31", "2017-01-13 18:48:41", "2017-11-12 09:44:13")),
prescription_end = as.POSIXct(c("2015-08-07 14:45:16", "2017-07-07 22:26:31", "2017-01-17 18:48:41", "2017-11-14 09:44:13")),
tr_DESC = c("Piperacillin / Tazobactam", "Ciprofloxacin", "Flucloxacillin", "Metronidazole"),
route = c("IV", "ORAL", "ORAL", "ORAL"),
dose = c(4.5, 500, 500, 400),
units = c("g", "mg", "mg", "mg"),
frequency = c("TDS", "BD", "6H", "TDS"),
duration = c(3, 1.5, 4, 2)
)
omop_drug_exposure <- drug_prescriptions %>%
mutate(
# Mapping patient_id to OMOP's person_id
person_id = patient_id,
# Prescription ID in RAMSES maps to drug_exposure_id in OMOP
drug_exposure_id = prescription_id,
# Authoring date and prescription start/end in RAMSES map to drug_exposure_start/end in OMOP
drug_exposure_start_date = prescription_start,
drug_exposure_end_date = prescription_end,
# Use AMR package to map drug names (tr_DESC) to OMOP's drug_concept_id
drug_code = AMR::as.ab(tr_DESC),
# Use route mapping (IV, ORAL, etc.) to route_concept_id
route_concept_id = case_when(
route == "IV" ~ 4132165, # Concept ID for intravenous route (example)
route == "ORAL" ~ 4122237, # Concept ID for oral route (example)
TRUE ~ NA_real_ # If unknown, NA
),
# Dose and units map to dose_value and dose_unit_concept_id in OMOP
dose_value = dose,
dose_unit_concept_id = case_when(
units == "g" ~ 8576, # Concept ID for grams (example)
units == "mg" ~ 8577, # Concept ID for milligrams (example)
TRUE ~ NA_real_
),
# Map frequency (TDS, BD, etc.) to free-text sig in OMOP, or standard mapping if available
sig = paste0(dose, units, " ", frequency),
# Calculate the duration of the drug exposure
duration_days = as.numeric(difftime(prescription_end, prescription_start, units = "days"))
)
print(omop_drug_exposure)
validate_drug_mapping <- function(df) {
if (all(!is.na(df$drug_code))) {
message("All drugs successfully mapped to AMR codes!")
} else {
message("Some drug mappings failed. Please check the following:")
print(df %>% filter(is.na(drug_code)))
}
}
validate_drug_mapping(omop_drug_exposure)
validate_prescriptions(omop_drug_exposure)
dbWriteTable(ramses_db, "drug_exposure", omop_drug_exposure, overwrite = TRUE)
loaded_data <- dbReadTable(ramses_db, "drug_exposure")
print(loaded_data)
dbDisconnect(ramses_db)
Edit: @razekmh is modifying the description of this issue to list out the task and our current understanding.
The aim of this issue is to gain a full understanding of how does OMOP data model translate to RAMSES data model.
A more specific target would be to add OMOP data to RAMSES such that we would be able to export a report of the Defined Daily Dose (DDD) per drug? per 1000 bed days for both prescription and administration per ward and per specialisation
Naturally, achieving this task requires us to have access to an accurate data model for each the origin and target standards. We will tackle each separately here:
RAMSES data model is partially described in the article on Objects and classes in Ramses . The article does not explain how import the data into RAMSES. We can see a glimpse of the data expected by RAMSES in the article on Validate and load electronic health records. in the validation article we can see that RAMSES offers some tools to process the EHR data and prepare it to be consumed by RAMSES. Our understanding of what does RAMSES expects will develop as we replicate the steps in the article on OMOP data.
OMOP is a standard by itself which means we can read on the structure in the official website. However, the actual data structure we will work with depends on the implantation in UCLH. This is due to two facts:
We try approximate the expected data structure from OMOP using two sources:
Each dataset/tool has its own limitations but we expect that working with them will help us at least partially understand the OMOP mapping to RAMSES while we are waiting for the access to the UCLH OMOP data. Accessing these dataset is described in issue #15
Here we describe the mapping of OMOP Data tables into RAMSES DB. We identify the mapping required between various attributes and the missing data items that need to be derived.