Closed xinyuejohn closed 1 month ago
Hi,
pH and Temperature should already be extracted. Please tell me if you find similar results to mine with this piece of code:
import pandas as pd
from blended_preprocessing.omop_conversion import OMOP_converter

# Instantiate the converter; self.savedir and self.visit_occurrence are used below.
self = OMOP_converter(initialize_tables=True)

# Read one chunk of the measurement table.
fname = 'measurement_1'
df = pd.read_parquet(f'{self.savedir}/measurement/{fname}.parquet')

# Keep only the rows for each concept of interest.
df_pH = df.loc[df.measurement_concept_id == self.concept_mapping['pH']]
df_temp = df.loc[df.measurement_concept_id == self.concept_mapping['temperature']]

print(f'{df.visit_occurrence_id.nunique()} visits in {fname}')
print(f'{df_pH.visit_occurrence_id.nunique()} visits with pH measurement in {fname}')
print(f'{df_temp.visit_occurrence_id.nunique()} visits with temperature measurement in {fname}')

# The source dataset is the prefix of visit_source_value, before the first '-'.
print('\nNumber of pH measurements from each source dataset:')
print(self.visit_occurrence.loc[df_pH.visit_occurrence_id.values].visit_source_value.map(lambda x: x.split('-')[0]).value_counts())
print('\nNumber of temp measurements from each source dataset:')
print(self.visit_occurrence.loc[df_temp.visit_occurrence_id.values].visit_source_value.map(lambda x: x.split('-')[0]).value_counts())

You should get something like this (the second table repeats the pH counts because I indexed with df_pH in the last line when I ran it; with df_temp you will get the temperature counts instead):

3795 visits in measurement_1
1749 visits with pH measurement in measurement_1
1059 visits with temperature measurement in measurement_1

Number of pH measurements from each source dataset:
visit_source_value
amsterdam    5619
eicu         3891
mimic        3600
hirid        2048
mimic3       1429
Name: count, dtype: int64

Number of temp measurements from each source dataset:
visit_source_value
amsterdam    5619
eicu         3891
mimic        3600
hirid        2048
mimic3       1429
Name: count, dtype: int64
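If you want to see what the check above does without running the whole pipeline, here is a small self-contained sketch of the same logic on toy data. The concept ids, visit ids and the source_counts helper are all made up for illustration and are not taken from the repository; the only thing carried over is the idea that the source dataset is the prefix of visit_source_value.

```python
import pandas as pd

# Hypothetical concept ids, for illustration only.
concept_mapping = {"pH": 1, "temperature": 2}

# Toy visit table indexed by visit_occurrence_id (assumed layout).
visit_occurrence = pd.DataFrame(
    {"visit_source_value": ["mimic3-10", "eicu-20", "hirid-30"]},
    index=pd.Index([100, 101, 102], name="visit_occurrence_id"),
)

# Toy measurement table: three pH rows and two temperature rows.
measurement = pd.DataFrame(
    {
        "visit_occurrence_id": [100, 100, 101, 102, 102],
        "measurement_concept_id": [1, 1, 1, 2, 2],
        "value_as_number": [7.40, 7.35, 7.42, 36.8, 37.1],
    }
)


def source_counts(df, visits):
    """Count measurement rows per source dataset (prefix of visit_source_value)."""
    sources = visits.loc[df.visit_occurrence_id.values, "visit_source_value"]
    return sources.str.split("-").str[0].value_counts()


df_pH = measurement.loc[measurement.measurement_concept_id == concept_mapping["pH"]]
df_temp = measurement.loc[measurement.measurement_concept_id == concept_mapping["temperature"]]

ph_counts = source_counts(df_pH, visit_occurrence)
temp_counts = source_counts(df_temp, visit_occurrence)
print(ph_counts)
print(temp_counts)
```

Note that value_counts here counts measurement rows, not distinct visits, which is why the per-source totals can be larger than the number of visits.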
As for adding new timeseries measurements, I should probably make a detailed guide... but here are the important steps:
1. Add the variable to auxillary_files/user_input/timeseries_variables.csv. This should specify the name of the variable in one or more datasets and the corresponding OMOP concept_id.
2. The variable should then be in the tsp.kept_ts list; you may check that it is the case. If the variable unit has to be harmonized between datasets, it can be done in the _harmonize_{dataset} functions of database_processing/timeseries_preprocessing.py.
3. Units are mapped in OMOP_converter.unit_mapping.
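To make these steps a bit more concrete, here is a rough sketch of what a variable entry and a unit harmonization could look like. This is not the repository's actual code: the CSV column names, the placeholder concept id 1234567, the source variable names and the harmonize_temperature function are all assumptions made for illustration, so check the real timeseries_variables.csv and the existing _harmonize_{dataset} functions before copying anything.

```python
import io

import pandas as pd

# Hypothetical layout of one row in timeseries_variables.csv.
# The real column names and the concept id may differ.
csv_text = """concept_name,concept_id,mimic3,eicu
temperature,1234567,Temperature Fahrenheit,Temperature (C)
"""
variables = pd.read_csv(io.StringIO(csv_text))


def harmonize_temperature(df, unit_col="unit", value_col="value"):
    """Convert Fahrenheit rows to Celsius so that all sources share one unit."""
    out = df.copy()  # leave the caller's frame untouched
    is_f = out[unit_col] == "F"
    out.loc[is_f, value_col] = (out.loc[is_f, value_col] - 32.0) * 5.0 / 9.0
    out.loc[is_f, unit_col] = "C"
    return out


# Toy input: one Fahrenheit reading and one Celsius reading.
raw = pd.DataFrame({"value": [98.6, 37.0], "unit": ["F", "C"]})
harmonized = harmonize_temperature(raw)
print(variables)
print(harmonized)
```

The point of the sketch is only the shape of the work: one row per variable linking source names to a concept id, and a per-dataset conversion that brings values to a common unit.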
Please tell me if you need assistance in doing so.
Thanks for your answer! I will probably try to add more timeseries measurements using your steps next week.
Great! If you do, feel free to submit a pull request.
Hi, I was wondering if I can configure additional measurements that I want to add to the final OMOP measurement table.
For example, I want to extract pH and Temperature from MIMIC-III as well. Could you tell me if there's an easy way to achieve this? Thank you!