Closed SSMK-wq closed 4 years ago
If your data is not properly indexed, clustered, etc. this will take a very very long time. Also, this depends on what kind of machine you are running this. Do you actually see the query running in your DBMS?
Hi @jmbanda ,
Yes, my tables are indexed and system has enough memory as shown below. Not sure why it's taking a long time to fetch data for just 200 patients. I understand it depends on system config, but can you let me know how long did it take for you and your system config please?
condition_occurrence | idx_condition_concept_id | CREATE INDEX idx_condition_concept_id ON cdm.condition_occurrence USING btree (condition_concept_id)
-- | -- | --
condition_occurrence | idx_condition_person_id | CREATE INDEX idx_condition_person_id ON cdm.condition_occurrence USING btree (person_id)
condition_occurrence | idx_condition_visit_id | CREATE INDEX idx_condition_visit_id ON cdm.condition_occurrence USING btree (visit_occurrence_id)
condition_occurrence | xpk_condition_occurrence | CREATE UNIQUE INDEX xpk_condition_occurrence ON cdm.condition_occurrence USING btree (condition_occurrence_id)
drug_exposure | idx_drug_concept_id | CREATE INDEX idx_drug_concept_id ON cdm.drug_exposure USING btree (drug_concept_id)
drug_exposure | idx_drug_person_id | CREATE INDEX idx_drug_person_id ON cdm.drug_exposure USING btree (person_id)
drug_exposure | idx_drug_visit_id | CREATE INDEX idx_drug_visit_id ON cdm.drug_exposure USING btree (visit_occurrence_id)
drug_exposure | xpk_drug_exposure | CREATE UNIQUE INDEX xpk_drug_exposure ON cdm.drug_exposure USING btree (drug_exposure_id)
ip_lab_visits | ip_lab_visits_person_id_idx | CREATE INDEX ip_lab_visits_person_id_idx ON cdm.ip_lab_visits USING btree (person_id)
ip_lab_visits | ip_lab_visits_vis_occ_id_idx | CREATE INDEX ip_lab_visits_vis_occ_id_idx ON cdm.ip_lab_visits USING btree (visit_occurrence_id)
measurement | idx_measurement_concept_id | CREATE INDEX idx_measurement_concept_id ON cdm.measurement USING btree (measurement_concept_id)
measurement | idx_measurement_person_id | CREATE INDEX idx_measurement_person_id ON cdm.measurement USING btree (person_id)
measurement | idx_measurement_visit_id | CREATE INDEX idx_measurement_visit_id ON cdm.measurement USING btree (visit_occurrence_id)
measurement | xpk_measurement | CREATE UNIQUE INDEX xpk_measurement ON cdm.measurement USING btree (measurement_id)
observation | idx_observation_concept_id | CREATE INDEX idx_observation_concept_id ON cdm.observation USING btree (observation_concept_id)
observation | idx_observation_person_id | CREATE INDEX idx_observation_person_id ON cdm.observation USING btree (person_id)
observation | idx_observation_visit_id | CREATE INDEX idx_observation_visit_id ON cdm.observation USING btree (visit_occurrence_id)
observation | xpk_observation | CREATE UNIQUE INDEX xpk_observation ON cdm.observation USING btree (observation_id)
observation_period | idx_observation_period_id | CREATE INDEX idx_observation_period_id ON cdm.observation_period USING btree (person_id)
op_visits | op_visits_person_id_idx | CREATE INDEX op_visits_person_id_idx ON cdm.op_visits USING btree (person_id)
patient_visits | patient_visits_person_id_idx | CREATE INDEX patient_visits_person_id_idx ON cdm.patient_visits USING btree (person_id)
person | idx_person_id | CREATE UNIQUE INDEX idx_person_id ON cdm.person USING btree (person_id)
person | xpk_person | CREATE UNIQUE INDEX xpk_person ON cdm.person USING btree (person_id)
person_age | person_age_id_idx | CREATE UNIQUE INDEX person_age_id_idx ON cdm.person_age USING btree (person_id)
visit_occurrence | idx_visit_concept_id | CREATE INDEX idx_visit_concept_id ON cdm.visit_occurrence USING btree (visit_concept_id)
visit_occurrence | idx_visit_occurrence_id | CREATE INDEX idx_visit_occurrence_id ON cdm.visit_occurrence USING btree (visit_occurrence_id)
visit_occurrence | idx_visit_person_id | CREATE INDEX idx_visit_person_id ON cdm.visit_occurrence USING btree (person_id)
And this is my system memory usage
131945672 4989520 104741280 9955008 22214872 116133208
Hello @jmbanda ,
May I kindly check with you on whether you had a chance to look at this issue and the other issues? Would really be helpful to seek your inputs and experiment with APHRODITE at my end
Are all the other constraints also created? Is the function actually running when it is not returning anything?
Yes, all other constraints are in place. I guess you are asking about FK constraints etc. CDM table definition are followed as is.
Yes, the function is running and no output. My total number of unique subjects is 5K
Then it some sort of issue on your system. I would try with 10 patients first to see how long it takes to continue.
Hello @jmbanda ,
I was reading the APHRODITE manual and going through the code files. I see that we have two parameters called
nCases
andnControls
.Though I configure them as 200, I see that getPatientData function is running for more than an hour and no output yet. Moreover, it's only me who is accessing the db and no one has access to it.
In addition our patient's count in db is 5K.
Is this an expected behavior?