MIT-LCP / mimic-code

MIMIC Code Repository: Code shared by the research community for the MIMIC family of databases
https://mimic.mit.edu
MIT License

Troubleshooting Slow MIMIC-III gz Data Import into PostgreSQL Database #1609

Open ccartermices opened 11 months ago

ccartermices commented 11 months ago

I imported the MIMIC-III data into the database using PostgreSQL and the 7zip commands:

DROP DATABASE IF EXISTS mimic;

CREATE DATABASE mimic OWNER postgres;
\c mimic;
CREATE SCHEMA mimiciii;
set search_path to mimiciii;
\i D:/postgraduate/doctor/second/mimic3/postgres_create_tables.sql
\set ON_ERROR_STOP 1
\set mimic_data_dir 'D:/postgraduate/doctor/second/mimic3'
\i D:/postgraduate/doctor/second/mimic3/postgres_load_data_7zip.sql

After running these commands, the output stopped updating and the import appeared to be stuck on the chartevents table for a long time:

mimic=# \i D:/postgraduate/doctor/second/mimic3/postgres_load_data_7zip.sql
COPY 58976
COPY 34499
COPY 7567

My machine runs Windows 10 with 16 GB of memory and no GPU; the hard disk storing the csv.gz data has 50 GB of free space. Is this normal? Is there a way to see the progress or to check that the import is working correctly?

heisenbug-1 commented 11 months ago

Hi! I think this is normal. This exact behaviour is described here: https://github.com/MIT-LCP/mimic-code/tree/main/mimic-iii/buildmimic/postgres

chartevents is a large table, so loading it will take a while.
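
For anyone who wants to confirm that the load is still making progress, here is a minimal sketch that can be run from a second psql session while the import is running. It assumes the database is named mimic, as in the commands above. The second query relies on the pg_stat_progress_copy view, which exists only in PostgreSQL 14 and later. Treat this as one way to check, not as part of the official build instructions.

```sql
-- Run in a second psql session connected to the mimic database.
-- Shows whether a COPY is still executing and what it is waiting on.
SELECT pid,
       state,
       wait_event_type,
       wait_event,
       now() - query_start AS running_for,
       left(query, 60)     AS query_head
FROM pg_stat_activity
WHERE datname = 'mimic'
  AND query ILIKE '%copy%'
  AND pid <> pg_backend_pid();

-- PostgreSQL 14 and later only: progress counters for each running COPY.
SELECT relid::regclass AS target_table,
       bytes_processed,
       tuples_processed
FROM pg_stat_progress_copy;
```

If the first query returns a row whose state is active, the import is still running. In the second query, rising counters across repeated runs are a sign that data is still being read.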

ccartermices commented 11 months ago

I have run it successfully. However, chartevents seems to copy 0 rows. I used the solution in https://github.com/MIT-LCP/mimic-code/issues/182 to count the rows in chartevents: SELECT COUNT(*) FROM chartevents; However, the query ran for a long time and appeared to get stuck.

tompollard commented 11 months ago

> I have run it successfully. However, the chartevent seems to copy 0.

It looks to me like the table built successfully. As I remember, PostgreSQL reports zero rows for chartevents because of the partitioning.
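
A full SELECT COUNT(*) has to scan every partition, so it can take a long time on a table this size. As a cheaper sanity check, the sketch below reads the planner's row estimates and the on-disk sizes from the system catalogs. It assumes the schema is mimiciii, as created above, and that the partition tables have names beginning with chartevents; adjust the pattern if your build names them differently.

```sql
-- Estimated row counts and on-disk sizes for chartevents and its partitions.
-- reltuples is only an estimate maintained by VACUUM / ANALYZE, so it may be
-- 0 or -1 until the tables have been analyzed; the size is reported regardless.
SELECT c.relname,
       c.reltuples::bigint                           AS estimated_rows,
       pg_size_pretty(pg_total_relation_size(c.oid)) AS total_size
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE n.nspname = 'mimiciii'
  AND c.relname LIKE 'chartevents%'
  AND c.relkind IN ('r', 'p')   -- ordinary and partitioned tables, not indexes
ORDER BY c.relname;
```

If the partitions show non-trivial sizes, the data was loaded even though the COPY on the parent table reported 0. Running ANALYZE on the tables first should make the row estimates more meaningful.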

Side note, but I'd recommend working with the data on Google BigQuery rather than building the database locally: https://mimic.mit.edu/docs/gettingstarted/cloud/bigquery/

ccartermices commented 11 months ago

Thanks! I will try.