eth-mds / ricu

🏥 ICU data with R 🏥
https://eth-mds.github.io/ricu/
GNU General Public License v3.0

Error while importing in `setup_src_data("mimic")` #29

Closed mjbroerman closed 6 months ago

mjbroerman commented 1 year ago

Hi, thanks for the great package!

I verified that the data is in `data_dir()`.

Here's a reprex:

ricu::setup_src_data("mimic")
#> The requested tables have already been downloaded
#> ── Importing 23 tables for `mimic` ─────────────────────────────────────────────
#> Warning: Encountered parsing problems for file CHARTEVENTS.csv.gz:
#>   • [7746655, 9]: got '' instead of closing quote at end of file
#>   • [7746655, NA]: got '9 columns' instead of 15 columns
#> • chartevents chunk 1
#> Error in setorderv(dat, sort_col): some columns are not in the data.table: ITEMID

Created on 2023-02-15 with reprex v2.0.2

mjbroerman commented 1 year ago

I think CHARTEVENTS.csv.gz, at least, is a problem: it fails to decompress.

matt ~ % gunzip repos/mimicIII/CHARTEVENTS.csv.gz
gunzip: repos/mimicIII/CHARTEVENTS.csv.gz: unexpected end of file
gunzip: repos/mimicIII/CHARTEVENTS.csv.gz: uncompress failed
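
The `unexpected end of file` message suggests a truncated archive. A minimal sketch for checking every archive in a directory without extracting anything, assuming a POSIX shell and `gzip` are available (the helper name `check_gz_dir` is hypothetical and not part of ricu):

```shell
#!/bin/sh
# check_gz_dir DIR: run `gzip -t` (integrity test, no extraction) on every
# .csv.gz file in DIR and report which archives are intact.
# NOTE: check_gz_dir is a hypothetical helper name, not a ricu function.
check_gz_dir() {
  dir=$1
  status=0
  for f in "$dir"/*.csv.gz; do
    [ -e "$f" ] || continue            # glob matched nothing: skip
    if gzip -t "$f" 2>/dev/null; then
      echo "OK      $f"
    else
      echo "CORRUPT $f"                # truncated or otherwise damaged
      status=1
    fi
  done
  return "$status"                     # non-zero if any archive failed
}
```

For example, `check_gz_dir repos/mimicIII` should print a `CORRUPT` line for CHARTEVENTS.csv.gz if that file is truncated, which would point to an incomplete download rather than a parsing bug.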

prockenschaub commented 1 year ago

This is odd. Did you download the data using ricu, or did you download it manually from PhysioNet?

mjbroerman commented 1 year ago

I downloaded it using ricu. Sorry if the second comment was confusing: I had moved CHARTEVENTS.csv.gz to the project folder I was working in to see if I could get that file working. The listing below also shows remnants of the manual extraction I tried, e.g. LABEVENTS.csv.

(base) matt@Matts-MacBook-Air-2 ~ % ls /Users/matt/Library/Application\ Support/ricu/mimic/
CHARTEVENTS.csv.gz         D_ICD_DIAGNOSES.csv.gz     INPUTEVENTS_MV.csv.gz      PRESCRIPTIONS.csv.gz       callout.fst
CPTEVENTS.csv.gz           D_ICD_PROCEDURES.csv.gz    LABEVENTS.csv              PROCEDUREEVENTS_MV.csv.gz  caregivers.fst
DATETIMEEVENTS.csv.gz      D_ITEMS.csv.gz             MICROBIOLOGYEVENTS.csv.gz  PROCEDURES_ICD.csv.gz      chartevents/
DIAGNOSES_ICD.csv.gz       D_LABITEMS.csv.gz          NOTEEVENTS.csv.gz          SERVICES.csv.gz            d_items.fst
DRGCODES.csv.gz            ICUSTAYS.csv               OUTPUTEVENTS.csv.gz        TRANSFERS.csv.gz           
D_CPT.csv.gz               INPUTEVENTS_CV.csv.gz      PATIENTS.csv.gz            admissions.fst 

prockenschaub commented 1 year ago

Can you try downloading the data as a zip file from PhysioNet, unzipping the folder (but not the individual files), and then running `import_src("mimic", path_to_mimic)`, where `path_to_mimic` points to the directory that contains the downloaded .csv.gz files? I am trying to understand whether this is a problem with ricu or with the downloaded data.

mjbroerman commented 1 year ago

Still working on this, but I think CHARTEVENTS.csv.gz from physionet won't extract.

lucasmiranda42 commented 1 year ago

Hi! Any news on this? :) I'm running into the exact same issue. Thanks!

nbenn commented 9 months ago

@lucasmiranda42 could you share the exact issue you've run into? Have you tried the suggestion by @prockenschaub of downloading the data manually and running the ricu import on that? Maybe the ricu-orchestrated download did not complete successfully? It's hard to say what the problem is without more concrete information.

lucasmiranda42 commented 9 months ago

Hi! Thank you for your answer, and apologies for the vague comment. I ended up trying an older version of the package (v0.4.0), which worked as intended. Unfortunately, I don't have the error messages anymore.

dplecko commented 6 months ago

I bumped MIMIC-IV to v2.2 today, and re-ran the import / setup of all tables, which worked out fine. If there are any problems with this, please open a new issue, and I will take a look!