eth-mds / ricu

🏥 ICU data with R 🏥
https://eth-mds.github.io/ricu/
GNU General Public License v3.0

Problem loading HIRID dataset #65

Closed KaiWU17TUM closed 1 month ago

KaiWU17TUM commented 2 months ago

Hi, I am trying to load the HIRID dataset. I run the code:

    library(ricu)
    attach_src("hirid", data_dir = 'PATH/TO/physionet.org/files/hirid/1.1.1/raw_stage')
    ricu::src_data_avail("hirid")

It gives no error, but the dataset still seems to be unattached:

       name available tables total
    1 hirid     FALSE      0     5

The folder architecture is like this:

    physionet.org/
    └── files
        └── hirid
            └── 1.1.1
                ├── imputed_stage
                ├── merged_stage
                ├── raw_stage
                │   ├── observation_tables
                │   │   └── parquet
                │   ├── observation_tables_parquet
                │   ├── pharma_records
                │   │   └── parquet
                │   └── pharma_records_parquet
                └── reference_data

Thanks in advance!

prockenschaub commented 2 months ago

Could you please check whether it is the same issue as #54?

KaiWU17TUM commented 2 months ago

Hi, I don't think it is the same problem.

I tried setting RICU_DATA_PATH explicitly as well as setting the path with the attach_src() function. Both give the same output:

       name available tables total
    1 hirid     FALSE      0     5

I am wondering if the code expects a specific data folder architecture?

dplecko commented 1 month ago

Hi,

Did you convert the data to fst format? This is done using import_src().
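The conversion step mentioned above could be sketched roughly as follows. This is a minimal sketch, not a confirmed fix: `import_and_check_hirid()` is a hypothetical helper name, and it assumes the raw files in `raw_dir` are in a layout that `import_src()` understands. Whether the parquet folders shown earlier in this thread qualify is not confirmed here.

```r
# Hypothetical helper (not part of ricu): convert raw HiRID files to fst,
# attach the source, and report how many tables ricu can see afterwards.
import_and_check_hirid <- function(raw_dir = NULL) {
  if (is.null(raw_dir)) {
    # Fall back to the location where ricu looks for HiRID data by default
    raw_dir <- ricu::src_data_dir("hirid")
  }
  # Convert the raw files found in raw_dir to fst format
  ricu::import_src("hirid", data_dir = raw_dir)
  # Attach the source from the same directory
  ricu::attach_src("hirid", data_dir = raw_dir)
  # Returns the availability summary (name, available, tables, total)
  ricu::src_data_avail("hirid")
}
```

Calling `import_and_check_hirid("PATH/TO/raw/files")` should then report the tables as available if the import succeeded. The function uses `ricu::` prefixes so that defining it does not require the package to be attached.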

Once you have done so, you should have the files organized as follows in the data_dir() location:

[Image: hirid-files]

That is, in data_dir() there is a folder called hirid. Inside, you should have the fst tables (note that observations and pharma are rather large and are thus partitioned into 15 and 2 chunks, respectively). Once you have this organization in your data_dir(), you should be good to go. Feel free to open a new issue if the problem persists after this fix.
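One way to check the layout described above is to count the fst files per table inside the hirid folder. The sketch below uses only base R. `count_fst_files()` is a hypothetical helper, and it assumes that a partitioned table is stored as a subfolder holding one fst file per chunk, which is not spelled out in this thread.

```r
# Hypothetical helper (not part of ricu): count .fst files per top-level
# entry of a source directory. A single-file table such as "general.fst"
# is counted under "general"; a partitioned table stored as a subfolder
# is counted under the subfolder name.
count_fst_files <- function(src_dir) {
  files <- list.files(src_dir, pattern = "\\.fst$", recursive = TRUE)
  # Take the first path component of each relative file path
  first <- vapply(strsplit(files, "/", fixed = TRUE), `[`, character(1), 1L)
  # Drop the extension so single-file tables are named like folders
  table(sub("\\.fst$", "", first))
}
```

For example, `count_fst_files(file.path(ricu::data_dir(), "hirid"))` would list each table with its number of fst files. An empty result suggests that the import step has not been run for this directory.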