MIT-LCP / mimic-code

MIMIC Code Repository: Code shared by the research community for the MIMIC family of databases
https://mimic.mit.edu
MIT License

Tables seem to be created and compressed files loaded (step 6), but the dataset does not seem to be working (step 7) #1645

Open iamjingxian opened 1 year ago

iamjingxian commented 1 year ago

Prerequisites

I am stuck at the last two steps of uploading the data and creating the datasets in Google Cloud: https://github.com/MIT-LCP/mimic-iv/tree/master/buildmimic/bigquery

step 6 seemed to work

step 7 was not successful

How can I proceed to create the dataset?

1st screenshot. upload_mimic4_v1_0.sh seemed to work

image

2nd screenshot. table not found; unable to validate dataset works

image

3rd screenshot. location of dataset is in US

image

4th screenshot. dataset indeed does not contain the anticipated/required tables

image
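
One way to confirm which tables a dataset actually contains is to query BigQuery's built-in `INFORMATION_SCHEMA.TABLES` view and compare the result with the tables you expect. The sketch below only builds the query text and does the comparison; `my-project` and `my_dataset` are hypothetical placeholders, not names taken from the build script, and the expected table list is just an illustrative pair of MIMIC-IV tables.

```python
def list_tables_sql(project_id: str, dataset_id: str) -> str:
    """Build a query that lists every table in one BigQuery dataset."""
    # INFORMATION_SCHEMA.TABLES is a metadata view that BigQuery exposes per dataset.
    return (
        "SELECT table_name, table_type\n"
        f"FROM `{project_id}.{dataset_id}.INFORMATION_SCHEMA.TABLES`\n"
        "ORDER BY table_name"
    )


def missing_tables(expected, found):
    """Return the expected table names that are absent from the dataset."""
    return sorted(set(expected) - set(found))


if __name__ == "__main__":
    # Hypothetical identifiers: replace with your own project and dataset.
    print(list_tables_sql("my-project", "my_dataset"))
    # Illustrative check: compare expected tables with what the query returned.
    print(missing_tables(["patients", "admissions"], ["patients"]))
```

Paste the generated query into the BigQuery console; an empty result would suggest the load step wrote its tables to a different dataset or project than the one being inspected.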

tompollard commented 1 year ago

@iamjingxian Do you really need to load the data onto BigQuery? It is already on there, available in the physionet-data project.

To access the data:

  1. Go to https://physionet.org/content/mimiciv/2.2/#files (or whatever module you are interested in).
  2. Click "Request access" on BigQuery. This adds you to the permissions group.
  3. Refresh the physionet-data bucket to view the dataset.

iamjingxian commented 1 year ago

Hi Tom, thanks for the reply!

I need to generate other tables/schemas, and hence followed the instructions to create mimic4_v1_0.

I am also encountering issues accessing the physionet-data bucket on Google Cloud. There is an error about "no billing accounts linked", which I am trying to resolve with Google Cloud. It keeps reverting to "no accounts linked" even after I have selected the project to link the billing account to.

Separately, I am able to see physionet-data and the derived tables. (The 4th screenshot is truncated in the preview; the full view shows that physionet-data is loaded.) My understanding, though, is that this is a query-only option and does not allow me to create other schemas/tables.

alistairewj commented 1 year ago

You need to link your Google project to a billing account on Google. Once you create a billing account (with your payment information) you can attach it to your project. Then any resources consumed by that project are billed to the linked billing account. It basically lets you have one set of billing data for more than one project.
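
If the console keeps showing no linked account, it may help to check the link from the command line. The sketch below assumes you have the gcloud CLI installed and have saved the output of `gcloud billing projects describe PROJECT_ID --format=json`; it only interprets that JSON. The sample values, including the billing account ID, are made up for illustration.

```python
import json


def billing_status(describe_json: str) -> str:
    """Summarise the billing link reported for a Google Cloud project."""
    info = json.loads(describe_json)
    # billingAccountName is empty or absent when no account is linked.
    account = info.get("billingAccountName", "")
    if not account:
        return "no billing account linked"
    # An account can be linked while billing is disabled (e.g. a closed account).
    if not info.get("billingEnabled", False):
        return f"linked to {account}, but billing is disabled"
    return f"linked to {account}, billing enabled"


if __name__ == "__main__":
    # Hypothetical sample output; the IDs below are placeholders.
    sample = json.dumps({
        "projectId": "my-project",
        "billingAccountName": "billingAccounts/000000-000000-000000",
        "billingEnabled": True,
    })
    print(billing_status(sample))
```

A result of "no billing account linked" for the project you are working in would point to the link not having been saved, or to the account being attached to a different project.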

For the tables, you are correct that everything under physionet-data is read only, including the derived tables. Once you are set up, you can create new BigQuery datasets (aka schemas) under your project and then new tables within those BigQuery datasets.
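
As a rough sketch of that workflow, the helpers below build the BigQuery DDL for creating a dataset under your own project and then a table inside it from a query on the read-only data. All identifiers are assumptions for illustration: `my-project`, `my_derived` and `patient_gender` are hypothetical, and the source table name `physionet-data.mimiciv_hosp.patients` should be checked against what you actually see under physionet-data.

```python
def create_dataset_sql(project_id: str, dataset_id: str, location: str = "US") -> str:
    """Build DDL that creates a BigQuery dataset (schema) in your own project."""
    # A query can only combine datasets stored in the same location,
    # so the new dataset should match the location of the source data.
    return (
        f"CREATE SCHEMA IF NOT EXISTS `{project_id}.{dataset_id}`\n"
        f"OPTIONS (location = '{location}')"
    )


def create_table_sql(project_id: str, dataset_id: str, table_id: str, select_sql: str) -> str:
    """Build DDL that materialises a query result as a table you own."""
    return (
        f"CREATE OR REPLACE TABLE `{project_id}.{dataset_id}.{table_id}` AS\n"
        f"{select_sql}"
    )


if __name__ == "__main__":
    # Hypothetical names; the source table name is an assumption to verify.
    source = "SELECT subject_id, gender FROM `physionet-data.mimiciv_hosp.patients`"
    print(create_dataset_sql("my-project", "my_derived"))
    print(create_table_sql("my-project", "my_derived", "patient_gender", source))
```

Both statements can be run in the BigQuery console with your own project selected, so the query costs and the new tables are attributed to your project rather than to physionet-data.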

iamjingxian commented 1 year ago

Hi Alistair, thanks for the reply! It looks like I already had the payment account set up (see screenshot attached).

Or am I mistaken, and am actually missing specific steps?

image