Medical-Event-Data-Standard / meds_etl

A collection of ETLs from common data formats to Medical Event Data Standard
Apache License 2.0
16 stars 3 forks source link

Test that the MIMIC-IV ETL runs on the MIMIC-IV demo dataset #4

Closed tompollard closed 6 months ago

tompollard commented 6 months ago

As discussed in https://github.com/Medical-Event-Data-Standard/meds_etl/issues/1, we would like to add some tests to confirm that the MIMIC ETL runs as expected.

This pull request:

tompollard commented 6 months ago

Sorry for the churn! The test now runs:

tompollard commented 6 months ago

My main request is that we want the test to at minimum try to load the resulting files with huggingface Datasets.

Thanks Ethan. I added a couple of simple tests as suggested.

tompollard commented 6 months ago

Hmm, looks like the build was broken by a recent update in meds:

https://github.com/Medical-Event-Data-Standard/meds/commit/e93f63a2f9642123c49a31ecffcdb84d877dc54a

So successful test?

EthanSteinberg commented 6 months ago

Good catch and sorry about that! I pushed an update to the meds package which should fix this.

https://pypi.org/project/meds/0.1.1/

Now the pyproject.toml file needs to be updated to require 0.1.1 instead of 0.1

Also, we shouldn't use subprocess here since it results in terrible error messages.

We should just import main from the corresponding module.

tompollard commented 6 months ago

Now the pyproject.toml file needs to be updated to require 0.1.1 instead of 0.1

Thanks, pull request updated and tests are passing again.

Also, we shouldn't use subprocess here since it results in terrible error messages. We should just import main from the corresponding module.

I think this change belongs in a new pull request, but if you let me know how you'd like to implement it then I can add it here. The ETL needs refactoring beforehand really.

EthanSteinberg commented 6 months ago

I think this change belongs in a new pull request, but if you let me know how you'd like to implement it then I can add it here. The ETL needs refactoring beforehand really.

Yep, let's just merge this as is.

tompollard commented 6 months ago

Thanks for reviewing!