TheJacksonLaboratory / MIMIC_HPO

0 stars 0 forks source link

Get lab summary from Java #11

Closed kingmanzhang closed 5 years ago

kingmanzhang commented 5 years ago

@aeyates Since you have the dababase set up, I suggest we can go through the transformation process together to prepare the data for analysis. I will make requests to let you revise/optimize my implementation or start from scratch. Hopefully, this will save your time as well.

Step 1: Get reference ranges for normal Qn lab findings I did this by manually running a sql query. Could you help implement this class (link to sql query provided in class documentation)? https://github.com/TheJacksonLaboratory/MIMIC_HPO/blob/develop/src/main/java/org/jax/io/QnLabSummarizeDbImpl.java

aeyates commented 5 years ago

@kingmanzhang Sounds good. I was just looking through the LabSummary piece to understand what you did. I see a command line implementation where you read directly from the CSV (SummarizeLabCmd), but hadn't found the sql where you refined this. I'll let you know if I have questions.

kingmanzhang commented 5 years ago

Yes, I started trying to compute by myself, which is a total waste of my time because a few lines of sql query does the job perfectly (https://github.com/TheJacksonLaboratory/MIMIC_HPO/blob/master/src/main/resources/sql/analysis.sql). I did not set up Spring/JPA properly so I manually run the query and haven't got a chance to update it.

aeyates commented 5 years ago

Great, I will probably replace that Command Line implementation then. I have Hibernate set up and working to query lab events, though I'm not sure how much overhead it will add to use it as opposed to straight SQL queries. LabEvents is a very large table, and just getting a count takes about a minute.

kingmanzhang commented 5 years ago

Do not forget to run the SQL file for creating table indices.

aeyates commented 5 years ago

What SQL file are you referring to for creating indices? To import data, I used the define.sql script. But I've encountered some issues with the table names being capitalized, as Unix based systems are case sensitive and other queries refer to the lower-cased name. I loaded the database in MySQL5.7, but some of the queries in the analysis script will only work in MySQL 8, and my upgrade failed. I'm trying to reload everything in MySQL8 now. I'd suggest lowercasing all the table names for better consistency across platforms.

kingmanzhang commented 5 years ago

If you are using the define.sql within this repo, only selected tables are imported. The define.sql file is a subset of the define.sql file here: https://github.com/MIT-LCP/mimic-code/tree/master/buildmimic/mysql They also provided a index.sql file for creating indices (same place as above). I run a couple of queries in this file to speed up things. But I forgot to create a file for those indexes that I ran (Sorry).

aeyates commented 5 years ago

Thanks, I had done a little searching to see if there were already scripts for converting to MySQL on the web but hadn't found this reference, so I was only going by what was in your repository. I'll try to get things set up properly, as queries are very painful.

kingmanzhang commented 5 years ago

You could run the entirety of define.sql and index.sql if you do not have space limit. I hosted the database on my laptop so I just imported the tables that I need.