illidanlab / urgent-care-comparative

Predictive Modeling in Urgent Care - A Comparative Study of Machine Learning Approaches
GNU General Public License v3.0

Usability (Deployment) Question #6

Open nasatony opened 5 years ago

nasatony commented 5 years ago

How can I input only one patient's medical history and demographic information into a built model and retrieve the DDX for that one patient?

Let us assume that the patient's medical information is in the same MIMIC-III format, so the model can take, for example, X19 + demo as the patient input.

It would be great for practical usage and deployment if this capability were available!

af1tang commented 5 years ago

Hi @nasatony, my apologies for the late reply.

There is a way to do this without making new methods.
First, observe that the main function in main.py returns model, stats. We can simply use this model to predict a single patient's labels.

Suppose we are interested in the i-th patient in our test set, X_te. We can simply do diagnosis = model.predict(x = [X_te[i].reshape(1, X_te.shape[1], X_te.shape[2]), Z_te[i].reshape(1, Z_te.shape[1])], y= [y_te[i]]).

Note that the above assumes that X is a tensor (m x t x n), Z (i.e., demographics) is m x n, and y is m x 25 x 1 (i.e., 25 DDX labels, each 1-dimensional). You can modify the input dimensions according to your input, auxiliary input, and label shapes. The resulting diagnosis gives the DDX results for the i-th patient of interest.
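
A minimal sketch of the slicing step described above, using NumPy only and random stand-in arrays (the shapes are made up for illustration and are not taken from the repository). Slicing with i:i+1 instead of i keeps the leading batch axis, so the same line should work whether X is m x t x n or already flattened to m x n, without a hard-coded shape[2]. The helper name single_patient_inputs is hypothetical.

```python
import numpy as np

def single_patient_inputs(X, Z, i):
    """Return [x_i, z_i], each with a leading batch axis of length 1.

    X[i:i+1] keeps the first axis, so this works for 2-D and 3-D X alike.
    """
    if not 0 <= i < len(X):
        raise IndexError(f"patient index {i} out of range for {len(X)} rows")
    return [X[i:i + 1], Z[i:i + 1]]

# Stand-in data: 5 patients, 48 time steps, 19 features, 4 demographic columns.
rng = np.random.default_rng(0)
X_seq = rng.random((5, 48, 19))          # sequence features (m x t x n)
X_flat = X_seq.reshape(5, -1)            # flattened features (m x t*n)
Z = rng.random((5, 4))                   # demographics (m x n)

x_i, z_i = single_patient_inputs(X_seq, Z, 2)
print(x_i.shape, z_i.shape)              # (1, 48, 19) (1, 4)

x_i, z_i = single_patient_inputs(X_flat, Z, 2)
print(x_i.shape, z_i.shape)              # (1, 912) (1, 4)
```

The returned list could then be passed as model.predict(x=inputs). Keras Model.predict takes inputs only and has no y argument, so the label array is not needed at inference time.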

nasatony commented 5 years ago

Greetings Andy,

Question 1: Why does your suggestion cause an error (see below)?

Question 2: Since tensor X_te (m x t x n) and Z_te (m x n) have different sizes, how can you be certain that X_te[i] and Z_te[i] represent the same patient?

Question 3: What is a pragmatic way of selecting a patient from the test set array X_te? Maybe I can list all patient IDs in the test set, but then how does one determine which i-th array element maps to a particular patient ID?

Question 4: I plan to save the 'model' returned by the main method as 'model_data.h5', then load and run this model with X_te and Z_te. Would you recommend saving in .h5 or in .params/*.json file format?

Thanks in advance for your great advice!

main.py [snippet]

# i is the patient of interest
i = 10
print('X_te size = ', X_te.size, 'Z_te size = ', Z_te.size)
print('X_te[10] = ', X_te[i])
diagnosis = model.predict(x=[X_te[i].reshape(1, X_te.shape[1], X_te.shape[2]), Z_te[i].reshape(1, Z_te.shape[1])], y=[y_te[i]])
print('diagnosis = ', diagnosis)

$ python main.py --features local_mimic/save/X48.npy --auxiliary_dir local_mimic/save/demo.npy --y_dir local_mimic/save/y --model mlp --task dx --checkpoint_dir local_mimic/save/checkpoint --hidden_size 256 --learning_rate 0.005 --nepochs 10 --batch_size 32 . . . X_te size = 838736 Z_te size = 419368 X_te[0] = [2.67605634e-01 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 8.46153846e-01 6.84931507e-02 4.65116279e-02 3.23529412e-01 2.23214286e-01 8.05084746e-02 7.61904762e-02 1.66666667e-01 1.51515152e-01 3.70370370e-02 5.88235294e-02 1.04477612e-01 2.95238095e-01 4.86486486e-01 5.65217391e-01 9.63414634e-01 5.55555556e-01 1.00000000e+00 7.17948718e-01 8.46153846e-01 1.24809741e-01 1.16279070e-01 8.65546218e-01 5.04807692e-01 4.83050847e-01 3.80952381e-01 1.00000000e+00 7.50000000e-01 8.51851852e-01 7.64705882e-01 1.00000000e+00 2.47947923e-01 2.03626247e-01 3.05361741e-01 4.66125813e-01 4.66655126e-01 6.14258955e-01 2.63142759e-01 8.46153846e-01 9.08591995e-02 6.75816338e-02 5.97832779e-01 4.17293861e-01 1.71058439e-01 1.62891511e-01 5.07016782e-01 5.24673822e-01 5.38323045e-01 3.11121324e-01 4.08504353e-01 8.39861156e-02 3.52281616e-01 3.20681361e-01 5.65323158e-01 3.55074947e-01 6.88380250e-01 4.05538892e-01 2.42441022e-15 2.98725315e-02 3.53325658e-02 4.70849084e-01 1.49978453e-01 2.40048931e-01 1.30292862e-01 4.63238466e-01 2.75120202e-01 5.58984492e-01 3.99084270e-01 5.74645913e-01 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 1.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 1.00000000e+00 0.00000000e+00 
0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 1.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 1.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 1.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00] Traceback (most recent call last): File "main.py", line 332, in diagnosis = model.predict(x=[X_te[i].reshape(1, X_te.shape[1], X_te.shape[2]), Z_te[i].reshape(1, Z_te.shape[1])], y=[y_te[i]]) IndexError: tuple index out of range
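
A note on reading the output above: NumPy's .size is the total number of elements, not the dimensions, so it does not reveal whether X_te is 2-D or 3-D. The printed row is a flat vector, which suggests (an inference from the log, not something the repository confirms) that the features in this run are 2-D. In that case X_te.shape has only two entries and X_te.shape[2] would raise exactly this IndexError before predict is ever called. A small sketch with stand-in arrays of made-up shapes reproduces the behavior:

```python
import numpy as np

# Stand-in arrays; the real X_te and Z_te would come from the .npy files.
X_te = np.zeros((6, 10))   # assumed 2-D features (m x n)
Z_te = np.zeros((6, 3))    # demographics (m x n)

print(X_te.size, Z_te.size)    # 60 18  -> element counts only
print(X_te.shape, X_te.ndim)   # (6, 10) 2

try:
    X_te.shape[2]              # shape is a 2-tuple, so index 2 is invalid
except IndexError as err:
    message = str(err)
    print("IndexError:", message)   # IndexError: tuple index out of range
```

Printing .shape and .ndim for each loaded array before building the inputs is a cheap way to check which layout a given feature file uses.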



nasatony commented 5 years ago

Are the X input shapes different depending on the auxiliary_dir option? For example, the following code snippet works for demographics only (i.e. demo.npy):

from keras.models import load_model

def predict(filename, X, Z):
    model = load_model(filename)
    id = get_id(opts.patient)
    result = model.predict(x = [X[id].reshape(1, X.shape[1], X.shape[2]), Z[id].reshape(1, Z.shape[1])])
    return result

However, if auxiliary_dir is diagnostic histories only (i.e. w2v.npy), the same call fails:

result = model.predict(x = [X[id].reshape(1, X.shape[1], X.shape[2]), Z[id].reshape(1, Z.shape[1])])
IndexError: tuple index out of range

Does the same apply to diagnostic histories + demographics (i.e. h2v.npy)?

Thanks in advance !
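
The predict snippet in this comment calls a get_id helper that is not shown in the thread. One way such a lookup could be written, assuming a list of patient identifiers was saved in the same row order as X and Z (an assumption; the thread does not say whether the preprocessing keeps such a list), is a plain dictionary from identifier to row index. The names build_index and patient_ids, and the sample identifiers, are hypothetical.

```python
import numpy as np

def build_index(patient_ids):
    """Map each patient identifier to its row position."""
    index = {}
    for row, pid in enumerate(patient_ids):
        if pid in index:
            raise ValueError(f"duplicate patient id: {pid}")
        index[pid] = row
    return index

# Hypothetical identifiers, assumed to be in the same order as the rows of X and Z.
patient_ids = [1001, 1007, 1042, 1100]
X = np.arange(8.0).reshape(4, 2)
Z = np.arange(4.0).reshape(4, 1)

row_of = build_index(patient_ids)
row = row_of[1042]
print(row, X[row], Z[row])   # 2 [4. 5.] [2.]
```

If the arrays are shuffled or split into train and test sets, the identifier list would need the same permutation or split applied for the mapping to stay valid.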