Optum / retain-keras

Reimplementation of RETAIN Recurrent Neural Network in Keras
Apache License 2.0

Is the ROC-AUC score calculation correct? #30

Open bliu188 opened 2 years ago

bliu188 commented 2 years ago

I am testing the RETAIN model but am puzzled by the difference between ROC-AUC and accuracy. The ROC-AUC value suggests a nearly non-discriminative model (Epoch: 9 - ROC-AUC: 0.503192, PR-AUC: 0.147242), yet the accuracy of 0.8534 is not bad. It is hard to imagine the model is already overfitting at epoch 9. The on_epoch_end callback uses predict_on_batch, which seems fine. I have no clue what is wrong here.
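For context on why those two numbers can coexist: with roughly 85% negatives, a model that scores every example as negative already reaches about 0.85 accuracy while its ROC-AUC stays at 0.5. A quick synthetic sanity check with scikit-learn (toy data only, not the repo's evaluation code):

```python
import numpy as np
from sklearn.metrics import accuracy_score, roc_auc_score

rng = np.random.default_rng(0)
y_true = (rng.random(10_000) < 0.15).astype(int)  # ~15% positives

# Degenerate "model": the same low score for every example
scores = np.full(y_true.shape, 0.1)
y_pred = (scores >= 0.5).astype(int)              # every prediction is negative

print("accuracy:", accuracy_score(y_true, y_pred))  # ~0.85, just the negative prevalence
print("roc-auc :", roc_auc_score(y_true, scores))   # 0.5, no discrimination at all
```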

jstremme commented 2 years ago

@bliu188, can you share all the output from your training run? Also, what's the prevalence of your target variable?

bliu188 commented 2 years ago

Thanks, Joel.

What kind of output would be helpful for you? I added keras.metrics.AUC to cross-check the ROC-AUC in the implementation. The two agree at ~0.5, suggesting nothing is wrong with the AUC calculation itself.
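For reference, the built-in Keras AUC metrics can be attached at compile time alongside the callback-based calculation. The optimizer and loss below are illustrative binary-classification defaults, not necessarily what the training script in this repo uses, and `model` stands for the constructed RETAIN model:

```python
import tensorflow as tf

# Illustrative compile call; `model` is a placeholder for the built RETAIN model.
model.compile(
    optimizer=tf.keras.optimizers.Adamax(learning_rate=1e-3),
    loss="binary_crossentropy",
    metrics=[
        tf.keras.metrics.AUC(name="auc"),              # ROC-AUC computed by Keras
        tf.keras.metrics.AUC(curve="PR", name="prc"),  # PR-AUC for comparison
    ],
)
```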

The target prevalence is about 20%. I tried sorting the visit sequences ascending and descending; it does not appear to make any difference. However, a two-layer LSTM gave 0.8 AUC, Deep Records gave 0.75 AUC, and a GRU with pyTorch_ehr gave 0.76 AUC. The RETAIN model should be competitive based on a recent comparison (https://doi.org/10.1016/j.jbi.2019.103337). I have not done any model tuning, but I do not think tuning would make this much of a difference. The number of epochs did not affect the test AUC either; training seems to show no improvement at all with more epochs.

Thanks, Bing

tRosenflanz commented 2 years ago

It might be a learning rate issue or something with the data prep since it is a bit more complicated than a simple LSTM.

The ROC-AUC calculation should be correct. The PR-AUC of an untrained model equals the positive class prevalence, which explains why you are getting around 0.15.
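To make that baseline concrete: with uninformative scores, precision sits near the positive prevalence at every recall level, so average precision (PR-AUC) lands near the prevalence while ROC-AUC lands near 0.5. A synthetic illustration, not this repo's metrics code:

```python
import numpy as np
from sklearn.metrics import average_precision_score, roc_auc_score

rng = np.random.default_rng(0)
y_true = (rng.random(100_000) < 0.15).astype(int)  # 15% positive prevalence
scores = rng.random(100_000)                       # scores carry no signal

print("roc-auc:", roc_auc_score(y_true, scores))            # ~0.50
print("pr-auc :", average_precision_score(y_true, scores))  # ~0.15, i.e. the prevalence
```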

jstremme commented 2 years ago

I was just curious to see the AUC at each epoch. Typically RETAIN achieves its best validation AUC in fewer than 10 epochs. You could always try dropping the learning rate to see if that yields any benefits. Hope this helps, and thanks for the paper link!
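In plain Keras, dropping the learning rate just means passing a smaller value to the optimizer before compiling. I'd have to check whether the training script exposes this as a flag, so treat this as a generic sketch rather than the repo's actual interface:

```python
from tensorflow.keras.optimizers import Adamax

# Adamax defaults to learning_rate=1e-3; an order of magnitude lower is a
# cheap first experiment when validation AUC is stuck near 0.5.
optimizer = Adamax(learning_rate=1e-4)
# ...then pass `optimizer` to model.compile(...) wherever the model is built.
```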

bliu188 commented 2 years ago

I have tried a learning rate scheduler with Adamax, which surprisingly gave a similar AUC. The most probable cause is the data or how the data is loaded; I have not had a chance to verify this yet.
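For anyone following along, the standard Keras options for scheduling are the ReduceLROnPlateau and LearningRateScheduler callbacks; the values below are illustrative only, not the exact ones from this run:

```python
import tensorflow as tf

# Halve the learning rate whenever validation loss stops improving for 2 epochs.
reduce_lr = tf.keras.callbacks.ReduceLROnPlateau(
    monitor="val_loss", factor=0.5, patience=2, min_lr=1e-6, verbose=1
)

# Or an explicit per-epoch exponential decay.
def decay(epoch, lr):
    return lr * 0.9 if epoch > 0 else lr

lr_schedule = tf.keras.callbacks.LearningRateScheduler(decay, verbose=1)

# Either one plugs into model.fit(..., callbacks=[reduce_lr]) or callbacks=[lr_schedule].
```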

Thanks for the guidance.

Have a great weekend!

Bing

bliu188 commented 1 year ago

Hi Tim,

I followed the steps described in Choi's paper and the comparison paper. The data format looks correct when I compare it with the sample data generation. That said, I am not sure whether the order of visits is critical.
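For concreteness, the layout I am assuming is the nested visits-of-codes structure from the RETAIN paper, with visits kept in chronological order. The column and file names below are placeholders, not necessarily exactly what the sample data generation script writes:

```python
import pandas as pd

# Toy example of one patient record: a list of visits, each visit a list of
# integer medical-code indices, ordered oldest -> newest.
patient_codes = [
    [12, 873, 4],      # visit 1
    [873, 55],         # visit 2
    [4, 55, 190, 12],  # visit 3 (most recent)
]

# Hypothetical one-row-per-patient frames mirroring that layout.
data = pd.DataFrame({"codes": [patient_codes]})
target = pd.DataFrame({"target": [1]})

data.to_pickle("data_train.pkl")      # placeholder file names
target.to_pickle("target_train.pkl")
```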

You bring up an excellent point about the learning rate. I could not find where to change it.

I attached the model summary, the printout from model training, and a plot of the metrics over epochs.

Thanks a bunch!

Bing

Model: "model"
__________________________________________________________________________________________________
 Layer (type)                            Output Shape              Param #   Connected to
==================================================================================================
 codes_input (InputLayer)                [(None, None, None)]      0         []
 embedding (Embedding)                   (None, None, None, 100)   85100     ['codes_input[0][0]']
 lambda (Lambda)                         (None, None, 100)         0         ['embedding[0][0]']
 dropout (Dropout)                       (None, None, 100)         0         ['lambda[0][0]']
 alpha (Bidirectional)                   (None, None, 256)         234496    ['dropout[0][0]']
 alpha_dense_0 (TimeDistributed)         (None, None, 1)           257       ['alpha[0][0]']
 beta (Bidirectional)                    (None, None, 256)         234496    ['dropout[0][0]']
 softmax_1 (Softmax)                     (None, None, 1)           0         ['alpha_dense_0[0][0]']
 beta_dense_0 (TimeDistributed)          (None, None, 100)         25700     ['beta[0][0]']
 multiply (Multiply)                     (None, None, 100)         0         ['softmax_1[0][0]', 'beta_dense_0[0][0]', 'dropout[0][0]']
 lambda_1 (Lambda)                       (None, 100)               0         ['multiply[0][0]']
 lambda_2 (Lambda)                       (None, 1, 100)            0         ['lambda_1[0][0]']
 dropout_1 (Dropout)                     (None, 1, 100)            0         ['lambda_2[0][0]']
 time_distributed_out (TimeDistributed)  (None, 1, 1)              101       ['dropout_1[0][0]']
==================================================================================================
Total params: 580,150
Trainable params: 580,150
Non-trainable params: 0


/databricks/python/lib/python3.8/site-packages/mlflow/utils/autologging_utils/init.py:410: FutureWarning: Autologging support for keras >= 2.6.0 has been deprecated and will be removed in a future MLflow release. Use mlflow.tensorflow.autolog() instead.
  return _autolog(args, kwargs)
2022/03/25 02:46:54 INFO mlflow.utils.autologging_utils: Created MLflow autologging run with ID '0e41dfa3f3414bd2b3f9b0a1d36f60b5', which will track hyperparameters, performance metrics, model artifacts, and lineage information for the current keras workflow
/databricks/python/lib/python3.8/site-packages/keras/engine/functional.py:1410: CustomMaskWarning: Custom mask layers require a config and must override get_config. When loading, the custom mask layer must be passed to the custom_objects argument.
  layer_config = serialize_layer_fn(layer)
Epoch 1/20
116/784 [===>..........................] - ETA: 6:52 - loss: 0.4687 - tp: 94.0000 - fp: 438.0000 - tn: 24993.0000 - fn: 4171.0000 - accuracy: 0.8448 - precision: 0.1767 - recall: 0.0220 - auc: 0.5044 - prc: 0.146
WARNING: skipped 6984210 bytes of output ***
784/784 [==============================] - 491s 626ms/step - loss: 0.4084 - tp: 0.0000e+00 - fp: 0.0000e+00 - tn: 172376.0000 - fn: 28319.0000 - accuracy: 0.8589 - precision: 0.0000e+00 - recall: 0.0000e+00 - auc: 0.5098 - prc: 0.1450
WARNING:absl:Found untraced functions such as lstm_cell_1_layer_call_fn, lstm_cell_1_layer_call_and_return_conditional_losses, lstm_cell_2_layer_call_fn, lstm_cell_2_layer_call_and_return_conditional_losses, lstm_cell_4_layer_call_fn while saving (showing 5 of 20). These functions will not be directly callable after loading.
INFO:tensorflow:Assets written to: /tmp/tmp45e_x6yw/model/data/model/assets
/databricks/python/lib/python3.8/site-packages/keras/engine/functional.py:1410: CustomMaskWarning: Custom mask layers require a config and must override get_config. When loading, the custom mask layer must be passed to the custom_objects argument.
  layer_config = serialize_layer_fn(layer)
/databricks/python/lib/python3.8/site-packages/keras/saving/saved_model/layer_serialization.py:112: CustomMaskWarning: Custom mask layers require a config and must override get_config. When loading, the custom mask layer must be passed to the custom_objects argument.
  return generic_utils.serialize_keras_object(obj)
WARNING:absl:<keras.layers.recurrent.LSTMCell object at 0x7feb477cfa60> has the same name 'LSTMCell' as a built-in Keras object. Consider renaming <class 'keras.layers.recurrent.LSTMCell'> to avoid naming conflicts when loading with tf.keras.models.load_model. If renaming is not possible, pass the object in the custom_objects parameter of the load function.
(the same LSTMCell warning is repeated for three more cell objects)
2022/03/25 05:32:13 WARNING mlflow.utils.environment: Encountered an unexpected error while inferring pip requirements (model URI: /tmp/tmp45e_x6yw/model, flavor: keras), fall back to return ['tensorflow==2.7.0', 'keras==2.7.0']. Set logging level to DEBUG to see the full traceback.
2022/03/25 05:32:18 WARNING mlflow.utils.environment: Encountered an unexpected error while inferring pip requirements (model URI: /dbfs/mnt/medicalaffairs/users/bl10520/RETAIN300/Model, flavor: keras), fall back to return ['tensorflow==2.7.0', 'keras==2.7.0']. Set logging level to DEBUG to see the full traceback.
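One more observation from the printout above: the end-of-epoch counts show tp and fp both at zero, so essentially every score falls below the 0.5 threshold, which matches the 0.8589 accuracy at roughly 86% negative prevalence and the ~0.5 AUC. A minimal check of the score distribution (`model` and `x_val` are placeholders for whatever objects the training script exposes):

```python
import numpy as np

# `model` is the trained RETAIN Keras model, `x_val` a batch of validation
# inputs in whatever format the training script feeds the network.
scores = np.asarray(model.predict(x_val)).ravel()

# If the model has collapsed to the majority class, nearly all scores sit in a
# narrow band below 0.5 and the histogram shows a single spike.
print("min / median / max:", scores.min(), np.median(scores), scores.max())
print(np.histogram(scores, bins=10, range=(0.0, 1.0)))
```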