jaredleekatzman / DeepSurv

DeepSurv is a deep learning approach to survival analysis.
MIT License

compute_hazards does not work for loaded models #47

Open allendorf opened 5 years ago

allendorf commented 5 years ago

I was trying to understand why the concordance index is inconsistent when it is calculated from a loaded model (see this comment here) and discovered the following problem: model.X and model.partial_hazard are not saved correctly by save_model.

How to replicate:

  1. Run DeepSurv Example jupyter notebook
  2. Save the trained model with model.save_model('bestparams.json', weights_file='bestweights.h5')
  3. Load the model with model2 = deepsurv.deep_surv.load_model_from_json('bestparams.json', weights_fp='bestweights.h5')
  4. In one cell, run the following repeatedly. Notice that with the original model, partial_hazards is consistent.

     compute_hazards = theano.function(inputs=[model.X], outputs=-model.partial_hazard)
     partial_hazards = compute_hazards(x_train)
     print(partial_hazards)

  5. In another cell, run the same code with the saved and then reloaded model. Notice that the value of partial_hazards2 changes with each run of the cell.

     model2 = deepsurv.deep_surv.load_model_from_json('bestparams.json', weights_fp='bestweights.h5')
     compute_hazards = theano.function(inputs=[model2.X], outputs=-model2.partial_hazard)
     partial_hazards2 = compute_hazards(x_train2)
     print(partial_hazards2)
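
To make the "changes with each run" observation easier to check than eyeballing printed arrays, a small helper along these lines could be used. This is only a sketch: `is_deterministic` is a hypothetical name, not part of DeepSurv or Theano, and it assumes the compiled function returns something NumPy can convert to an array.

```python
import numpy as np


def is_deterministic(predict_fn, x, n_runs=5, atol=1e-8):
    """Return True if predict_fn(x) yields the same output on every call.

    Hypothetical helper for illustration; predict_fn can be any callable,
    for example a compiled theano.function such as compute_hazards.
    """
    reference = np.asarray(predict_fn(x))
    for _ in range(n_runs - 1):
        current = np.asarray(predict_fn(x))
        # A shape change or any value drifting beyond atol counts as inconsistent.
        if current.shape != reference.shape:
            return False
        if not np.allclose(current, reference, atol=atol):
            return False
    return True
```

With the objects from the steps above, `is_deterministic(compute_hazards, x_train)` would be expected to return True for the original model and False for the reloaded one if the behaviour described here reproduces.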

This results in an inconsistent concordance index, since you must compute the hazards to get the concordance. I have tried to fix it myself but I don't think I understand Theano tensors well enough. Hope this can be addressed soon.
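
For anyone unfamiliar with why unstable hazards make the concordance unstable, here is a minimal pure-Python sketch of a Harrell-style concordance index. It is not DeepSurv's own implementation; the function name and the convention that a higher risk score means a shorter expected survival time are assumptions made for illustration.

```python
def concordance_index(times, risk_scores, events):
    """Fraction of comparable pairs that the risk scores order correctly.

    Illustrative sketch, not DeepSurv's code. A pair (i, j) is comparable
    when subject i had an observed event and times[i] < times[j]. It is
    concordant when subject i also has the higher risk score; tied risk
    scores count as half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Risk decreases as survival time increases, so every pair is ordered correctly.
print(concordance_index([1, 2, 3, 4], [4.0, 3.0, 2.0, 1.0], [1, 1, 1, 1]))  # 1.0
```

Because the result depends only on the ordering of the risk scores, any run-to-run change in the computed hazards that reorders subjects will change the concordance as well.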