liupei101 / TFDeepSurv

Cox proportional hazards model and survival analysis implemented in TensorFlow.
MIT License

How to test this with a new data set sample? #1

Closed rmaanyam closed 5 years ago

rmaanyam commented 5 years ago

Hi Liupei101,

This package looks great. I'm wondering whether we could play with or test this package on a new sample data set. If so, could you please guide me through the initial steps to get started? I'm a newbie. I would also like to discuss a couple of questions; please advise on the best way to contact you. Thank you so much, I appreciate your time and help.

liupei101 commented 5 years ago

Hi, thanks for your comment.

The package can be run on a real data set.

I have added extra documentation on running the deep Cox model on real data: TFDeepSurv#42-runing-with-real-data.

Feel free to reach out to us if there is any problem!

rmaanyam commented 5 years ago

Cool, thank you; that would be very helpful. I will check it out and get back as needed.

rmaanyam commented 5 years ago

I moved forward with the load_data module, but encountered an optimization error with my sample data (see below). If you have any clue about this error, please advise. [Also, following all of the section 4.1 steps, the simulated data produced a lower CI, for example: training steps 2401, loss = 6.88694, CI = 0.681214. I'm not sure why; any clues would be appreciated.] Perhaps the error below is a version issue; I will check whether updating the TensorFlow version helps.

```
model.train(num_epoch=2500, iteration=100, ..., plot_train_loss=True, plot_train_ci=True)
time_stamp .../tensorflow/core/grappler/optimizers/graph_optimizer_stage.h:237] Failed to run optimizer ArithmeticOptimizer, stage HoistCommonFactor. Error: Node ArithmeticOptimizer/AddOpsRewrite_add_543 is missing output properties at position :0 (num_outputs=0)
```

liupei101 commented 5 years ago

Thanks for your reply! As you mentioned, the error below occurred. The main reason may be that you used an early version of the tfdeepsurv package, so please download the latest version, reinstall, and then check again.

TypeError: load_data() got an unexpected keyword argument 'excluded_col'.

Okay, great! It seems that you have fixed the TypeError.

If you get a lower CI with the simulated data, please check whether the arguments you pass to dsnn are the same as in section 4.1. I will also try running with the simulated data again and check it.

Also, the optimization error with your sample data may be due to a missing output_nodes = 1 statement. Since I don't know the code you used to run dsnn on your sample data, I can only guess at the reason from the error message.
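For reference, a rough sketch of the intended usage; the argument values are only placeholders, and the point is that output_nodes = 1 must be defined and passed into dsnn:

```python
from tfdeepsurv import dsl

# Sketch only: the deep Cox model outputs a single risk score per sample,
# so the last layer must have exactly one node.
# train_X, train_y are obtained from tfdeepsurv.utils.load_data as in the README.
input_nodes = 10        # number of covariates in your data (placeholder)
hidden_layers = [6, 3]  # example hidden layer sizes
output_nodes = 1        # required: a single risk output

model = dsl.dsnn(train_X, train_y, input_nodes, hidden_layers, output_nodes,
                 learning_rate=0.2, learning_rate_decay=1.0, activation='relu',
                 L1_reg=0.0002, L2_reg=0.0003, optimizer='adam',
                 dropout_keep_prob=1.0)
```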

liupei101 commented 5 years ago

Oh, I have run the code in section 4.1 with the simulated data, and I get the same result as you!

I'm sorry! I think I must have set a different random seed rather than the default one, but that is not reflected in section 4.1.

But that is not the key point! In general, if a metric comes out low, we usually do hyper-parameter tuning when running DSNN on a new dataset.
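If you want to make a simulated-data run reproducible on your side, here is a minimal sketch of fixing the seeds. It assumes TensorFlow 1.x (which is what I run); the exact seed behind section 4.1 is not recorded, so the value below is only a placeholder.

```python
import numpy as np
import tensorflow as tf

SEED = 1  # placeholder value; section 4.1 does not state which seed was used

np.random.seed(SEED)      # seeds NumPy-based simulated-data generation
tf.set_random_seed(SEED)  # seeds graph-level randomness (TensorFlow 1.x API)
```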

rmaanyam commented 5 years ago

Sorry, closed by mistake :-)

rmaanyam commented 5 years ago

Thank you so much for your help and feedback; it's very helpful. Please look through the code and output below for reference, and see if anything is missing.

```python
from tfdeepsurv import dsl
from tfdeepsurv.utils import load_data

train_X, train_y, test_X, test_y = load_data('data_1.csv',
                                              excluded_col=['ID'],
                                              surv_col={'e': 'event', 't': 'time'},
                                              split_ratio=0.8)
# Number of rows: 5417
# X cols: 10
# Y cols: 2
# X.column name: Index(['Var1', 'Var2', 'Var3', 'Var4', 'Var5', 'Var6', 'Var7',
#                       'Var8', 'Var9', 'Var10'], dtype='object')
# Y.column name: Index(['time', 'event'], dtype='object')

input_nodes = 10
output_nodes = 1
model = dsl.dsnn(train_X, train_y, input_nodes, [6, 3], output_nodes,
                 learning_rate=0.2, learning_rate_decay=1.0,
                 activation='relu', L1_reg=0.0002, L2_reg=0.0003,
                 optimizer='adam', dropout_keep_prob=1.0)

print(model.get_ties())
# efron

model.train(num_epoch=1000, iteration=100, plot_train_loss=True, plot_train_ci=True)
# 2019-02-02 18:41:36.554331: W ./tensorflow/core/grappler/optimizers/graph_optimizer_stage.h:237]
# Failed to run optimizer ArithmeticOptimizer, stage HoistCommonFactor. Error: Node
# ArithmeticOptimizer/AddOpsRewrite_add_543 is missing output properties at position :0 (num_outputs=0)
```

```
training steps 1:   loss = 8.55276. CI = 0.490029.
training steps 101: loss = 8.55249. CI = 0.571747.
training steps 201: loss = 8.55219. CI = 0.579078.
...
training steps 801: loss = 8.46366. CI = 0.615219.
training steps 901: loss = 8.46141. CI = 0.616758.
```


My observations:

  1. The optimizer error may be related to a TensorFlow version issue [and may not be due to missing output nodes, because it still produces results even after the error]. I would appreciate your help in clarifying this observation.

Questions for you:

  1. Do you think the above optimizer error has any impact on the lower CI values [even though the code ran with the default hyper-parameters]?

  2. Now, the key point: I would need to run 'hypopt' to train the model on the above data set (or any new one), obtain good hyper-parameters, and then use those values in the code block [model = dsl.dsnn(train_X, train_y, ...)], right? For this task, can you kindly guide me through the necessary steps?

You could list the steps and details here (I'm a newbie to hypopt :) ), email me at 'rmanyam at student.gsu.edu', or list the steps in the notebooks repository folder if that works better. I hope you can look through this request when you get a chance. Thank you so much, I appreciate your help.

liupei101 commented 5 years ago

Reply to your observations: since the result is still produced even after the optimizer error, the TensorFlow version issue may indeed be what needs checking. I will first deal with the optimizer error, and then answer the first of your questions. (By the way, the TensorFlow version on my PC is 1.4.0. What about yours?)
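You can print yours with:

```python
import tensorflow as tf
print(tf.__version__)  # e.g. '1.4.0'
```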

I will reinstall a newer version of TensorFlow and retest all functions in this package.

About hyper-parameter tuning: good suggestion! For hyper-parameter tuning on simulated data, I will list the steps and details in [bysopt](https://github.com/liupei101/TFDeepSurv/blob/master/bysopt/README.md).

@rmaanyam My schedule is pretty tight this month. The to-do list mentioned above may have to wait for an idle period. If you want to speed things up, you could read the implementation in hpopt.py yourself and try something similar.
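Roughly, hpopt.py performs a Bayesian search over the dsnn hyper-parameters. Below is a minimal sketch of the same idea using the hyperopt library; the search space, the bounds, and the scoring call are only my illustration here, not the exact code in hpopt.py (in particular, model.score is an assumed evaluation helper, so check hpopt.py for the real one):

```python
from hyperopt import fmin, tpe, hp, Trials
from tfdeepsurv import dsl
from tfdeepsurv.utils import load_data

# Load the data exactly as in the code above.
train_X, train_y, test_X, test_y = load_data('data_1.csv', excluded_col=['ID'],
                                              surv_col={'e': 'event', 't': 'time'},
                                              split_ratio=0.8)
input_nodes, output_nodes = train_X.shape[1], 1

def objective(params):
    # Train a dsnn with the sampled hyper-parameters and return the negative
    # test CI, so that fmin() effectively maximizes the CI.
    model = dsl.dsnn(train_X, train_y, input_nodes, [6, 3], output_nodes,
                     learning_rate=params['learning_rate'], learning_rate_decay=1.0,
                     activation='relu', L1_reg=params['L1_reg'],
                     L2_reg=params['L2_reg'], optimizer='adam',
                     dropout_keep_prob=params['dropout'])
    model.train(num_epoch=1000, iteration=100)
    ci = model.score(test_X, test_y)  # assumed evaluation call; see hpopt.py for the real one
    return -ci

space = {
    'learning_rate': hp.loguniform('learning_rate', -5, 0),
    'L1_reg': hp.loguniform('L1_reg', -10, -4),
    'L2_reg': hp.loguniform('L2_reg', -10, -4),
    'dropout': hp.uniform('dropout', 0.5, 1.0),
}

trials = Trials()
best = fmin(fn=objective, space=space, algo=tpe.suggest, max_evals=50, trials=trials)
print('best hyper-parameters:', best)
```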

rmaanyam commented 5 years ago

Sure, thank you. Let me 1) check the TensorFlow version on my PC too, upgrade as needed, and then double-check whether that helps improve performance; 2) look through bysopt at the link you mentioned and go from there; 3) also look through hpopt.py. No problem, whenever you get a chance (early next month or as soon as you can); any guidance in this regard is very helpful. Thank you so much.

liupei101 commented 5 years ago

The package TFDeepSurv has been updated.

Fixes and additions:

A comprehensive test has been done on my PC under the environment described in the README.

Feel free to contact me if anything goes wrong!

liupei101 commented 5 years ago

@rmaanyam By the way, would you mind starring or forking this repo so that more people can find it? ^_^

Thanks!

rmaanyam commented 5 years ago

Sounds great, thank you so much, liupei101! I will be checking and testing it soon for sure. I think it's a great update and should be very helpful to anyone who wants to use the hyper-parameter tuning/optimization piece to obtain better prediction accuracy.

rmaanyam commented 5 years ago

Hi liupei101, good news: the update to the 'simulated data' piece has been tested successfully, and the results exactly match those of section 4.1, as you mentioned. I will check with the real data set soon and get back if there are any issues.

rmaanyam commented 5 years ago

Hi liupei101, good news: I got it working with the real dataset too. It may need a bit of hyper-parameter tuning now; I can go ahead and follow the guidelines at bysopt, as you mentioned.

By the way, a question about plots for you: how about putting the curves for both the training and validation data sets on one plot?

For instance, both the training and validation C-index on one plot, and both the training and validation losses on another, to make comparison easy, like you have for the survival function plot here, or, for example, like they have it in DeepSurv here.

I'm talking about sample output images like the ones below; I hope that makes sense. I think we need to add or update a plot function in the utils or vision.py module. I would appreciate your thoughts and help on this. Please let me know if you have any questions for me.

[Attached sample plots: train_data_1.csv, valid_data_1.csv, epochs=1000, Figure 2]
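For example, something along these lines in vision.py might do it (a rough matplotlib sketch; the function name and arguments are just my illustration, assuming the per-step training and validation CI values have already been collected into lists):

```python
import matplotlib.pyplot as plt

def plot_train_valid_curve(steps, train_vals, valid_vals, metric='C-index'):
    """Plot training and validation curves for one metric on a single figure."""
    plt.figure()
    plt.plot(steps, train_vals, label='train')
    plt.plot(steps, valid_vals, label='validation')
    plt.xlabel('training steps')
    plt.ylabel(metric)
    plt.legend(loc='best')
    plt.show()

# Usage (hypothetical values recorded during training):
# plot_train_valid_curve([1, 101, 201], [0.49, 0.57, 0.58], [0.48, 0.55, 0.56])
```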

rmaanyam commented 5 years ago

Hi liupei101, is there any way we can plot both the training and validation CIs on one graph? Any suggestions? Could you please take a look when you get a chance?

liupei101 commented 5 years ago

Got it! Sorry, my schedule has been tight recently. I will implement the plot function you mentioned, probably within two days.

liupei101 commented 5 years ago

The plot function has been implemented. For more details, refer to the README.

Update the package and try it. Feel free to tell us if anything goes wrong!

Thanks for your suggestions!