jaredleekatzman / DeepSurv

DeepSurv is a deep learning approach to survival analysis.
MIT License

Error "NaNs detected in inputs, please correct or drop" #76

Closed by zhxiaokang 2 years ago

zhxiaokang commented 2 years ago

I'm trying out DeepSurv on my own dataset, following the example notebook. When I run the command

metrics = model.train(train_data, n_epochs=n_epochs, logger=logger, update_fn=update_fn)

I got the error:

ValueError                                Traceback (most recent call last)
<ipython-input-32-4f4ac645f588> in <module>
      3 # If you have validation data, you can add it as the second parameter to the function
----> 4 metrics = model.train(train_data, n_epochs=n_epochs, logger=logger, update_fn=update_fn)

/opt/miniconda3/lib/python3.8/site-packages/deepsurv/deep_surv.py in train(self, train_data, valid_data, n_epochs, validation_frequency, patience, improvement_threshold, patience_increase, logger, update_fn, verbose, **kwargs)
    431             # train_loss.append(loss)
--> 433             ci_train = self.get_concordance_index(
    434                 x_train,
    435                 t_train,

/opt/miniconda3/lib/python3.8/site-packages/deepsurv/deep_surv.py in get_concordance_index(self, x, t, e, **kwargs)
    310         partial_hazards = compute_hazards(x)
--> 312         return concordance_index(t,
    313             partial_hazards,
    314             e)

/opt/miniconda3/lib/python3.8/site-packages/lifelines/utils/concordance.py in concordance_index(event_times, predicted_scores, event_observed)
     90     """
---> 91     event_times, predicted_scores, event_observed = _preprocess_scoring_data(event_times, predicted_scores, event_observed)
     92     num_correct, num_tied, num_pairs = _concordance_summary_statistics(event_times, predicted_scores, event_observed)

/opt/miniconda3/lib/python3.8/site-packages/lifelines/utils/concordance.py in _preprocess_scoring_data(event_times, predicted_scores, event_observed)
    299     for a in [event_times, predicted_scores, event_observed]:
    300         if np.isnan(a).any():
--> 301             raise ValueError("NaNs detected in inputs, please correct or drop.")
    303     return event_times, predicted_scores, event_observed

ValueError: NaNs detected in inputs, please correct or drop.

It seems that I have NaNs in my input. But when I looked into my dataset:


There aren't any NaNs (I only posted "x" here, but "e" and "t" don't contain any either).
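One thing worth noting from the traceback: the lifelines check loops over `event_times`, `predicted_scores`, and `event_observed`, so the error can also be raised when the model's predictions contain NaNs even though the raw inputs are clean. The sketch below counts non-finite values per array. It assumes the data is a dict of NumPy arrays keyed `"x"`, `"t"`, `"e"` as in the notebook; the helper name `count_non_finite` and the `"predicted"` entry are hypothetical and only stand in for whatever the model returns.

```python
import numpy as np

def count_non_finite(data):
    """Return {key: count of NaN or +/-inf entries} for each array in `data`."""
    report = {}
    for key, values in data.items():
        arr = np.asarray(values, dtype=float)
        report[key] = int((~np.isfinite(arr)).sum())
    return report

# Toy stand-in for train_data plus a hypothetical vector of predicted scores.
example = {
    "x": np.array([[1.0, 0.0], [2.0, 0.0], [3.0, 0.0]]),
    "t": np.array([5.0, 8.0, 2.0]),
    "e": np.array([1, 0, 1]),
    "predicted": np.array([0.3, np.nan, 0.7]),  # NaN on the model side only
}

report = count_non_finite(example)
print(report)
```

If the inputs report zero but the predictions do not, the NaNs are being introduced somewhere between the raw data and the model output (for example during preprocessing or training), not in the dataset itself.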

zhxiaokang commented 2 years ago

Problem: the dataset contains all-zero columns. Solution: drop those columns before training.
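A plausible explanation, stated here as an assumption rather than something confirmed in this thread, is that the features get standardized before training: an all-zero (or otherwise constant) column has zero standard deviation, so `(x - mean) / std` evaluates to 0/0 and yields NaN. The sketch below reproduces that effect with plain NumPy and drops constant columns; `drop_constant_columns` is a hypothetical helper, not part of DeepSurv.

```python
import numpy as np

def drop_constant_columns(x):
    """Return (x without zero-variance columns, boolean mask of kept columns)."""
    x = np.asarray(x, dtype=float)
    keep = x.std(axis=0) > 0
    return x[:, keep], keep

x = np.array([[1.0, 0.0, 3.0],
              [2.0, 0.0, 1.0],
              [4.0, 0.0, 2.0]])

# Standardizing the all-zero middle column divides 0 by 0 and produces NaN.
with np.errstate(invalid="ignore", divide="ignore"):
    z = (x - x.mean(axis=0)) / x.std(axis=0)
nan_columns = np.isnan(z).any(axis=0)
print(nan_columns)  # [False  True False]

x_clean, keep = drop_constant_columns(x)
print(x_clean.shape)  # (3, 2)
```

The same mask (`keep`) should be applied to any validation or test set so that the column layout stays consistent with the training data.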