havakv / pycox

Survival analysis with PyTorch
BSD 2-Clause "Simplified" License

ValueError: cannot convert float NaN to integer #173

Open Aniket99coder opened 7 months ago

Aniket99coder commented 7 months ago

The code below, from the CoxTime example notebook, works with pandas 1.3.5 but fails otherwise:

%%time
log = model.fit(x_train, y_train, batch_size, epochs, callbacks, verbose,
                val_data=val.repeat(10).cat())

File /opt/conda/lib/python3.10/site-packages/pycox/models/data.py:90, in CoxTimeDataset.__getitem__(self, index)
     88     index = [index]
     89 durations = self.durations_tensor.iloc[index]
---> 90 case, control = super().__getitem__(index)
     91 case = case + durations
     92 control = control.apply_nrec(lambda x: x + durations)

File /opt/conda/lib/python3.10/site-packages/pycox/models/data.py:73, in CoxCCDataset.__getitem__(self, index)
     71 fails = self.durations.iloc[index]
     72 x_case = self.input.iloc[fails.index]
---> 73 control_idx = sample_alive_from_dates(fails.values, self.at_risk_dict, self.n_control)
     74 x_control = tt.TupleTree(self.input.iloc[idx] for idx in control_idx.transpose())
     75 return tt.tuplefy(x_case, x_control).to_tensor()

File /opt/conda/lib/python3.10/site-packages/pycox/models/data.py:18, in sample_alive_from_dates(dates, at_risk_dict, n_control)
     16 idx = (np.random.uniform(size=(n_control, dates.size)) * lengths).astype('int')
     17 samp = np.empty((dates.size, n_control), dtype=int)
---> 18 samp.fill(np.nan)
     20 for it, time in enumerate(dates):
     21     samp[it, :] = at_risk_dict[time][idx[:, it]]

ValueError: cannot convert float NaN to integer

shahidhaider-altis commented 7 months ago

This error occurs because samp is created as an integer array (dtype=int), and samp.fill(np.nan) then tries to store a float NaN in it. NaN has no integer representation, so the conversion fails.

Earlier numpy versions allowed this, and I have found that pinning numpy to 1.23.5 makes the code work.
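
If pinning is not an option, another possible workaround is to drop the NaN fill, since the loop in the traceback overwrites every row of samp anyway. The following is only a minimal sketch and not the library's actual code: the function name `sample_alive_from_dates_sketch` and the `rng` parameter are hypothetical, and the `lengths` line is an assumption because the traceback does not show how it is computed.

```python
import numpy as np


def sample_alive_from_dates_sketch(dates, at_risk_dict, n_control=1, rng=None):
    """Sample n_control at-risk indices per event time, without a NaN placeholder.

    Hypothetical rewrite of the function shown in the traceback.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Assumption: lengths holds the size of the at-risk set for each event time.
    lengths = np.array([at_risk_dict[t].shape[0] for t in dates])
    # Random positions into each at-risk set, one column per event time.
    idx = (rng.uniform(size=(n_control, dates.size)) * lengths).astype(int)
    # Integer array; every row is assigned in the loop below, so no fill is needed.
    samp = np.empty((dates.size, n_control), dtype=int)
    for it, time in enumerate(dates):
        samp[it, :] = at_risk_dict[time][idx[:, it]]
    return samp


# Toy data: indices of the individuals still at risk at each event time.
at_risk = {1.0: np.array([0, 1, 2]), 2.0: np.array([1, 2])}
event_times = np.array([1.0, 2.0, 1.0])
controls = sample_alive_from_dates_sketch(
    event_times, at_risk, n_control=2, rng=np.random.default_rng(0)
)
```

Here controls has one row per event time and one column per sampled control, and it stays an integer array throughout, so no NaN conversion is attempted.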