Closed adam-haber closed 6 years ago
This is not intentional, but it is also not a scenario I am explicitly testing for. I will add this. Thanks for the heads-up.
One question: did you transform your data using prep_data_long_surv, or are your data already in start-stop / long / denormalized format? This will help me narrow down the possible locations of the problem.
The data wasn't in a long format; I just had an "event" column (some of it censored) and a "time" column.
Great, so that helps a lot. This is still an issue I'll want to catch & fix, but to start with you should first transform your data to "long" format in order to fit the PEM model.
This would be a two-step process, like so:
dlong = survivalstan.prep_data_long_surv(df=d, event_col='event', time_col='t')
fit = survivalstan.fit_stan_survival_model(
    model_code=survivalstan.models.pem_survival_model,
    df=dlong,
    sample_col='index',
    timepoint_end_col='end_time',
    event_col='end_failure',
    formula='~ age_centered + sex'
)
You may very well still run into the int/float and boolean/int problems you noted above, but noting this here since it came up.
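For readers unfamiliar with "long" format, here is a rough sketch of the idea in plain pandas. It is not survivalstan's implementation of prep_data_long_surv; the helper name to_long and the toy data are hypothetical, and the choice of interval boundaries (every distinct observed time) is an assumption made for illustration. The output column names mirror the ones passed to fit_stan_survival_model above.

```python
import pandas as pd

def to_long(df, time_col, event_col, id_col="index"):
    """Illustrative expansion to long format (hypothetical helper,
    not survivalstan's own code)."""
    df = df.reset_index(drop=True)
    # Assumption: interval boundaries are all distinct observed times
    end_times = sorted(df[time_col].unique())
    rows = []
    for subject, row in df.iterrows():
        for end_time in end_times:
            if end_time > row[time_col]:
                break  # subject is no longer at risk after their own time
            is_last = end_time == row[time_col]
            rows.append({
                id_col: subject,
                "end_time": float(end_time),
                # Failure is recorded only on the subject's final interval
                "end_failure": bool(row[event_col] and is_last),
            })
    return pd.DataFrame(rows)

# Hypothetical toy data: one row per subject, integer time and 0/1 event
d = pd.DataFrame({"t": [2, 3, 3], "event": [1, 0, 1]})
dlong = to_long(d, time_col="t", event_col="event")
```

Each subject contributes one row per interval they were at risk, which is the structure the PEM model expects.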
Linking to related issue / recommendation #64, since that would make the need for a two-step process somewhat obsolete.
Following the PEM example with my own data, I got unreasonable results using survivalstan.utils.plot_observed_survival. After casting my time column to float (was int before) and the event column to boolean (was 0/1 before), everything worked.
Is this intentional?
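For anyone hitting the same thing, the casts described above might look roughly like this in pandas. The toy data is hypothetical, and the column names 't' and 'event' are assumed to match the ones used elsewhere in this thread.

```python
import pandas as pd

# Hypothetical toy data: integer times and 0/1 event indicators
d = pd.DataFrame({"t": [5, 8, 12], "event": [1, 0, 1]})

# Cast time to float and the event indicator to boolean
d["t"] = d["t"].astype(float)
d["event"] = d["event"].astype(bool)
```

The values themselves are unchanged; only the dtypes differ.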