Predict method with unbalanced data

helske commented 2 years ago

In dynamite, the data is expanded to include missing time points, whereas predict took newdata as is. This caused issues because the number of ids differed between time indices and gaps in the data were handled incorrectly. I added the function fill_time_predict and a call to it in parse_newdata in a2fc06e86759a0f0f23a854721db63b888605394. However, when newdata is NULL, clear_nonfixed now sets some groups entirely to NA if their starting time in the original data was not the first time point in the data. I first thought this was a bug, but it works as designed.

In predict, perhaps a better way would be to fill only the gaps within each individual's series, without expanding beyond that individual's start and end times, so that we would still get predictions for everyone (although I can see this being a bit confusing in some cases). This would need some changes at least to generate_sim_call.
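To make the two filling strategies concrete, here is a minimal sketch in plain Python. It is not dynamite's implementation (the package is written in R). The function `fill_time`, its `per_individual` flag, and the toy data are all hypothetical, and time is assumed to be an integer index with unit steps.

```python
def fill_time(rows, per_individual=False):
    """Fill missing time points with None (standing in for NA).

    rows: iterable of (id, time, value) tuples with integer times.
    per_individual=False: expand every id to the global time range.
    per_individual=True: fill gaps only between each id's own first
    and last observed time.
    Hypothetical helper for illustration only.
    """
    # Group observations by id: {id: {time: value}}
    by_id = {}
    for id_, t, y in rows:
        by_id.setdefault(id_, {})[t] = y

    # Global start and end over all individuals
    all_times = [t for obs in by_id.values() for t in obs]
    g_start, g_end = min(all_times), max(all_times)

    filled = []
    for id_ in sorted(by_id):
        obs = by_id[id_]
        if per_individual:
            start, end = min(obs), max(obs)
        else:
            start, end = g_start, g_end
        for t in range(start, end + 1):
            filled.append((id_, t, obs.get(t)))  # None where unobserved
    return filled


# Unbalanced toy data: id 1 has a gap at time 3, id 2 starts at time 3.
rows = [(1, 1, 0.5), (1, 2, 0.7), (1, 4, 0.9), (2, 3, 1.1), (2, 4, 1.3)]

global_fill = fill_time(rows)
individual_fill = fill_time(rows, per_individual=True)
```

With the global fill, id 2 gains rows for times 1 and 2 that contain only missing values, which is the situation described above for groups that start late. With the per-individual fill, only the gap at time 3 for id 1 is added.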

helske commented 2 years ago

Now that I have thought more about this, perhaps the current behaviour is exactly what we want. With the other option, the fixed start time would vary between individuals, which would make the interpretation of aggregate predictions (over individuals) quite confusing.

santikka commented 2 years ago

This was actually completed in eb2fb5ca43fdf8cf7d9b205b8793fa2178ac87d6 by adding the argument global_fixed to predict, which allows for both an individual-level and a global interpretation of the fixed time points.
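As a rough illustration of the two interpretations, the following Python sketch computes which time points would be treated as fixed for each individual. It is an assumption-laden toy, not the package's code: the helper `fixed_times`, the parameter `n_fixed` (the number of fixed time points), and the example data are hypothetical, and only the name global_fixed comes from the thread.

```python
def fixed_times(first_obs, n_fixed, global_fixed=False):
    """Return the fixed time points for each id.

    first_obs: {id: first observed time} with integer times.
    n_fixed: number of time points held fixed (hypothetical parameter).
    global_fixed=False: count from each id's own first observation.
    global_fixed=True: count from the first time point over all ids.
    """
    g_start = min(first_obs.values())
    result = {}
    for id_, start in first_obs.items():
        origin = g_start if global_fixed else start
        result[id_] = list(range(origin, origin + n_fixed))
    return result


# id 1 is first observed at time 1, id 2 at time 3.
first_obs = {1: 1, 2: 3}

individual = fixed_times(first_obs, n_fixed=1)
global_ = fixed_times(first_obs, n_fixed=1, global_fixed=True)
```

Under the individual-level interpretation, id 2 has a fixed time point it actually observed (time 3), so predictions are still produced for it. Under the global interpretation, the fixed time point is the same for everyone (time 1), which keeps aggregate predictions over individuals aligned in time.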

ropensci/dynamite #42