weecology / portalPredictions

Using various models to forecast abundances at Portal
MIT License

decide where to locate model evaluation in the pipeline #359

Open juniperlsimonis opened 4 years ago

juniperlsimonis commented 4 years ago

@henrykironde and @ethanwhite, I welcome your feedback on an infrastructure question we haven't really discussed yet: where and how to locate the model evaluation code within the pipeline.

The current situation is that we do model evaluation on the fly, within the plotting functions run inside the markdown document (which is rendered as the last step of PortalForecasts.R), and we don't store the evaluations long-term.

We do it this way because, for now, the evaluations are quick and easy to calculate on the fly.

However, that won't be the case for much longer: as we apply more complex evaluation methods to many more models, we're going to want the evaluations done and ready when needed.

We will always need to evaluate previous forecasts, a process that is naturally decoupled from the forecasting itself (we can't evaluate the forecast we're making right now because we don't have the data yet!).
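As a language-agnostic sketch of that decoupling (the actual pipeline is R, and all names and values here are hypothetical), the key point is that a stored forecast only becomes evaluable once observations for its target date exist:

```python
from datetime import date

# Hypothetical stored forecasts: each targets a future census date.
forecasts = [
    {"made_on": date(2020, 1, 1), "target": date(2020, 2, 1), "predicted": 30},
    {"made_on": date(2020, 2, 1), "target": date(2020, 3, 1), "predicted": 25},
]

# Observations arrive later; the most recent forecast's target is not here yet.
observations = {date(2020, 2, 1): 28}

def evaluable(forecasts, observations):
    """Return only the forecasts whose target date now has observed data."""
    return [f for f in forecasts if f["target"] in observations]

def score(forecast, observations):
    """Toy evaluation metric: absolute error of the point forecast."""
    return abs(forecast["predicted"] - observations[forecast["target"]])

ready = evaluable(forecasts, observations)
errors = [score(f, observations) for f in ready]
# Only the first forecast can be scored; the second must wait for its data.
```

Because evaluation lags forecasting by at least one data release, it can run on its own schedule without blocking the forecasting step.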

So I'm wondering whether this means we should branch out a secondary automation component for evaluation, or what the best way to handle this would be.

We don't need to make any decisions on this right now, and I still need to write up a bunch of code before the evaluations are ready to run locally. But thinking about how "evaluation" needs to be elevated within the portalcasting codebase (and its documents! those functions aren't in the codebase vignette!) made me realize we might need to realign things at the infrastructure level.

So yeah, let me know what you think. Thanks!

ethanwhite commented 4 years ago

@juniperlsimonis and I chatted about this in the staff meeting yesterday, and my general thought was that evaluation should be separated from the running of the forecasts. The optimal scenario would be to run evaluations each time new rodent data are added to PortalData, but @juniperlsimonis and I both agreed that, for now, we could do this at some regular frequency instead of trying to move to a fully event-driven system.
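A minimal sketch of that scheduled (rather than fully event-driven) approach, in Python purely for illustration (the real pipeline is R, and the file names here are assumptions, not actual repo files): a periodic job checks whether the data have changed since the last evaluation and only re-runs the evaluation when they have.

```python
import json
import os
import time

STATE_FILE = "evaluation_state.json"  # hypothetical bookkeeping file
DATA_FILE = "rodent_data.csv"         # stand-in for a new PortalData release

def needs_evaluation(data_file=DATA_FILE, state_file=STATE_FILE):
    """True if the data file is newer than the last recorded evaluation."""
    if not os.path.exists(state_file):
        return True  # never evaluated before
    with open(state_file) as f:
        last_run = json.load(f)["last_evaluated"]
    return os.path.getmtime(data_file) > last_run

def record_evaluation(state_file=STATE_FILE):
    """Record the completion time of an evaluation run."""
    with open(state_file, "w") as f:
        json.dump({"last_evaluated": time.time()}, f)
```

Run from a cron job (or similar scheduler), this gets most of the benefit of an event-driven system: evaluations stay current within one scheduling interval of each data release, with no coupling to the forecasting code.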