eco4cast / EFIstandards

Exploring possible metadata and data formatting standards for comparing Ecological Forecasts
BSD 2-Clause "Simplified" License

Minor test inconsistency in forecast_validator #30

lzachmann closed 3 years ago

lzachmann commented 3 years ago

The checks in forecast_validator() do not appear to accept MCMC as the propagation type for initial condition uncertainty: Error: 'initial_conditions' Invalid uncertainty <propagation> <type> 'MCMC'. Is this expected behavior? I can force a passing test by using type: ensemble. Resolving this (hopefully minor) issue should allow me to create and upload the metadata files for each of my phenology forecasts (related issue here: https://github.com/eco4cast/neon4cast/issues/1#issuecomment-822734744).

cboettig commented 3 years ago

Great question. I believe ensemble would indeed be the correct description of how uncertainty is represented (e.g. in MCMC), while MCMC is the algorithm by which you are doing the propagation.

According to the current standard (https://doi.org/10.32942/osf.io/9dgtq), the type subproperty of <propagation> describes how the uncertainty is represented, i.e. either ensemble or analytic. The type property of <assimilation>, meanwhile, specifies the algorithm. So according to the standard, the behavior you see is correct, but I agree that the different use of type on propagation vs. assimilation does seem inconsistent. @mdietze can clarify if I'm mistaken?
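To make the distinction concrete, here is a minimal Python sketch of the kind of check described above. It is not the actual forecast_validator() code (which is R and may work differently); the function name `check_uncertainty_component`, the dictionary layout, and the allowed-value set are assumptions made for illustration, based only on the "ensemble / analytic" wording in this thread.

```python
# Hypothetical sketch, NOT the real forecast_validator() implementation.
# Assumption: <propagation> <type> is restricted to how uncertainty is
# represented, while <assimilation> <type> names an algorithm (free-form).
ALLOWED_PROPAGATION_TYPES = {"ensemble", "analytic"}


def check_uncertainty_component(name, component):
    """Return a list of error messages for one uncertainty component."""
    errors = []
    propagation = component.get("propagation")
    if propagation is not None:
        ptype = propagation.get("type")
        if ptype not in ALLOWED_PROPAGATION_TYPES:
            errors.append(
                f"'{name}' Invalid uncertainty <propagation> <type> '{ptype}'."
            )
    # No vocabulary check on <assimilation> <type>: in this sketch any
    # algorithm name (MCMC, EnKF, PF, ...) is accepted there.
    return errors


# The algorithm name in the propagation slot is rejected...
initial_conditions_mcmc = {"propagation": {"type": "MCMC"}}
# ...while this form passes: ensemble representation, MCMC as the algorithm.
initial_conditions_ok = {
    "propagation": {"type": "ensemble"},
    "assimilation": {"type": "MCMC"},
}

print(check_uncertainty_component("initial_conditions", initial_conditions_mcmc))
print(check_uncertainty_component("initial_conditions", initial_conditions_ok))
```

In this sketch, MCMC is only rejected when it appears as the propagation type; the same string is fine as the assimilation type.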

Keep in mind this is a first draft, and terms are not a precise semantic ontology at this point!

mdietze commented 3 years ago

I'd concur with Carl's take on why "ensemble" is the correct type for <propagation>, why "MCMC" is a valid type for <assimilation>, and that this is not a precise semantic ontology yet.

In defense of how things are set up, there are a bunch of valid assimilation methods beyond MCMC that all use an ensemble/Monte Carlo approach to uncertainty propagation (e.g. EnKF, PF, SMC, EnVar), and you can use ensemble/Monte Carlo approaches to propagation even when you're not doing any sort of assimilation.
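As a rough illustration of that last point, the following Python sketch propagates initial condition uncertainty with an ensemble and involves no assimilation step at all. The toy growth model `step`, the function `propagate_ensemble`, and all parameter values are hypothetical and chosen only for the example.

```python
import random
import statistics


def step(x, r=0.1):
    """Toy process model (hypothetical): one time step of exponential growth."""
    return x * (1.0 + r)


def propagate_ensemble(n_members=500, horizon=5, ic_mean=10.0, ic_sd=1.0, seed=42):
    """Propagate initial condition uncertainty by running an ensemble forward.

    Each member starts from a random draw of the initial state and is advanced
    through the model; no data are assimilated at any point.
    """
    rng = random.Random(seed)
    members = [rng.gauss(ic_mean, ic_sd) for _ in range(n_members)]
    for _ in range(horizon):
        members = [step(x) for x in members]
    return members


forecast = propagate_ensemble()
# The spread of the ensemble is the forecast's representation of uncertainty.
print(statistics.mean(forecast), statistics.stdev(forecast))
```

A forecast produced this way would be described with propagation type ensemble, regardless of whether MCMC, EnKF, or nothing at all was used to fit or update the model.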

lzachmann commented 3 years ago

Makes perfect sense, thanks guys. My apologies for needing additional clarification here. The notes in my (likely very out-of-date) template say something like "# How does your model propagate XYZ (ensemble or MCMC is most common)", where XYZ is the uncertainty component. I see now that it should be ensemble across the board (in my case). Thanks for such speedy replies!