Open qq23840 opened 2 months ago
Thanks for opening this issue Ben! I'd like to add that maybe a good way to add the min_error to the rhime output would be as a variable with the dimension (nsite), instead of (nmeasure) and also not as an attributes.
Maybe a 'min_error' variable with dimension nsite that shows the input min error, and a 'model error' variable with dimensions (nsite, nmeasure) that is what error values are actually passed to the inversion for each site (which is the larger of the min_error and the pollution-scaled error) ?
Thanks Ben!
I'm working on incorporating some the PARIS formatting code into inversions, which I think will help with this. (E.g. in PARIS formatting, epsilon
is used explicitly in the outputs.)
min_error
was originally a single value, so I put it in the attributes of the output (since we put info about the priors and so on there). If you specify min_error
and make calculate_min_error = None
, that is still the case. Otherwise, min_error
is actually an array, but still stored in the attributes of the output. It turned out that this worked with the PARIS formatting code without having to modify anything, so I left it as is, since the long term plan was to move a lot of that code into inversions.
So, if you use calculate_min_error
, then the min_error
attribute will actually be a vector with the same length as, say Y
(so could have nmeasure
as coordinates).
Going forward, reporting min_error
as a data variable, and reporting epsilon
sounds like a good plan.
Also a warning for passing both min_error
and calculate_min_error
seems like a good idea. Alternatively, we could just have one parameter min_error
in the ini, that could either be a float value, or "percentile", or "residual", and would either use the value provided, or calculate a value based on the specified method (this would probably mean fixedbasisMCMC
would need minor changes).
I think having a single parameter would be far more elegant for min_error
After our discussion about min_error on Wednesday, I thought I would collate some of the issues here. So far my thoughts are:
min_error
and alsocalculate_min_error
. I did this by mistake today, and they produce conflicting behaviour -calculate_min_error
overridesmin_error
, but it's not clear this is what's happening when you pass both in the .ini filemin_error = 10.0
or similar is passed into the .ini file is as expected. The problem I've had is with verifying this : themin_model_error
variable and attribute are just the single value passed (or calculated by the residual/percentile method) repeatednmeasure
times. I think it would be more informative to have the output variable report 'epsilon', which is the error actually passed to the likelihood in L310 ofinversion_pymc.py
. This does vary with time, although in practice if a largemin_error
is specified it will default to this most of the time.Feel free to add other issues with min_error to this one - just thought I'd summarise here.