openghg / openghg_inversions

University of Bristol Atmospheric Chemistry Research Group RHIME Inversion code (with openghg dependency)
MIT License
5 stars 0 forks source link

min_error issues umbrella #203

Open qq23840 opened 2 months ago

qq23840 commented 2 months ago

After our discussion about min_error on Wednesday, I thought I would collate some of the issues here. So far my thoughts are:

Feel free to add other issues with min_error to this one - just thought I'd summarise here.

hdelongueville commented 2 months ago

Thanks for opening this issue Ben! I'd like to add that maybe a good way to add the min_error to the rhime output would be as a variable with the dimension (nsite), instead of (nmeasure) and also not as an attributes.

qq23840 commented 2 months ago

Maybe a 'min_error' variable with dimension nsite that shows the input min error, and a 'model error' variable with dimensions (nsite, nmeasure) that is what error values are actually passed to the inversion for each site (which is the larger of the min_error and the pollution-scaled error) ?

brendan-m-murphy commented 2 months ago

Thanks Ben!

I'm working on incorporating some the PARIS formatting code into inversions, which I think will help with this. (E.g. in PARIS formatting, epsilon is used explicitly in the outputs.)

min_error was originally a single value, so I put it in the attributes of the output (since we put info about the priors and so on there). If you specify min_error and make calculate_min_error = None, that is still the case. Otherwise, min_error is actually an array, but still stored in the attributes of the output. It turned out that this worked with the PARIS formatting code without having to modify anything, so I left it as is, since the long term plan was to move a lot of that code into inversions.

So, if you use calculate_min_error, then the min_error attribute will actually be a vector with the same length as, say Y (so could have nmeasure as coordinates).

Going forward, reporting min_error as a data variable, and reporting epsilon sounds like a good plan.

brendan-m-murphy commented 2 months ago

Also a warning for passing both min_error and calculate_min_error seems like a good idea. Alternatively, we could just have one parameter min_error in the ini, that could either be a float value, or "percentile", or "residual", and would either use the value provided, or calculate a value based on the specified method (this would probably mean fixedbasisMCMC would need minor changes).

qq23840 commented 2 months ago

I think having a single parameter would be far more elegant for min_error