Closed adamcpovey closed 3 months ago
On the general point about using linear values (that are log10'd under the hood) - I think this is a nice intuitive change. In my code, I have comments reminding me that if I want to set apriori tau to 60, I need to use 1.77815 (10^1.77815 = 59.9998). This change will make reading and interpreting input values more intuitive. Obviously this could turn into a bit of a 'gotcha' if people miss this change and don't modify their input files appropriately.
Seems OK to me to be consistent and make the same change for Sx. Side note: For cloud, we have Ctrl%Sx(IRe)=1e8
and Ctrl%Sx(ITau)=1e8
as the default setting in read_driver
. Does that mean we are saying apriori uncertainty in COT is 10^1e8? Should these very large numbers be reviewed? Will they cause issues with matrix inversions?
Yes I think it makes sense to put limit checking (and the log10 conversions) in set_limits
.
Certainly for a new user, using the new LUTs, it makes sense to have all inputs in linear space.
Applying the change to Sx is also sensible for consistency, but not for clarity due to the non-linearity. Not only will changing the a priori also change the actual value of Sx, but it is not obvious what a "linear value of Sx" when the retrieval is in log space actually means. For example, if we say we have an apriori of x=12.0 with a linear uncertainty (sigma) of 0.25, we might naively say:
IDL> x = 12.0
IDL> lx = alog10(x)
IDL> print, x, lx
12.0000 1.07918
IDL> sx = 0.25^2 & lsx = alog10(sx)
IDL> print, sx, lsx
0.0625000 -1.20412
This clearly doesn't make any sense at all (the actual uncertainty ranges are completely different). I'd be interested in seeing how you'd suggest defining Sx in linear space before giving a thumbs up. Having the prior values specified in linear space, while the Sx values are specified in log10-space is even more confusing than having everything specified in log10-space, so it's a bit of a problem.
As for the location of the changes, I think get_spixel
would be fine for the conversion calculations, with set_limits
just doing what its name suggests, especially if putting the conversion in set_limits would require a more major restructuring of the code.
@garethethomas The question about how to pass the uncertainty is why I didn't commit anything. The maths of OE mean they errors have to be Gaussian. Hence, the options are:
Ctrl%Sx
regardless of underlying context. We all appear to agree that this is misleading.Ctrl%Sx
for linear uncertainties and add Ctrl%Sx_log
for logarithmic uncertainties. This makes the code easier to read but still requires the user to know the structure of the LUT.Ctrl%Sx
through $\sigma_{\log} = \frac{\sigma_x}{x \ln 10}$ to calculate the equivalent log error.3 isn't perfect, because linear error bars aren't even in log space but they're fairly close for the values we're talking about. For example,
import math
x, sx = 12.0, 0.25
y = math.log10(x)
sy = sx / x / math.log(10.)
print(y, sy)
#>>> 1.07918 0.00904780
print(10.**(y+sy) - x, x - 10.**(y-sy))
#>>> 0.252622 0.247413
@andyprata Sx = 1e8
is our way of saying "unconstrained", with an equal prior for any values between the limits permitted by the retrieval. This isn't rigorous OE (there are other methods for bounded but unconstrained minimisation) but it works. I don't believe it's a problem for matrix conditioning because the inversion of Sx is fairly simple.
both I'll move things into get_spixel
and commit that later. It's not hard. Just makes a lot of noise.
Ok, I think method 3 will work fine for Ctrl%Sx
.
I guess we should also put some thought into other methods of defining the a priori (and/or first guess), even though they aren't generally used at the moment - I guess most options (like AUX
or whatever) would have to expect log values directly.
Actually, this raises a point that I've missed about the new LUTs up till now. With the old LUTs there was a ORAC_Class_config.sad file, which allowed you to set a per-phase (or -class) a priori values/uncertainties for effective radius and optical depth, using the SAD
method for defining the X0
, XB
etc. This was the standard method for aerosol retrievals - how is this done with the new LUTs? (Of course, the scaling method 3 would also probably be fine for dealing with Sx from such data as well)
I have never encountered a ORAC_Class_config.sad file. Setting a priori values for aerosol types is what all the crap at the bottom of https://github.com/ORAC-CC/orac/blob/master/tools/pyorac/definitions.py#L658-L669 is for.
@garethethomas @andyprata How's that?
There is slightly more urgency to this PR as I just noticed that, since Don changed which axes are log/linear in the netCDF tables, we'll be outputting the wrong values for those tables.
I'm now reasonable certain that this PR doesn't break the old tables. Next time I shall check that it fixes the new ones...
Plots comparing cloud retrieval for (left) current repository code using a netCDF LUT, (centre) this PR using a netCDF LUT, (right) current repository code using a text LUT for (rows) eight regression test granules [four day, four night, two Terra, two Aqua, two Sentinel-3 A then B]. Uses ann_phase
to manually post-process; white is clear sky or did-not-attempt.
This PR doesn't change the output for netCDF or text, but doesn't do much for the differences between them. Apologies for the worst colourbar - I swear everything else looked even more rubbish.
In the text LUTs, the optical depth axis is logarithmically spaced and the values written in the tables are the log of the actual value (e.g. the first COT value is -20). As such, all references to optical depth in the state vector must also be in log space (e.g. the prior for aerosol is -1.5). The effective radius is logarithmically spaced for aerosol tables but linearly spaced for cloud. This is confusing.
The NCDF LUTs specify if axes are logarithmically or linearly spaced as an attribute. This commit adds a flag to the LUT Grid structure indicating if an axis is logarithmic (which is detected by negative values in the text tables). The set values are then log-ed as required.
I have a few questions for our active developers:
Sx
, the a priori uncertainty? On the one hand, the transform is non-linear so a user might not realise that changing the COT prior will also implicitly change the apriori uncertainty. On the other hand, the user won't know thatCtrl%Sx(IRe) = 0.1
means different things for cloud and aerosol retrievals.get_spixel()
but it makes sense going in the same place asset_limits()
. However, moving that routine makes a lot of noise I wanted to avoid in the first instance.