Closed tdcwilliams closed 1 year ago
@malmans2 , here is the updated evaluation notebook cmip6_eval_siconc.ipynb.zip
latest problems
Hi @malmans2 this is the latest version of the calculation of the error statistics
Hi @tdcwilliams,
Just a quick recap.
sic_model = ... # SIC from model
sic_sat = ... # SIC from satellite
cell area = 25 ** 2 # km^2
model_ice = sic_model > sic_threshold
sat_ice = sic_sat > sic_threshold # monthly mean
good_pixels = np.isfinite(sic_model) * np.isfinite(sic_sat) # get common area that isn't masked (eg not land)
over = good_pixels * model_ice * (~sat_ice) # model overestimation
under = good_pixels * (~model_ice) * sat_ice # model underestimation
iiee = (over.sum() + under.sum()) * cell_area
extent_bias = (over.sum() - under.sum()) * cell_area
sic_sat
should be the mean of the 2 products? I.e., do you expect iiee and extend_bias to have dimensions ("model", "time", "satellite")
or just ("model", "time")
This is how I would code it, let's see if it fixes the memory issues you've encountered.
import numpy as np
import xarray as xr
def make_random_data():
size = (90, 180, 12)
dims = ("lat", "lon", "time")
sic = xr.DataArray(np.random.uniform(low=0, high=100, size=size), dims=dims)
mask = xr.DataArray(np.random.choice([True, False], size=size), dims=dims)
return sic.where(mask)
sic_model = make_random_data()
sic_sat = make_random_data()
def mattia(sic_model, sic_sat, sic_threshold=30, cell_area=25**2):
dims = ("lat", "lon")
over = (sic_model > sic_threshold) & (sic_sat <= sic_threshold)
over = over.sum(dims)
under = (sic_model <= sic_threshold) & (sic_sat > sic_threshold)
under = under.sum(dims)
return xr.Dataset({"iiee": over + under, "extent_bias": over - under}) * cell_area
def tim(sic_model, sic_sat, sic_threshold=30, cell_area=25**2):
dims = ("lat", "lon")
model_ice = sic_model > sic_threshold
sat_ice = sic_sat > sic_threshold # monthly mean
good_pixels = np.isfinite(sic_model) * np.isfinite(sic_sat) # get common area that isn't masked (eg not land)
over = good_pixels * model_ice * (~sat_ice) # model overestimation
under = good_pixels * (~model_ice) * sat_ice # model underestimation
iiee = (over.sum(dims) + under.sum(dims)) * cell_area
extent_bias = (over.sum(dims) - under.sum(dims)) * cell_area
return xr.merge([iiee.rename("iiee"), extent_bias.rename("extent_bias")])
xr.testing.assert_identical(tim(sic_model, sic_sat), mattia(sic_model, sic_sat))
Hi @malmans2, thanks - what you have written looks good for the IIEE and extent bias. Should I try it out or will you? Ciao, Tim
Hi again @malmans2, I would do both satellites, but would do them separately (one at a time), so dims would be ("time", "model"). Tim
Hi again @malmans2, In my notebook I also had some extra statistics which you could try to add once IIEE and extent bias are working. Tim
Hi @tdcwilliams,
Yes, I noticed the extra statistics yesterday. I've been running some test overnight and I'm about to check the results. I should be able to send you a notebook to test later today.
Thanks a lot @malmans2
In the meantime, this is what I got for 10 years in ERA5. Let me know if there's anything that is obviously wrong:
Hi @malmans2 - area bias looks pretty big (could be units?)
Right! My bad, should be divided by 100!
Hi @tdcwilliams,
I've cached 10-years. You should have enough data to check if the code is correct and you have all the data to produce you analysis. If everything looks good, I'll cache the whole timeseries.
I think the only thing I changed is the masking you were doing. The way it's set up now, I don't think we need to apply an additional masking because there's NaNs in the original data.
I'm not sure why, I only get a memory error with mpi_esm1_2_hr
. I'll try to figure out what's wrong later today.
Here is the template: https://github.com/bopen/c3s-eqc-toolbox-template/blob/main/notebooks/wp4/cmip6_sea_ice_evaluation.ipynb Here is the notebook executed: https://gist.github.com/malmans2/3db833049d5bed834759113179ddb8ce
Thanks @malmans2, I'll try to get onto it later today or Monday.
Hi @malmans2, it runs OK and is mostly looking OK - I just noticed the monthly observation error is the linear mean in time when it should be the RMSE. Cheers, Tim
Hi @tdcwilliams,
Did I make a mistake or you had the linear mean in your notebook as well? If the latter, we have a diagnostic function to compute rmse. We could use that one.
I did:
obs_error = np.sqrt((obs_error ** 2).resample(time="MS").mean())
Let me double check. Is this OK?
- dataarrays["rms_sic_obs_error"] = (sic_obs_err**2).mean(dims) ** (1 / 2)
+ dataarrays["rms_sic_obs_error"] = diagnostics.spatial_weighted_rmse(sic_model, sic_obs, weights=False)
dataarrays["rms_sic_obs_error"].attrs = {
"standard_name": "root_mean_square_sea_ice_concentration_observation_error",
"units": "%",
"long_name": "Root mean square sea ice concentration observation error",
}
I had a typo and edited the snippet above. RMSE should be computed from sic_model
and sic_obs
.
Oh actually, I think I misunderstood your comment. Is this the right algorithm?
I did:
obs_error = np.sqrt((obs_error ** 2).resample(time="MS").mean())
This should fix it: https://github.com/bopen/c3s-eqc-toolbox-template/commit/92cb912752d3624c5c480f12cc904d58657fdabb
I'm going to re-run the code overnight.
hi @malmans2 - yep that looks good. It probably would have been clearer if I had said RMS (what I meant, and what you did in the end) not RMSE...
Hi @tdcwilliams,
Everything is cached. Let me know if the results are OK!
Here is the template: https://github.com/bopen/c3s-eqc-toolbox-template/blob/main/notebooks/wp4/cmip6_sea_ice_evaluation.ipynb
Here is the notebook executed: https://gist.github.com/malmans2/3db833049d5bed834759113179ddb8ce
Hi @tdcwilliams,
Did you have a chance to test this notebook? If it works OK, do you think you can use the sea ice templates for #103?
Hi @malmans2 I must have forgotten to write back, but I think it works OK. I'm on sick leave until Friday, but I can double-check then. I'll try out #103 as well using it. Ciao, Tim
Sounds good. Get well soon!
Notebook description
evaluate CMIP6 historical models against sea ice concentration from passive microwave satellite data
Notebook link or upload
Here is a draft notebook. Problems
get_sic_errors
perhaps needs another argumentmissing_months
to give the missing months for that satellite data origincmip6_eval_siconc.ipynb.zip
Anything else we need to know?
WP4
Environment
standard wp4 environment on the virtual machine