Closed by TomAugspurger 4 years ago
cc @rabernat
Thanks so much for this. LGTM provided the build succeeds.
Would be awesome if a bot could make these PRs automatically. 😆
Possible data issue
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-19-7df12fa49c69> in <module>
6 return xr.DataArray(ecs)
7
----> 8 ds_abrupt['ecs'] = ds_abrupt.groupby('source_id').apply(calc_ecs)
9 ds_abrupt
...
2
3 def calc_ecs(ds):
----> 4 a, b = np.polyfit(ds.tas, ds.imbalance, 1)
5 ecs = -0.5 * (b/a)
6 return xr.DataArray(ecs)
<__array_function__ internals> in polyfit(*args, **kwargs)
/srv/conda/envs/notebook/lib/python3.7/site-packages/numpy/lib/polynomial.py in polyfit(x, y, deg, rcond, full, w, cov)
627 scale = NX.sqrt((lhs*lhs).sum(axis=0))
628 lhs /= scale
--> 629 c, resids, rank, s = lstsq(lhs, rhs, rcond)
630 c = (c.T/scale).T # broadcast scale coefficients
631
<__array_function__ internals> in lstsq(*args, **kwargs)
/srv/conda/envs/notebook/lib/python3.7/site-packages/numpy/linalg/linalg.py in lstsq(a, b, rcond)
2304 # lapack can't handle n_rhs = 0 - so allocate the array one larger in that axis
2305 b = zeros(b.shape[:-2] + (m, n_rhs + 1), dtype=b.dtype)
-> 2306 x, resids, rank, s = gufunc(a, b, rcond, signature=signature, extobj=extobj)
2307 if m == 0:
2308 x[...] = 0
ValueError: On entry to DLASCL parameter number 4 had an illegal value
`source_id="NorCPM1"` is the (first) one with an issue:
>>> np.isnan(ds_abrupt.sel(source_id="NorCPM1").tas).data.sum()
70
It looks like NorCPM1 may just not have observations for those years.
>>> dsets_aligned_["NorCPM1"].year
array([ 0., 1., 2., 3., 4., 5., 6., 7., 8., 9., 10., 11., 12., 13.,
14., 15., 16., 17., 18., 19., 20., 21., 22., 23., 24., 25., 26., 27.,
28., 29., 30., 31., 32., 33., 34., 35., 36., 37., 38., 39., 40., 41.,
42., 43., 44., 45., 46., 47., 48., 49., 50., 51., 52., 53., 54., 55.,
56., 57., 58., 59., 60., 61., 62., 63., 64., 65., 66., 67., 68., 69.,
70., 71., 72., 73., 74., 75., 76., 77., 78., 79.])
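To see where those 70 NaNs come from: NorCPM1 only has 80 years of data, so aligning it against models with 150 years leaves 70 missing samples. A toy numpy-only sketch of that check (the model names and lengths here are illustrative stand-ins for what `ds_abrupt` / `dsets_aligned_` actually hold):

```python
import numpy as np

# Hypothetical stand-in for the aligned per-model time series:
# one model covers the full 150 years, NorCPM1 only the first 80,
# so alignment pads the remaining 70 years with NaN.
tas_by_model = {
    "FullModel": np.arange(150, dtype=float),
    "NorCPM1": np.concatenate([np.arange(80, dtype=float),
                               np.full(70, np.nan)]),
}

# Count missing samples per source, mirroring the
# np.isnan(...).sum() check above.
missing = {name: int(np.isnan(tas).sum())
           for name, tas in tas_by_model.items()}
print(missing)  # → {'FullModel': 0, 'NorCPM1': 70}
```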
So we have a few options.
I'll probably just drop the missing values...
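Masking out the non-finite samples before the fit avoids the LAPACK `DLASCL` error. A minimal numpy-only sketch of that fix (the notebook's `calc_ecs` wraps the result in an `xr.DataArray`, which I omit here for brevity):

```python
import numpy as np

def calc_ecs(tas, imbalance):
    """Gregory-method ECS estimate that ignores missing years.

    Fits imbalance = a * tas + b on the finite samples only,
    then returns -0.5 * (b / a) as in the notebook's calc_ecs.
    """
    tas = np.asarray(tas, dtype=float)
    imbalance = np.asarray(imbalance, dtype=float)
    # Keep only years where both variables are present, so
    # np.polyfit never sees the NaNs that trip up DLASCL.
    mask = np.isfinite(tas) & np.isfinite(imbalance)
    a, b = np.polyfit(tas[mask], imbalance[mask], 1)
    return -0.5 * (b / a)
```

For example, with `imbalance = -0.5 * tas + 4` and a NaN in the middle of `tas`, this still recovers the slope/intercept and returns an ECS of 4.0.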
Now we are doing science via CI! 🚀
cc @hdrake, who was the original author of this notebook.
FYI, this notebook is failing with a different error on the production binder (the Dask cluster isn't starting), I think because it was using its own image rather than default-binder.
https://staging.binder.pangeo.io/v2/gh/pangeo-gallery/default-binder/staging/?urlpath=git-pull?repo=https://github.com/pangeo-gallery/cmip6%26urlpath=lab/tree/cmip6/ECS_Gregory_method.ipynb%3Fautodecode gets us to the point where the missing values in NorCPM1 cause issues.
I just find it so awesome that you are able to debug both the deep infrastructure and the actual science code. 🥇
I'm thinking about how to scale this, imagining that you were a bot instead of a person.
It would be nice if a bot made a PR to update the binderbot config, just as you have done. (This is kind of like how conda forge handles new releases.) If the CI succeeds, then the owner just has to click merge. But if it fails, it would be nice for the bot to post the binder link, like you did, allowing the gallery owner to debug the failing notebook.
The icing on the cake would be to have some way to pass git credentials around so that the user could actually push the fixed notebook from binder directly back to the PR branch! 🚀
AFAICT, the only differences are moving the base Dockerfile and pangeo-notebook images forward to 2020.08.31.
Using the default-binder image will make testing deployments of the binderhub a bit easier.