Closed acocac closed 4 days ago
:wave: @weiji14 @William-gregory we will conduct the review in this issue. Please read through the above information and let me know if you have any questions about the review process. Thank you :pray:
Hi @weiji14 @William-gregory I am just checking in on the status of the review. Are there any obstacles or is there anything I can assist you with?
Hi @annefou, apologies for the delay. I've just returned from vacation and will be getting on with the review this week. One initial point of note is that my environment build fails on my local machine (works fine on Binder). Am I doing something wrong?
Building the environment fails:
git clone https://github.com/eds-book-gallery/67a1e320-7c47-4ea9-8df8-e868326bc90b.git
cd 67a1e320-7c47-4ea9-8df8-e868326bc90b
conda env create -f .binder/environment.yml -n IceNet
ERROR: Failed building wheel for netCDF4
ERROR: Could not build wheels for netCDF4, which is required to install pyproject.toml-based projects
Hi @William-gregory, could I ask if you're using Windows?
The library is supported on Linux (potentially Unix).
Hi @bnubald, I'm attempting to run on MacOS
On my side, I tried in mybinder.org and on my laptop (creating a local environment as you did).
It works in mybinder.org out of the box but not on my laptop (MacOS too). But then I changed netCDF version to 1.6.5 and it seems to work. So maybe the version of netCDF could be changed to 1.6.5?
There could be a potential issue with versions > 1.6.0 (https://github.com/icenet-ai/icenet/issues/226). But, if its working, happy to consider.
Else, maybe using conda for the netcdf4 install which has macos listed as supported.
name: 67a1e320-7c47-4ea9-8df8-e868326bc90b
channels:
- conda-forge
- defaults
dependencies:
- libnetcdf
- pip
- python<3.12
- "tensorflow<2.16=cpu*"
- netcdf4=1.6.0
- pip:
- icenet @ git+https://github.com/icenet-ai/icenet@3654dc4954eca6d28e16b4876bd6538abd1f0c06
- ecmwflibs
I managed to get the install working by using netCDF4 version 1.6.5. I also included jupyter in the dependencies in order to launch the notebook. I'm now getting an error on from icenet.data.interfaces.cds import ERA5Downloader
ImportError: dlopen(/opt/miniconda3/envs/IceNet/lib/python3.11/site-packages/cf_units/_udunits2.cpython-311-darwin.so, 0x0002): symbol not found in flat namespace '_ut_ignore'
I managed to get the install working by using netCDF4 version 1.6.5. I also included jupyter in the dependencies in order to launch the notebook. I'm now getting an error on
from icenet.data.interfaces.cds import ERA5Downloader
ImportError: dlopen(/opt/miniconda3/envs/IceNet/lib/python3.11/site-packages/cf_units/_udunits2.cpython-311-darwin.so, 0x0002): symbol not found in flat namespace '_ut_ignore'
Would this happen to be on an Apple Silicon system?
Could you try using this as your conda env file please?
name: 67a1e320-7c47-4ea9-8df8-e868326bc90b
name: icenet-edsbook
channels:
- conda-forge
- defaults
dependencies:
- cf-units
- libnetcdf
- netcdf4=1.6.0
- pip
- python<3.12
- "tensorflow<2.16=cpu*"
- pip:
- ecmwflibs
- icenet @ git+https://github.com/icenet-ai/icenet@3654dc4954eca6d28e16b4876bd6538abd1f0c06
- notebook
Thanks @bnubald, that's done the trick. The notebook runs through fine now
Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide.
notebook.ipynb
) part of the notebook repository?Author names, affiliation and GitHub handle sighted in
notebook.ipynb
, but could not find ORCID. I'd highly recommend ORCID as a persistent identifier, and would also suggest placing all of these information in a CITATION.cff (https://citation-file-format.github.io) file if possible.
[ ] Does the notebook run in a local environment?
Installation worked locally on my Linux machine, and I was able to run all the notebook cells from start to end.
However, I tried to solve for the conda environment (at https://github.com/eds-book-gallery/67a1e320-7c47-4ea9-8df8-e868326bc90b/blob/main/.binder/environment.yml) on other operating systems using conda-lock. OSX-64 (Intel chip) and OSX-ARM64 (M1/M2/M3 chips) works ok too, but I got the following errors for Windows:
$ conda-lock lock --mamba --file environment.yml --platform win-64
Locking dependencies for ['linux-64', 'osx-64', 'osx-arm64', 'win-64']...
INFO:conda_lock.conda_solver:linux-64 using specs ['libnetcdf', 'pip *', 'python <3.12', 'tensorflow <2.16 cpu*']
Failed to parse json, Expecting value: line 1 column 1 (char 0)
Could not lock the environment for platform win-64
Could not solve for environment specs
The following package could not be installed
ββ tensorflow <2.16 cpu* does not exist (perhaps a typo or a missing channel).
{
"success": false
}
Recommendations:
jupyterlab
, need to add this to be able to launch Jupyter Lab directly after local installation.conda-forge
and defaults
channels in the conda environment.yml file (see ). Use nodefaults
instead to explicitly disable channel mixing, see https://conda-forge.org/docs/user/tipsandtricks/#using-multiple-channels for context.Yes, tested that Binder build at https://mybinder.org/v2/gh/eds-book-gallery/67a1e320-7c47-4ea9-8df8-e868326bc90b/review works, and that notebooks runs to the end (even on 4GB of RAM)!
ERA5, ORA5 and OSI SAF's sea-ice concentration data sources are linked to, but not properly cited with a DOI.
Purpose and highlights are clearly articulated at the start of the notebook.
The notebook shows an end-to-end example of how to download climate and sea-ice concentration data, pre-process it into a machine-learning ready format (TFRecord), train a U-Net based model on the data, and then visualise the results as an animation of predicted sea-ice concentration over time. Very nicely done!
Yes, plenty of markdown documentation in between code cells. Did leave some recommendations at https://github.com/eds-book-gallery/67a1e320-7c47-4ea9-8df8-e868326bc90b/pull/5#discussion_r1585744044 though on adding some extra code comments.
Mostly ok, see minor suggestions at https://github.com/eds-book-gallery/67a1e320-7c47-4ea9-8df8-e868326bc90b/pull/5
On the analysis, the limitations of the neural network method are not explicitly stated, but they do mention that IceNet is a probabilistic model, rather than a physics-informed model. The authors mention that an ensemble of trained models is able to generate forecasts with a lead time of 6 months, though these metrics are not quantified against the groundtruth in the notebook. Since a train/val/test set is generated, maybe it would be good to report the test RMSE values to give an indication of performance. Also, there is no mention of how good the model is at predicting extreme highs/lows in sea-ice concentration, only mean/averages, but that is a whole other topic for discussion.
On the datasets, there is some mention of how ERA5 and ORA5 are reanalysis datasets which have no spatiotemporal gaps, but no mention about the uncertainties associated with these datasets that are derived from physical and observational data. Are they biased to certain geographies (e.g. the Northern Hemisphere has more historical data coverage) and/or climate regimes? Does the model only work on contemporary timescales for the ERA5 period (1940-present), or can it be used for paleo-timescales (spanning thousands of years), etc?
There might be some geopolitical implications with decreasing sea-ice, especially in the Arctic, but unsure if this needs to be mentioned in the notebook.
Missing citations for
cartopy
,matplotlib
,xarray
, etc.
See review comments at https://github.com/eds-book-gallery/67a1e320-7c47-4ea9-8df8-e868326bc90b/pull/5.
[x] Notebook: Is the notebook file part of the PR?
[x] Contribution and authorship: Does the author list seem appropriate and complete (full name, affiliation, and GitHub/ORCID handle)?
No ORCID, but names, affiliations and GitHub handles present
[x] Scope and eligibility: Does the submission contain an original and complete analysis according to the theme selected?
[x] Does the notebook run in a local environment?
Runs on local MacOS (silicon M1 chip) only with new environment file:
name: 67a1e320-7c47-4ea9-8df8-e868326bc90b name: icenet-edsbook channels: - conda-forge - defaults dependencies: - cf-units - libnetcdf - netcdf4=1.6.0 - pip - python<3.12 - "tensorflow<2.16=cpu*" - pip: - ecmwflibs - icenet @ git+https://github.com/icenet-ai/icenet@3654dc4954eca6d28e16b4876bd6538abd1f0c06 - notebook
- If downloading ERA5 data, need to change the
pd.date_range
end date toβ2020-04-30β
- I notice IceNet also has an
ORA5Downloader()
class. It would be great to include an example of how to use this in the notebook, since the notebook mentions that ORA5 data are used- The anchor links in the Highlights heading do not seem to work
[x] Does the notebook build and run in binder? mybinder repo
[ ] Are all data sources openly accessible and properly cited (e.g. with citation to a persistent DOI) in the heading section?
DOIs are missing for the specific OSI-SAF, ERA5 and ORA5 data used in the notebook
Yes, purpose is clear, as well as clear disclaimers as to how this version differs from that introduced by Andersson et al.
Yes, the notebook demonstrates advances in the development of an operational sea ice forecasting tool using deep learning. The workflow is complete with data downloaders and pre-processing tools, as well as training and inference steps for daily sea ice concentration prediction. The authors note that some of the more data-intensive components of the workflow have been withheld due to Binder's computational/memory limits. Improvements could be made in terms of prediction error assessment (see comment below)
Yes, the Summary heading highlights the main steps/functionalities of the notebook, and the Extended usage section provides some useful resources for the interested user
The original U-Net model introduced by Andersson was trained to do classification tasks (predict local monthly-mean sea ice extent), however the model here is refactored to a regression task (predict local daily sea ice concentration). It is unclear whether this new model would have the same predictive skill as the original model over the lead times explored in Andersson et al. Perhaps this is outside the scope of this notebook, but it would be interesting to see if you use the new model to predict an entire month of daily SIC (from which you can get daily SIE and then compute a monthly-mean), does it achieve similar skill to the original U-Net model's prediction to monthly SIE? I appreciate that for computational reasons the model in this notebook is limited in terms of training and inputs, but I think it would be useful to quantify what these limitations have on the prediction skill - especially if someone did want to use this for operational sea ice forecasting.
@bnubald the two reviewers suggested a few changes. Would you have time to have a look and implement their suggestions? Many thanks.
Thank you @weiji14, @William-gregory, @annefou, I will go through them, I might be a bit delayed (2-3 weeks) due to focusing on a conference.
Hi @bnubald, just checking with you. How is it going? Any progress with the review? Many thanks.
Hi @annefou, I've started going through the reviewer comments, will update soon.
Thank you for reviewing this submission @weiji14.
Apologies for submitting the changes in a bulk PR...
I have gone through and resolved the comments in ReviewNB, only the last one on changing the licensing remains (@annefou, would you be able to advice please? comment is here).
In addition to the inline comments in the notebook, please find below my responses to the checklist items (where actions were requested).
- [ ] Contribution and authorship: Does the author list seem appropriate and complete (full name, affiliation, and GitHub/ORCID handle)?
Author names, affiliation and GitHub handle sighted in
notebook.ipynb
, but could not find ORCID. I'd highly recommend ORCID as a persistent identifier, and would also suggest placing all of these information in a CITATION.cff (https://citation-file-format.github.io) file if possible.
Added CITATION.cff
with ORCID for authors.
Reproducibility
[ ] Does the notebook run in a local environment?
Installation worked locally on my Linux machine, and I was able to run all the notebook cells from start to end. However, I tried to solve for the conda environment (at https://github.com/eds-book-gallery/67a1e320-7c47-4ea9-8df8-e868326bc90b/blob/main/.binder/environment.yml) on other operating systems using conda-lock. OSX-64 (Intel chip) and OSX-ARM64 (M1/M2/M3 chips) works ok too, but I got the following errors for Windows:
...
Recommendations:
- State what operating systems / platforms this notebook is supported for, or fix installation issues on Windows. Will need to ensure that command-line interfaces work also on Windows if choosing the latter.
- Mention that this notebook should be able to run on a CPU device, and that a GPU won't be needed. Might need to test if this runs on macOS ARM chips.
- Conda environment.yml file does not have an explicit dependency on
jupyterlab
, need to add this to be able to launch Jupyter Lab directly after local installation.- Recommend to not have both
conda-forge
anddefaults
channels in the conda environment.yml file (see ). Usenodefaults
instead to explicitly disable channel mixing, see https://conda-forge.org/docs/user/tipsandtricks/#using-multiple-channels for context.
Compatible platforms
section.jupyterlab
to conda environment file.defaults
channel to nodefaults
.
- [ ] Are all data sources openly accessible and properly cited (e.g. with citation to a persistent DOI) in the heading section?
ERA5, ORA5 and OSI SAF's sea-ice concentration data sources are linked to, but not properly cited with a DOI.
Added citation with DOI for ERA5, ORA5 and OSI SAF's sea-ice concentration data sources in the notebook.
- [ ] Is the notebook truthful and clear about any limitations of the analysis (and potential biases in data and/or tools)?
On the analysis, the limitations of the neural network method are not explicitly stated, but they do mention that IceNet is a probabilistic model, rather than a physics-informed model. The authors mention that an ensemble of trained models is able to generate forecasts with a lead time of 6 months, though these metrics are not quantified against the groundtruth in the notebook. Since a train/val/test set is generated, maybe it would be good to report the test RMSE values to give an indication of performance. Also, there is no mention of how good the model is at predicting extreme highs/lows in sea-ice concentration, only mean/averages, but that is a whole other topic for discussion. On the datasets, there is some mention of how ERA5 and ORA5 are reanalysis datasets which have no spatiotemporal gaps, but no mention about the uncertainties associated with these datasets that are derived from physical and observational data. Are they biased to certain geographies (e.g. the Northern Hemisphere has more historical data coverage) and/or climate regimes? Does the model only work on contemporary timescales for the ERA5 period (1940-present), or can it be used for paleo-timescales (spanning thousands of years), etc?
Added a limitations section covering some aspects of this.
Since the model in the demonstrator notebook includes only a few thousand parameters, and is only run for 3 epochs, comparison of results have been omitted. And, some aspects of this are better suited to a paper format which seems out of scope, I have referred to the original IceNet paper in this case. We are currently working on a paper covering the up-to-date IceNet library, but it is still a work in progress.
- [ ] All mentioned software should be formally and consistently cited wherever possible.
Missing citations for
cartopy
,matplotlib
,xarray
, etc.
Added citations for python modules.
- [ ] Acronyms should be spelled out upon first mention.
See review comments at REVIEWΒ eds-book-gallery/67a1e320-7c47-4ea9-8df8-e868326bc90b#5.
Added an acronym section in the header.
Thank you for reviewing @William-gregory
Please see below for my updates.
- [ ] No ORCID, but names, affiliations and GitHub handles present
Added ORCID for authors.
- [ ] Runs on local MacOS (silicon M1 chip) only with new environment file
Updated environment file.
- [ ] If downloading ERA5 data, need to change the pd.date_range end date to β2020-04-30β
Updated end date.
- [ ] I notice IceNet also has an ORA5Downloader() class. It would be great to include an example of how to use this in the notebook, since the notebook mentions that ORA5 data are used
ORAS5Downloader wasn't used in this notebook since it also requires setting up of credentials, but an example code block has been added, just like the ERA5Downloader section.
- [ ] The anchor links in the Highlights heading do not seem to work
Fixed links
- [ ] DOIs are missing for the specific OSI-SAF, ERA5 and ORA5 data used in the notebook
Added citation with DOI for data sources
- [ ] The original U-Net model introduced by Andersson was trained to do classification tasks (predict local monthly-mean sea ice extent), however the model here is refactored to a regression task (predict local daily sea ice concentration). It is unclear whether this new model would have the same predictive skill as the original model over the lead times explored in Andersson et al. Perhaps this is outside the scope of this notebook, but it would be interesting to see if you use the new model to predict an entire month of daily SIC (from which you can get daily SIE and then compute a monthly-mean), does it achieve similar skill to the original U-Net model's prediction to monthly SIE? I appreciate that for computational reasons the model in this notebook is limited in terms of training and inputs, but I think it would be useful to quantify what these limitations have on the prediction skill - especially if someone did want to use this for operational sea ice forecasting.
This is along the lines of something we plan on tackling on a paper we are currently working on which covers the current state of IceNet, but I do think its moving outside the scope of this submission. The plan is to replicate the original study within this library for the paper before doing some comparisons, at which point this will be a more feasible study.
Hi @William-gregory and @weiji14 Thanks for reviewing the notebook. Could you confirm that everything is OK so we can proceed to the publication?
Hi @annefou, sorry for the slow response. I'm happy with the changes! Many thanks
Hi @annefou, I see that the requested changes are still sitting in a Pull Request at https://github.com/eds-book-gallery/67a1e320-7c47-4ea9-8df8-e868326bc90b/pull/6. Should those be merged into the main
branch first before proceeding with the final check, or do you prefer it us to do the final review in that pull request directly?
Hi @annefou, I see that the requested changes are still sitting in a Pull Request at eds-book-gallery/67a1e320-7c47-4ea9-8df8-e868326bc90b#6. Should those be merged into the
main
branch first before proceeding with the final check, or do you prefer it us to do the final review in that pull request directly?
Hi @weiji14, the review needs to be done in the pull request directly. Thanks.
Hi @weiji14, the review needs to be done in the pull request directly. Thanks.
Ok, I've gone through the Pull Request, and am mostly satisfied with the changes. There are still some minor typos and parts that needs updating (see e.g. https://github.com/eds-book-gallery/67a1e320-7c47-4ea9-8df8-e868326bc90b/pull/6#discussion_r1651952425), but happy for this notebook to move on to the next stage otherwise :rocket:
Thanks @William-gregory, @weiji14 for taking your time to go through this.
I've updated and pushed latest changes to the Pull Request.
@annefou @acocac
Thanks @William-gregory, @weiji14 for taking your time to go through this.
I've updated and pushed latest changes to the Pull Request.
@annefou @acocac
Awesome! @acocac I think the publication process can now start. Thank you!
@bnubald and team - Congratulations, your notebook is recommended for publication! π
Huge thanks to our editor: @annefou and reviewers: @William-gregory, @weiji14 β your contributions make this possible π
Next steps (optional for reviewers): @bnubald - I'll contact you for validating the post-print version and confirm a suitable date to announce the publication among the communication channels of the EDS book community.
@bnubald and team - Congratulations, your notebook is recommended for publication! π
Huge thanks to our editor: @annefou and reviewers: @William-gregory, @weiji14 β your contributions make this possible π
Next steps (optional for reviewers): @bnubald - I'll contact you for validating the post-print version and confirm a suitable date to announce the publication among the communication channels of the EDS book community.
That's great!
Thanks everyone (@annefou, @William-gregory, @weiji14, @acocac) for your help and support in getting this through. π
Notebook Review: Issue #236
Submitting author: @bnubald
Repository: https://github.com/eds-book-gallery/67a1e320-7c47-4ea9-8df8-e868326bc90b
Notebook idea issue: #221
Editor: @annefou
Reviewer: @weiji14 @William-gregory
Managing EiC: @acocac
Status
Reviewer instructions & questions
Hi π @weiji14 & @William-gregory, please carry out your review in this issue by updating the checklist below.
As a reviewer, you contribute to the technical quality of the content published by our community.
Before the review, EiC checked if the submission fits the minimum requirements.
The quality of the proposed contribution can be assessed through scientific, technical and code criteria.
The reviewer guidelines are available here: https://edsbook.org/publishing/guidelines/guidelines-reviewers.html. Any questions/concerns please let @annefou know.
Review checklist for @weiji14
Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide.
Conflict of interest
Code of conduct an peer-review principles
General checks
notebook.ipynb
) part of the notebook repository?Reproducibility
Pedagogy
Ethical
Other Requirements
Final approval (post-review)
Review checklist for @William-gregory
Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide.
Conflict of interest
Code of conduct an peer-review principles
General checks
notebook.ipynb
) part of the notebook repository?Reproducibility
Pedagogy
Ethical
Other Requirements
Final approval (post-review)
Additional instructions
Reviewer general comments are welcome on this REVIEW issue or directly to the notebook repository.
If you do the latter, you will find a Pull Request titled REVIEW where you can carry out the discussion with authors through ReviewNB, a third-party plugin in GitHub for displaying and commenting Jupyter Notebooks (see further details here).
In addition to ReviewNB, we suggest to explore or run the notebook in: