acocac commented 2 months ago

Notebook Review: Issue #236

Submitting author: @bnubald

Repository: https://github.com/eds-book-gallery/67a1e320-7c47-4ea9-8df8-e868326bc90b

Notebook idea issue: #221

Editor: @annefou

Reviewer: @weiji14 @William-gregory

Managing EiC: @acocac

Status

Reviewer instructions & questions

Hi 👋 @weiji14 & @William-gregory, please carry out your review in this issue by updating the checklist below.

As a reviewer, you contribute to the technical quality of the content published by our community.

Before the review, EiC checked if the submission fits the minimum requirements.

The quality of the proposed contribution can be assessed through scientific, technical and code criteria.

The reviewer guidelines are available here: https://edsbook.org/publishing/guidelines/guidelines-reviewers.html. If you have any questions or concerns, please let @annefou know.

Review checklist for @weiji14

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide.

Conflict of interest

[ ] As the reviewer I confirm that there are no conflicts of interest for me to review this work (If you are unsure whether you are in conflict, please speak to your editor before starting your review).

Code of conduct and peer-review principles

[ ] I confirm that I read and will adhere to the Environmental Data Science code of conduct.
[ ] I confirm that I read and will adhere to the Open Science Peer Review Oath.

General checks

[ ] Notebook: Is the notebook file (notebook.ipynb) part of the notebook repository?
[ ] Contribution and authorship: Does the author list seem appropriate and complete (full name, affiliation, and GitHub/ORCID handle)?
[ ] Scope and eligibility: Does the submission contain an original and complete analysis according to the scope of EDS book?

Reproducibility

[ ] Does the notebook run in a local environment?
[ ] Does the notebook build and run in binder?
[ ] Are all data sources openly accessible and properly cited (e.g. with citation to a persistent DOI) in the heading section?

Pedagogy

[ ] Are the notebook purpose and highlights clear?
[ ] Does the notebook demonstrate some specific data analysis or visualisation techniques?
[ ] Is the notebook well documented, using both markdown cells and comments in code cells?
[ ] Does the conclusion section provide clear and concise final say on the tools, analysis and/or datasets used?
[ ] Is the notebook narrative well written (it does not require editing for structure, language, or writing quality)?

Ethical

[ ] Is any linkage of datasets in the notebook unlikely to lead to an increased risk of the personal identification of individuals?
[ ] Is the notebook truthful and clear about any limitations of the analysis (and potential biases in data and/or tools)?
[ ] Is the notebook unlikely to lead to negative social outcomes, such as (but not limited to) increasing discrimination or injustice?

Other Requirements

[ ] All mentioned software should be formally and consistently cited wherever possible.
[ ] Acronyms should be spelled out upon first mention.
[ ] License conditions on images and figures must be respected (Creative Commons, etc.).

Final approval (post-review)

[ ] The authors have responded to my review and made changes to my satisfaction. I recommend approving the notebook for publication.

Review checklist for @William-gregory

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide.

Conflict of interest

[ ] As the reviewer I confirm that there are no conflicts of interest for me to review this work (If you are unsure whether you are in conflict, please speak to your editor before starting your review).

Code of conduct and peer-review principles

[ ] I confirm that I read and will adhere to the Environmental Data Science code of conduct.
[ ] I confirm that I read and will adhere to the Open Science Peer Review Oath.

General checks

[ ] Notebook: Is the notebook file (notebook.ipynb) part of the notebook repository?
[ ] Contribution and authorship: Does the author list seem appropriate and complete (full name, affiliation, and GitHub/ORCID handle)?
[ ] Scope and eligibility: Does the submission contain an original and complete analysis according to the scope of EDS book?

Reproducibility

[ ] Does the notebook run in a local environment?
[ ] Does the notebook build and run in binder?
[ ] Are all data sources openly accessible and properly cited (e.g. with citation to a persistent DOI) in the heading section?

Pedagogy

[ ] Are the notebook purpose and highlights clear?
[ ] Does the notebook demonstrate some specific data analysis or visualisation techniques?
[ ] Is the notebook well documented, using both markdown cells and comments in code cells?
[ ] Does the conclusion section provide clear and concise final say on the tools, analysis and/or datasets used?
[ ] Is the notebook narrative well written (it does not require editing for structure, language, or writing quality)?

Ethical

[ ] Is any linkage of datasets in the notebook unlikely to lead to an increased risk of the personal identification of individuals?
[ ] Is the notebook truthful and clear about any limitations of the analysis (and potential biases in data and/or tools)?
[ ] Is the notebook unlikely to lead to negative social outcomes, such as (but not limited to) increasing discrimination or injustice?

Other Requirements

[ ] All mentioned software should be formally and consistently cited wherever possible.
[ ] Acronyms should be spelled out upon first mention.
[ ] License conditions on images and figures must be respected (Creative Commons, etc.).

Final approval (post-review)

[ ] The authors have responded to my review and made changes to my satisfaction. I recommend approving the notebook for publication.

Additional instructions

Reviewer general comments are welcome on this REVIEW issue or directly to the notebook repository.

If you do the latter, you will find a Pull Request titled REVIEW where you can carry out the discussion with authors through ReviewNB, a third-party plugin in GitHub for displaying and commenting on Jupyter Notebooks (see further details here).

In addition to ReviewNB, we suggest exploring or running the notebook in:

Binder (run): Click the Launch Binder button at the top level of this message.

annefou commented 2 months ago

:wave: @weiji14 @William-gregory we will conduct the review in this issue. Please read through the above information and let me know if you have any questions about the review process. Thank you :pray:

annefou commented 2 months ago

Hi @weiji14 @William-gregory I am just checking in on the status of the review. Are there any obstacles or is there anything I can assist you with?

William-gregory commented 2 months ago

Hi @annefou, apologies for the delay. I've just returned from vacation and will be getting on with the review this week. One initial point of note is that my environment build fails on my local machine (works fine on Binder). Am I doing something wrong?

Reproducibility

[ ] Does the notebook run in a local environment?

Building the environment fails:

git clone https://github.com/eds-book-gallery/67a1e320-7c47-4ea9-8df8-e868326bc90b.git
cd 67a1e320-7c47-4ea9-8df8-e868326bc90b
conda env create -f .binder/environment.yml -n IceNet

ERROR: Failed building wheel for netCDF4
ERROR: Could not build wheels for netCDF4, which is required to install pyproject.toml-based projects

[x] Does the notebook build and run in binder? (mybinder repo)

bnubald commented 2 months ago

Hi @William-gregory, could I ask if you're using Windows?

The library is supported on Linux (potentially Unix).

William-gregory commented 2 months ago

Hi @bnubald, I'm attempting to run on MacOS

annefou commented 2 months ago

On my side, I tried in mybinder.org and on my laptop (creating a local environment as you did).

It works in mybinder.org out of the box but not on my laptop (also macOS). After I changed the netCDF4 version to 1.6.5, it seems to work. So maybe the pinned netCDF4 version could be changed to 1.6.5?

bnubald commented 2 months ago

There could be a potential issue with versions > 1.6.0 (https://github.com/icenet-ai/icenet/issues/226). But if it's working, I'm happy to consider it.

Otherwise, maybe use conda for the netcdf4 install, which lists macOS as supported.

name: 67a1e320-7c47-4ea9-8df8-e868326bc90b
channels:
  - conda-forge
  - defaults
dependencies:
  - libnetcdf
  - pip
  - python<3.12
  - "tensorflow<2.16=cpu*"
  - netcdf4=1.6.0
  - pip:
      - icenet @ git+https://github.com/icenet-ai/icenet@3654dc4954eca6d28e16b4876bd6538abd1f0c06
      - ecmwflibs
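
The version boundary discussed above (a possible issue with netCDF4 versions above 1.6.0) can be checked from inside the environment. The following is a minimal sketch using only the standard library; `version_tuple`, `installed_version` and `newer_than` are hypothetical helper names, and the simple numeric comparison is an assumption that ignores pre-release suffixes.

```python
from importlib import metadata


def version_tuple(version):
    """Turn '1.6.5' into (1, 6, 5); parsing stops at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def newer_than(version, baseline="1.6.0"):
    """True when `version` is strictly newer than `baseline`."""
    return version_tuple(version) > version_tuple(baseline)
```

Calling `installed_version("netCDF4")` and passing the result to `newer_than` would show whether the environment sits above the 1.6.0 boundary mentioned in the linked issue.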

William-gregory commented 2 months ago

I managed to get the install working by using netCDF4 version 1.6.5. I also included jupyter in the dependencies in order to launch the notebook. I'm now getting an error on `from icenet.data.interfaces.cds import ERA5Downloader`:

ImportError: dlopen(/opt/miniconda3/envs/IceNet/lib/python3.11/site-packages/cf_units/_udunits2.cpython-311-darwin.so, 0x0002): symbol not found in flat namespace '_ut_ignore'

bnubald commented 2 months ago

Would this happen to be on an Apple Silicon system?

Could you try using this as your conda env file please?

name: icenet-edsbook
channels:
  - conda-forge
  - defaults
dependencies:
  - cf-units
  - libnetcdf
  - netcdf4=1.6.0
  - pip
  - python<3.12
  - "tensorflow<2.16=cpu*"
  - pip:
      - ecmwflibs
      - icenet @ git+https://github.com/icenet-ai/icenet@3654dc4954eca6d28e16b4876bd6538abd1f0c06
      - notebook

William-gregory commented 2 months ago

Thanks @bnubald, that's done the trick. The notebook runs through fine now
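
The failure reported earlier in this thread only surfaced at import time, after the environment had installed successfully. A quick pre-flight check along the following lines might catch that sooner. This is a minimal sketch using only the standard library; `check_imports` and `failed_imports` are hypothetical helpers, not part of IceNet.

```python
import importlib


def check_imports(module_names):
    """Try importing each module; map its name to None or an error message."""
    results = {}
    for name in module_names:
        try:
            importlib.import_module(name)
            results[name] = None
        except ImportError as err:
            # Covers missing modules as well as dlopen/symbol failures in
            # compiled extensions, which are also raised as ImportError.
            results[name] = f"{type(err).__name__}: {err}"
    return results


def failed_imports(results):
    """Return only the modules that could not be imported, sorted by name."""
    return sorted(name for name, error in results.items() if error is not None)
```

For example, `failed_imports(check_imports(["netCDF4", "cf_units", "icenet"]))` gives an empty list when all three modules import cleanly.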

weiji14 commented 2 months ago

Review checklist for @weiji14

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide.

Conflict of interest

[x] As the reviewer I confirm that there are no conflicts of interest for me to review this work (If you are unsure whether you are in conflict, please speak to your editor before starting your review).

Code of conduct and peer-review principles

[x] I confirm that I read and will adhere to the Environmental Data Science code of conduct.
[x] I confirm that I read and will adhere to the Open Science Peer Review Oath.

General checks

[x] Notebook: Is the notebook file (notebook.ipynb) part of the notebook repository?
[ ] Contribution and authorship: Does the author list seem appropriate and complete (full name, affiliation, and GitHub/ORCID handle)?

Author names, affiliation and GitHub handle sighted in notebook.ipynb, but could not find ORCID. I'd highly recommend ORCID as a persistent identifier, and would also suggest placing all of this information in a CITATION.cff (https://citation-file-format.github.io) file if possible.
[x] Scope and eligibility: Does the submission contain an original and complete analysis according to the scope of EDS book?

Reproducibility

[ ] Does the notebook run in a local environment?

Installation worked locally on my Linux machine, and I was able to run all the notebook cells from start to end.

However, I tried to solve for the conda environment (at https://github.com/eds-book-gallery/67a1e320-7c47-4ea9-8df8-e868326bc90b/blob/main/.binder/environment.yml) on other operating systems using conda-lock. OSX-64 (Intel chip) and OSX-ARM64 (M1/M2/M3 chips) work OK too, but I got the following errors for Windows:
```
$ conda-lock lock --mamba --file environment.yml --platform win-64
Locking dependencies for ['linux-64', 'osx-64', 'osx-arm64', 'win-64']...
INFO:conda_lock.conda_solver:linux-64 using specs ['libnetcdf', 'pip *', 'python <3.12', 'tensorflow <2.16 cpu*']
Failed to parse json, Expecting value: line 1 column 1 (char 0)
Could not lock the environment for platform win-64
Could not solve for environment specs
The following package could not be installed
└─ tensorflow <2.16 cpu* does not exist (perhaps a typo or a missing channel).
{
    "success": false
}
```
Recommendations:
- State what operating systems / platforms this notebook is supported for, or fix installation issues on Windows. Will need to ensure that command-line interfaces work also on Windows if choosing the latter.
- Mention that this notebook should be able to run on a CPU device, and that a GPU won't be needed. Might need to test if this runs on macOS ARM chips.
- Conda environment.yml file does not have an explicit dependency on jupyterlab, need to add this to be able to launch Jupyter Lab directly after local installation.
- Recommend not having both conda-forge and defaults channels in the conda environment.yml file. Use nodefaults instead to explicitly disable channel mixing, see https://conda-forge.org/docs/user/tipsandtricks/#using-multiple-channels for context.
[x] Does the notebook build and run in binder?

Yes, I tested that the Binder build at https://mybinder.org/v2/gh/eds-book-gallery/67a1e320-7c47-4ea9-8df8-e868326bc90b/review works, and that the notebook runs to the end (even on 4GB of RAM)!
[ ] Are all data sources openly accessible and properly cited (e.g. with citation to a persistent DOI) in the heading section?

ERA5, ORA5 and OSI SAF's sea-ice concentration data sources are linked to, but not properly cited with a DOI.

Pedagogy

[x] Are the notebook purpose and highlights clear?

Purpose and highlights are clearly articulated at the start of the notebook.
[x] Does the notebook demonstrate some specific data analysis or visualisation techniques?

The notebook shows an end-to-end example of how to download climate and sea-ice concentration data, pre-process it into a machine-learning ready format (TFRecord), train a U-Net based model on the data, and then visualise the results as an animation of predicted sea-ice concentration over time. Very nicely done!
[x] Is the notebook well documented, using both markdown cells and comments in code cells?

Yes, plenty of markdown documentation in between code cells. Did leave some recommendations at https://github.com/eds-book-gallery/67a1e320-7c47-4ea9-8df8-e868326bc90b/pull/5#discussion_r1585744044 though on adding some extra code comments.
[x] Does the conclusion section provide clear and concise final say on the tools, analysis and/or datasets used?
[ ] Is the notebook narrative well written (it does not require editing for structure, language, or writing quality)?

Mostly ok, see minor suggestions at https://github.com/eds-book-gallery/67a1e320-7c47-4ea9-8df8-e868326bc90b/pull/5

Ethical

[x] Is any linkage of datasets in the notebook unlikely to lead to an increased risk of the personal identification of individuals?
[ ] Is the notebook truthful and clear about any limitations of the analysis (and potential biases in data and/or tools)?

On the analysis, the limitations of the neural network method are not explicitly stated, but they do mention that IceNet is a probabilistic model, rather than a physics-informed model. The authors mention that an ensemble of trained models is able to generate forecasts with a lead time of 6 months, though these metrics are not quantified against the groundtruth in the notebook. Since a train/val/test set is generated, maybe it would be good to report the test RMSE values to give an indication of performance. Also, there is no mention of how good the model is at predicting extreme highs/lows in sea-ice concentration, only mean/averages, but that is a whole other topic for discussion.

On the datasets, there is some mention of how ERA5 and ORA5 are reanalysis datasets which have no spatiotemporal gaps, but no mention about the uncertainties associated with these datasets that are derived from physical and observational data. Are they biased to certain geographies (e.g. the Northern Hemisphere has more historical data coverage) and/or climate regimes? Does the model only work on contemporary timescales for the ERA5 period (1940-present), or can it be used for paleo-timescales (spanning thousands of years), etc?
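
The test-set RMSE suggested above could be reported with something as small as the following. This is a sketch in plain Python over flattened values, not the notebook's code; the optional `mask` argument (for example to exclude land cells) is an assumption, and in practice NumPy or xarray would be the natural tools.

```python
import math


def rmse(predicted, observed, mask=None):
    """Root-mean-square error over paired values, skipping masked-out cells.

    predicted, observed: flat sequences of sea-ice concentration in [0, 1].
    mask: optional sequence of booleans; True keeps the cell.
    """
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if mask is None:
        mask = [True] * len(predicted)
    squared_errors = [
        (p - o) ** 2 for p, o, keep in zip(predicted, observed, mask) if keep
    ]
    if not squared_errors:
        raise ValueError("no unmasked cells to evaluate")
    return math.sqrt(sum(squared_errors) / len(squared_errors))
```

Reporting this value per forecast lead time on the held-out test dates would give readers a rough indication of how error grows with lead time.
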
[x] Is the notebook unlikely to lead to negative social outcomes, such as (but not limited to) increasing discrimination or injustice?

There might be some geopolitical implications with decreasing sea-ice, especially in the Arctic, but unsure if this needs to be mentioned in the notebook.

Other Requirements

[ ] All mentioned software should be formally and consistently cited wherever possible.

Missing citations for cartopy, matplotlib, xarray, etc.
[ ] Acronyms should be spelled out upon first mention.

See review comments at https://github.com/eds-book-gallery/67a1e320-7c47-4ea9-8df8-e868326bc90b/pull/5.
[x] License conditions on images and figures must be respected (Creative Commons, etc.).

William-gregory commented 2 months ago

Conflict of interest

[x] As the reviewer I confirm that there are no conflicts of interest for me to review this work (If you are unsure whether you are in conflict, please speak to your editor before starting your review).

Code of conduct and peer-review principles

[x] I confirm that I read and will adhere to the Environmental Data Science code of conduct.
[x] I confirm that I read and will adhere to the Open Science Peer Review Oath.

General checks

[x] Notebook: Is the notebook file part of the PR?
[x] Contribution and authorship: Does the author list seem appropriate and complete (full name, affiliation, and GitHub/ORCID handle)?

No ORCID, but names, affiliations and GitHub handles present
[x] Scope and eligibility: Does the submission contain an original and complete analysis according to the theme selected?

Reproducibility

[x] Does the notebook run in a local environment?
Runs on local MacOS (silicon M1 chip) only with new environment file:
```
name: icenet-edsbook
channels:
  - conda-forge
  - defaults
dependencies:
  - cf-units
  - libnetcdf
  - netcdf4=1.6.0
  - pip
  - python<3.12
  - "tensorflow<2.16=cpu*"
  - pip:
      - ecmwflibs
      - icenet @ git+https://github.com/icenet-ai/icenet@3654dc4954eca6d28e16b4876bd6538abd1f0c06
      - notebook
```
- If downloading ERA5 data, need to change the pd.date_range end date to “2020-04-30”
- I notice IceNet also has an ORA5Downloader() class. It would be great to include an example of how to use this in the notebook, since the notebook mentions that ORA5 data are used
- The anchor links in the Highlights heading do not seem to work
[x] Does the notebook build and run in binder? mybinder repo
[ ] Are all data sources openly accessible and properly cited (e.g. with citation to a persistent DOI) in the heading section?

DOIs are missing for the specific OSI-SAF, ERA5 and ORA5 data used in the notebook

Pedagogy

[x] Are the notebook purpose and highlights clear?

Yes, purpose is clear, as well as clear disclaimers as to how this version differs from that introduced by Andersson et al.
[x] Does the notebook demonstrate some specific data analysis or visualisation techniques?

Yes, the notebook demonstrates advances in the development of an operational sea ice forecasting tool using deep learning. The workflow is complete with data downloaders and pre-processing tools, as well as training and inference steps for daily sea ice concentration prediction. The authors note that some of the more data-intensive components of the workflow have been withheld due to Binder's computational/memory limits. Improvements could be made in terms of prediction error assessment (see comment below)
[x] Is the notebook well documented, using both markdown cells and comments in code cells?
[x] Does the conclusion section provide clear and concise final say on the tools, analysis and/or datasets used?

Yes, the Summary heading highlights the main steps/functionalities of the notebook, and the Extended usage section provides some useful resources for the interested user
[x] Is the notebook narrative well written (it does not require editing for structure, language, or writing quality)?

Ethical

[x] Is any linkage of datasets in the notebook unlikely to lead to an increased risk of the personal identification of individuals?
[x] Is the notebook truthful and clear about any limitations of the analysis (and potential biases in data and/or tools)?

The original U-Net model introduced by Andersson et al. was trained on a classification task (predicting local monthly-mean sea ice extent), whereas the model here is refactored into a regression task (predicting local daily sea ice concentration). It is unclear whether this new model would have the same predictive skill as the original model over the lead times explored in Andersson et al. Perhaps this is outside the scope of this notebook, but it would be interesting to see whether, if you use the new model to predict an entire month of daily SIC (from which you can get daily SIE and then compute a monthly mean), it achieves similar skill to the original U-Net model's prediction of monthly SIE. I appreciate that for computational reasons the model in this notebook is limited in terms of training and inputs, but I think it would be useful to quantify the effect these limitations have on the prediction skill, especially if someone did want to use this for operational sea ice forecasting.
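
The daily SIC to daily SIE to monthly-mean route described above could look roughly like this. It is a plain-Python sketch and not IceNet code: it uses the commonly used 15% concentration threshold for extent, and the 25 km cell size is an illustrative assumption that should be replaced by the actual grid cell areas.

```python
SIC_THRESHOLD = 0.15  # conventional cut-off for counting a cell as ice-covered
CELL_AREA_KM2 = 25.0 * 25.0  # assumed 25 km x 25 km grid cells (illustrative)


def daily_extent_km2(sic_grid, threshold=SIC_THRESHOLD, cell_area=CELL_AREA_KM2):
    """Sea ice extent for one day: total area of cells at or above the threshold."""
    ice_cells = sum(1 for row in sic_grid for sic in row if sic >= threshold)
    return ice_cells * cell_area


def monthly_mean_extent_km2(daily_grids):
    """Average the daily extents over all days supplied for the month."""
    if not daily_grids:
        raise ValueError("need at least one daily grid")
    extents = [daily_extent_km2(grid) for grid in daily_grids]
    return sum(extents) / len(extents)
```

Applying `monthly_mean_extent_km2` to both the predicted and the observed daily grids for the same month would give two numbers that can be compared directly.
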
[x] Is the notebook unlikely to lead to negative social outcomes, such as (but not limited to) increasing discrimination or injustice?

Other Requirements

[x] All mentioned software should be formally and consistently cited wherever possible.
[x] Acronyms should be spelled out upon first mention.
[x] Licence conditions on images and figures must be respected (Creative Commons, etc.).

annefou commented 2 months ago

@bnubald the two reviewers suggested a few changes. Would you have time to have a look and implement their suggestions? Many thanks.

bnubald commented 2 months ago

Thank you @weiji14, @William-gregory, @annefou, I will go through them, I might be a bit delayed (2-3 weeks) due to focusing on a conference.

annefou commented 1 month ago

Hi @bnubald, just checking with you. How is it going? Any progress with the review? Many thanks.

bnubald commented 1 month ago

Hi @annefou, I've started going through the reviewer comments, will update soon.

bnubald commented 1 month ago

Thank you for reviewing this submission @weiji14.

Apologies for submitting the changes in a bulk PR...

I have gone through and resolved the comments in ReviewNB; only the last one on changing the licensing remains (@annefou, would you be able to advise please? comment is here).

In addition to the inline comments in the notebook, please find below my responses to the checklist items (where actions were requested).

Response to checklist items

[ ] Contribution and authorship: Does the author list seem appropriate and complete (full name, affiliation, and GitHub/ORCID handle)?

Author names, affiliation and GitHub handle sighted in notebook.ipynb, but could not find ORCID. I'd highly recommend ORCID as a persistent identifier, and would also suggest placing all of this information in a CITATION.cff (https://citation-file-format.github.io) file if possible.

Added CITATION.cff with ORCID for authors.

Reproducibility

[ ] Does the notebook run in a local environment?

Installation worked locally on my Linux machine, and I was able to run all the notebook cells from start to end. However, I tried to solve for the conda environment (at https://github.com/eds-book-gallery/67a1e320-7c47-4ea9-8df8-e868326bc90b/blob/main/.binder/environment.yml) on other operating systems using conda-lock. OSX-64 (Intel chip) and OSX-ARM64 (M1/M2/M3 chips) work OK too, but I got the following errors for Windows:

...

Recommendations:

State what operating systems / platforms this notebook is supported for, or fix installation issues on Windows. Will need to ensure that command-line interfaces work also on Windows if choosing the latter.

Mention that this notebook should be able to run on a CPU device, and that a GPU won't be needed. Might need to test if this runs on macOS ARM chips.

Conda environment.yml file does not have an explicit dependency on jupyterlab, need to add this to be able to launch Jupyter Lab directly after local installation.

Recommend not having both conda-forge and defaults channels in the conda environment.yml file. Use nodefaults instead to explicitly disable channel mixing, see https://conda-forge.org/docs/user/tipsandtricks/#using-multiple-channels for context.

Included a section on availability in the notebook under the Compatible platforms section.
Added mention that GPU is not mandatory.
Added jupyterlab to the conda environment file.
Switched defaults channel to nodefaults.

[ ] Are all data sources openly accessible and properly cited (e.g. with citation to a persistent DOI) in the heading section?

ERA5, ORA5 and OSI SAF's sea-ice concentration data sources are linked to, but not properly cited with a DOI.

Added citation with DOI for ERA5, ORA5 and OSI SAF's sea-ice concentration data sources in the notebook.

[ ] Is the notebook truthful and clear about any limitations of the analysis (and potential biases in data and/or tools)?

On the analysis, the limitations of the neural network method are not explicitly stated, but they do mention that IceNet is a probabilistic model, rather than a physics-informed model. The authors mention that an ensemble of trained models is able to generate forecasts with a lead time of 6 months, though these metrics are not quantified against the groundtruth in the notebook. Since a train/val/test set is generated, maybe it would be good to report the test RMSE values to give an indication of performance. Also, there is no mention of how good the model is at predicting extreme highs/lows in sea-ice concentration, only mean/averages, but that is a whole other topic for discussion. On the datasets, there is some mention of how ERA5 and ORA5 are reanalysis datasets which have no spatiotemporal gaps, but no mention about the uncertainties associated with these datasets that are derived from physical and observational data. Are they biased to certain geographies (e.g. the Northern Hemisphere has more historical data coverage) and/or climate regimes? Does the model only work on contemporary timescales for the ERA5 period (1940-present), or can it be used for paleo-timescales (spanning thousands of years), etc?

Added a limitations section covering some aspects of this.

Since the model in the demonstrator notebook includes only a few thousand parameters and is only run for 3 epochs, a comparison of results has been omitted. Some aspects of this are also better suited to a paper format, which seems out of scope here, so I have referred to the original IceNet paper instead. We are currently working on a paper covering the up-to-date IceNet library, but it is still a work in progress.

[ ] All mentioned software should be formally and consistently cited wherever possible.

Missing citations for cartopy, matplotlib, xarray, etc.

Added citations for python modules.

[ ] Acronyms should be spelled out upon first mention.

See review comments at REVIEW eds-book-gallery/67a1e320-7c47-4ea9-8df8-e868326bc90b#5.

Added an acronym section in the header.

bnubald commented 1 month ago

Thank you for reviewing @William-gregory

Please see below for my updates.

Response to checklist items

[ ] No ORCID, but names, affiliations and GitHub handles present

Added ORCID for authors.

[ ] Runs on local MacOS (silicon M1 chip) only with new environment file

Updated environment file.

[ ] If downloading ERA5 data, need to change the pd.date_range end date to “2020-04-30”

Updated end date.

[ ] I notice IceNet also has an ORA5Downloader() class. It would be great to include an example of how to use this in the notebook, since the notebook mentions that ORA5 data are used

ORAS5Downloader wasn't used in this notebook since it also requires setting up of credentials, but an example code block has been added, just like the ERA5Downloader section.

[ ] The anchor links in the Highlights heading do not seem to work

Fixed links

[ ] DOIs are missing for the specific OSI-SAF, ERA5 and ORA5 data used in the notebook

Added citation with DOI for data sources

[ ] The original U-Net model introduced by Andersson et al. was trained on a classification task (predicting local monthly-mean sea ice extent), whereas the model here is refactored into a regression task (predicting local daily sea ice concentration). It is unclear whether this new model would have the same predictive skill as the original model over the lead times explored in Andersson et al. Perhaps this is outside the scope of this notebook, but it would be interesting to see whether, if you use the new model to predict an entire month of daily SIC (from which you can get daily SIE and then compute a monthly mean), it achieves similar skill to the original U-Net model's prediction of monthly SIE. I appreciate that for computational reasons the model in this notebook is limited in terms of training and inputs, but I think it would be useful to quantify the effect these limitations have on the prediction skill, especially if someone did want to use this for operational sea ice forecasting.

This is along the lines of something we plan to tackle in a paper we are currently working on, which covers the current state of IceNet, but I do think it's moving outside the scope of this submission. The plan is to replicate the original study within this library for the paper before doing some comparisons, at which point this will be a more feasible study.

annefou commented 3 weeks ago

Hi @William-gregory and @weiji14 Thanks for reviewing the notebook. Could you confirm that everything is OK so we can proceed to the publication?

William-gregory commented 3 weeks ago

Hi @annefou, sorry for the slow response. I'm happy with the changes! Many thanks

weiji14 commented 2 weeks ago

Hi @annefou, I see that the requested changes are still sitting in a Pull Request at https://github.com/eds-book-gallery/67a1e320-7c47-4ea9-8df8-e868326bc90b/pull/6. Should those be merged into the main branch first before proceeding with the final check, or do you prefer us to do the final review in that pull request directly?

annefou commented 2 weeks ago

Hi @weiji14, the review needs to be done in the pull request directly. Thanks.

weiji14 commented 1 week ago

OK, I've gone through the Pull Request, and am mostly satisfied with the changes. There are still some minor typos and parts that need updating (see e.g. https://github.com/eds-book-gallery/67a1e320-7c47-4ea9-8df8-e868326bc90b/pull/6#discussion_r1651952425), but happy for this notebook to move on to the next stage otherwise :rocket:

bnubald commented 1 week ago

Thanks @William-gregory, @weiji14 for taking the time to go through this.

I've updated and pushed the latest changes to the Pull Request.

@annefou @acocac

annefou commented 1 week ago

Awesome! @acocac I think the publication process can now start. Thank you!

acocac commented 1 week ago

@bnubald and team - Congratulations, your notebook is recommended for publication! 🚀

Huge thanks to our editor: @annefou and reviewers: @William-gregory, @weiji14 — your contributions make this possible 🙏

Next steps (optional for reviewers): @bnubald - I'll contact you to validate the post-print version and confirm a suitable date to announce the publication across the communication channels of the EDS book community.

bnubald commented 1 week ago

That's great!

Thanks everyone (@annefou, @William-gregory, @weiji14, @acocac) for your help and support in getting this through. 👏

alan-turing-institute / environmental-ds-book

[REVIEW] Sea ice forecasting using the IceNet library #239
