Comments on ECAHM notebook

@jukent - we got review comments back on our ECAHM notebook. Can you take a first pass at working through the comments next week?

CONTRIBUTION DETAILS -------------------- ID: 143 Title: Scikit-downscale: an open source Python package for scalable climate downscaling REVIEW RESULT OF THE PROGRAM COMMITTEE: This contribution has been accepted for a plenary session. OVERVIEW OF REVIEWS ------------------- Review 1 ======== Evaluation of the Contribution ------------------------------ Overall Recommendation (100%): 9 Total points (out of 10) : 9 Comments for the Authors ------------------------ I rank this notebook as exemplary in _almost_ every way. For a time I thought there was a problem in code segment 28, but the results did appear after _many_ minutes (probably reflecting how nontrivial certain of the demonstrated computations actually are). In my view this notebook runs correctly and: - exhibits (in its non-executable segments) a writing style that is simultaneously scholarly and suitably explanatory for readers with a range of (technical and scientific) expertise. - makes excellent use of mathematical notation and nicely-clarifying use of (non-animated) data visualization. - exhibits understandable (bite-sized) code segments and, where necessary, elaborates these with (non-executable) pseudo-code to explain underlying methods whose codings are too complex or lengthy to be exposed directly in the notebook. - could serve as a model for scientific reproducibility, including ways to study variations among similar--but not identical--case studies and downscaling methods. - employs/demonstrates sophisticated and elegant software patterns, notably including generalization of a 1-dimensional pattern to make it applicable in higher dimensional contexts. - demonstrates use of parallel computation, which may surprise anyone who thinks notebooks are of lesser value in high-performance computing (HPC) contexts. - addresses directly a multidisciplinary set of needs and (by my take) appears readily applicable to additional domains. - makes good use of formal citations that give appropriate credit to prior work, both scientific and methodological. I do suggest a) giving more credit to the developers of software packages on which this notebook relies and b) acknowledging the source of grant moneys that (I assume) supported work reflected the notebook. - fails to connect itself with EarthCube explicitly or indirectly (i.e., with resources that are themselves linked to EarthCube). This is _not_ a comment on the caliber of the notebook but _may_ detract from its suitability for presentation at the EarthCube Annual Meeting, even though it clearly illustrates principles I consider central to EarthCube's mission. -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- Review 2 ======== Evaluation of the Contribution ------------------------------ Overall Recommendation (100%): 9 Total points (out of 10) : 9 Comments for the Authors ------------------------ Overall, very well-done and polished notebook. I particularly like the focus on including a great deal of information, as well as various relevant links peppered throughout. It was also helpful to see the key features of Scikit-downscale laid out in a bulleted list near the beginning, as this helps put the entire notebook into context. Seeing how closely integrated this package is with scikit-learn is excellent, as most Python data analysts are familiar with that workflow, and this could enable scikit-downscale to be easily added, without much overhead, to existing workflows. In addition to documentation, the code within the cells is extremely well-commented. * Minor Comments for the Authors - Misspelling in Section 2, first bullet - should be "fit / predict" not "fit / precict" - Grammar edit in Section 2.1 first sentence - Section 2.2, spelling "opperate" - Since these models may vary based on sampled data, the values computed may change from run to run. This may not ultimately be a large problem, but if it is believed that an edge case could be encountered in the long run, it may be good to set the random state of the notebook. This way, users would experience exactly the same results as intended at time of writing. - Section 2.2, spelling "seemless" should be "seamless" - Section 2.3, the final 2 plots. It is a bit hard to discriminate the 4 curves shown; I'd suggest using colors that include more contrast to make it easier to visualize - Following Section 2.4 is another Section 2.2, should this be 2.5? - The final code cell where the eager computation takes place runs for an appreciable amount of time. Climate scientists will likely expect this, but it may be good to include a line of text explaining that line of code might take a while to run. -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- Review 3 ======== Evaluation of the Contribution ------------------------------ Overall Recommendation (100%): 8 Total points (out of 10) : 8 Comments for the Authors ------------------------ This notebook demonstrates features of a new Python library (Scikit-downscale) used to downscale large ESM datasets. Overall the notebook is well written and adequately demonstrates the tool. A few comments are below. Suggestions: 1) Dropping into the .fit/.predict methodology right away is a little confusing in this example. I would state more clearly what your data and target is exactly that is being loaded from utils. Is target in this case just the higher resolution version? Is this just for evaluation, or is it actually training a model? From your plots this is not immediately obvious. 2) The section on spatial models is a little weird as it states how they are implemented, but then hints that none are implemented yet but it is future work. If none are currently coded up, I would omit this section for now. 3) There are quite a few spelling mistakes sprinkled through document.

Yup! I will look at this Monday.

On Fri, Jun 5, 2020, 4:02 PM Joe Hamman notifications@github.com wrote:

@jukent https://github.com/jukent - we got review comments back on our ECAHM notebook. Can you take a first pass at working through the comments next week? CONTRIBUTION DETAILS -------------------- ID: 143 Title: Scikit-downscale: an open source Python package for scalable climate downscaling

REVIEW RESULT OF THE PROGRAM COMMITTEE: This contribution has been accepted for a plenary session. OVERVIEW OF REVIEWS Review 1 Evaluation of the Contribution

Overall Recommendation (100%): 9 Total points (out of 10) : 9 Comments for the Authors

I rank this notebook as exemplary in almost every way. For a time I thought there was a problem in code segment 28, but the results did appear after many minutes (probably reflecting how nontrivial certain of the demonstrated computations actually are). In my view this notebook runs correctly and:

exhibits (in its non-executable segments) a writing style that is simultaneously scholarly and suitably explanatory for readers with a range of (technical and scientific) expertise.

makes excellent use of mathematical notation and nicely-clarifying use of (non-animated) data visualization.

exhibits understandable (bite-sized) code segments and, where necessary, elaborates these with (non-executable) pseudo-code to explain underlying methods whose codings are too complex or lengthy to be exposed directly in the notebook.

could serve as a model for scientific reproducibility, including ways to study variations among similar--but not identical--case studies and downscaling methods.

employs/demonstrates sophisticated and elegant software patterns, notably including generalization of a 1-dimensional pattern to make it applicable in higher dimensional contexts.

demonstrates use of parallel computation, which may surprise anyone who thinks notebooks are of lesser value in high-performance computing (HPC) contexts.

addresses directly a multidisciplinary set of needs and (by my take) appears readily applicable to additional domains.

makes good use of formal citations that give appropriate credit to prior work, both scientific and methodological. I do suggest a) giving more credit to the developers of software packages on which this notebook relies and b) acknowledging the source of grant moneys that (I assume) supported work reflected the notebook.

fails to connect itself with EarthCube explicitly or indirectly (i.e., with resources that are themselves linked to EarthCube). This is not a comment on the caliber of the notebook but may detract from its suitability for presentation at the EarthCube Annual Meeting, even though it clearly illustrates principles I consider central to EarthCube's mission.

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- Review 2 Evaluation of the Contribution

Overall Recommendation (100%): 9 Total points (out of 10) : 9 Comments for the Authors

Overall, very well-done and polished notebook. I particularly like the focus on including a great deal of information, as well as various relevant links peppered throughout. It was also helpful to see the key features of Scikit-downscale laid out in a bulleted list near the beginning, as this helps put the entire notebook into context. Seeing how closely integrated this package is with scikit-learn is excellent, as most Python data analysts are familiar with that workflow, and this could enable scikit-downscale to be easily added, without much overhead, to existing workflows. In addition to documentation, the code within the cells is extremely well-commented.

Minor Comments for the Authors

Misspelling in Section 2, first bullet - should be "fit / predict" not "fit / precict"

Grammar edit in Section 2.1 first sentence

Section 2.2, spelling "opperate"

Since these models may vary based on sampled data, the values computed may change from run to run. This may not ultimately be a large problem, but if it is believed that an edge case could be encountered in the long run, it may be good to set the random state of the notebook. This way, users would experience exactly the same results as intended at time of writing.

Section 2.2, spelling "seemless" should be "seamless"

Section 2.3, the final 2 plots. It is a bit hard to discriminate the 4 curves shown; I'd suggest using colors that include more contrast to make it easier to visualize

Following Section 2.4 is another Section 2.2, should this be 2.5?

The final code cell where the eager computation takes place runs for an appreciable amount of time. Climate scientists will likely expect this, but it may be good to include a line of text explaining that line of code might take a while to run.

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- Review 3 Evaluation of the Contribution

Overall Recommendation (100%): 8 Total points (out of 10) : 8 Comments for the Authors

This notebook demonstrates features of a new Python library (Scikit-downscale) used to downscale large ESM datasets. Overall the notebook is well written and adequately demonstrates the tool. A few comments are below.

Suggestions:

Dropping into the .fit/.predict methodology right away is a little confusing in this example. I would state more clearly what your data and target is exactly that is being loaded from utils. Is target in this case just the higher resolution version? Is this just for evaluation, or is it actually training a model? From your plots this is not immediately obvious.

The section on spatial models is a little weird as it states how they are implemented, but then hints that none are implemented yet but it is future work. If none are currently coded up, I would omit this section for now.

There are quite a few spelling mistakes sprinkled through document.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/jhamman/scikit-downscale/issues/23, or unsubscribe https://github.com/notifications/unsubscribe-auth/ALEGIO34ZXIOCBL2ARCMM5TRVFTPTANCNFSM4NVHPHIA .

Joe,

Are you taking the lead on giving the presentation tomorrow? And do we know exactly what time this notebook will be presented?

Julia

On Fri, Jun 5, 2020 at 4:36 PM Julia Kent jkent@ucar.edu wrote:

Yup! I will look at this Monday.

On Fri, Jun 5, 2020, 4:02 PM Joe Hamman notifications@github.com wrote:

@jukent https://github.com/jukent - we got review comments back on our ECAHM notebook. Can you take a first pass at working through the comments next week? CONTRIBUTION DETAILS -------------------- ID: 143 Title: Scikit-downscale: an open source Python package for scalable climate downscaling

REVIEW RESULT OF THE PROGRAM COMMITTEE: This contribution has been accepted for a plenary session. OVERVIEW OF REVIEWS Review 1 Evaluation of the Contribution

Overall Recommendation (100%): 9 Total points (out of 10) : 9 Comments for the Authors

I rank this notebook as exemplary in almost every way. For a time I thought there was a problem in code segment 28, but the results did appear after many minutes (probably reflecting how nontrivial certain of the demonstrated computations actually are). In my view this notebook runs correctly and:

exhibits (in its non-executable segments) a writing style that is simultaneously scholarly and suitably explanatory for readers with a range of (technical and scientific) expertise.

makes excellent use of mathematical notation and nicely-clarifying use of (non-animated) data visualization.

exhibits understandable (bite-sized) code segments and, where necessary, elaborates these with (non-executable) pseudo-code to explain underlying methods whose codings are too complex or lengthy to be exposed directly in the notebook.

could serve as a model for scientific reproducibility, including ways to study variations among similar--but not identical--case studies and downscaling methods.

employs/demonstrates sophisticated and elegant software patterns, notably including generalization of a 1-dimensional pattern to make it applicable in higher dimensional contexts.

demonstrates use of parallel computation, which may surprise anyone who thinks notebooks are of lesser value in high-performance computing (HPC) contexts.

addresses directly a multidisciplinary set of needs and (by my take) appears readily applicable to additional domains.

makes good use of formal citations that give appropriate credit to prior work, both scientific and methodological. I do suggest a) giving more credit to the developers of software packages on which this notebook relies and b) acknowledging the source of grant moneys that (I assume) supported work reflected the notebook.

fails to connect itself with EarthCube explicitly or indirectly (i.e., with resources that are themselves linked to EarthCube). This is not a comment on the caliber of the notebook but may detract from its suitability for presentation at the EarthCube Annual Meeting, even though it clearly illustrates principles I consider central to EarthCube's mission.

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- Review 2 Evaluation of the Contribution

Overall Recommendation (100%): 9 Total points (out of 10) : 9 Comments for the Authors

Overall, very well-done and polished notebook. I particularly like the focus on including a great deal of information, as well as various relevant links peppered throughout. It was also helpful to see the key features of Scikit-downscale laid out in a bulleted list near the beginning, as this helps put the entire notebook into context. Seeing how closely integrated this package is with scikit-learn is excellent, as most Python data analysts are familiar with that workflow, and this could enable scikit-downscale to be easily added, without much overhead, to existing workflows. In addition to documentation, the code within the cells is extremely well-commented.

Minor Comments for the Authors

Misspelling in Section 2, first bullet - should be "fit / predict" not "fit / precict"

Grammar edit in Section 2.1 first sentence

Section 2.2, spelling "opperate"

Since these models may vary based on sampled data, the values computed may change from run to run. This may not ultimately be a large problem, but if it is believed that an edge case could be encountered in the long run, it may be good to set the random state of the notebook. This way, users would experience exactly the same results as intended at time of writing.

Section 2.2, spelling "seemless" should be "seamless"

Section 2.3, the final 2 plots. It is a bit hard to discriminate the 4 curves shown; I'd suggest using colors that include more contrast to make it easier to visualize

Following Section 2.4 is another Section 2.2, should this be 2.5?

The final code cell where the eager computation takes place runs for an appreciable amount of time. Climate scientists will likely expect this, but it may be good to include a line of text explaining that line of code might take a while to run.

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- Review 3 Evaluation of the Contribution

Overall Recommendation (100%): 8 Total points (out of 10) : 8 Comments for the Authors

This notebook demonstrates features of a new Python library (Scikit-downscale) used to downscale large ESM datasets. Overall the notebook is well written and adequately demonstrates the tool. A few comments are below.

Suggestions:

Dropping into the .fit/.predict methodology right away is a little confusing in this example. I would state more clearly what your data and target is exactly that is being loaded from utils. Is target in this case just the higher resolution version? Is this just for evaluation, or is it actually training a model? From your plots this is not immediately obvious.

The section on spatial models is a little weird as it states how they are implemented, but then hints that none are implemented yet but it is future work. If none are currently coded up, I would omit this section for now.

There are quite a few spelling mistakes sprinkled through document.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/jhamman/scikit-downscale/issues/23, or unsubscribe https://github.com/notifications/unsubscribe-auth/ALEGIO34ZXIOCBL2ARCMM5TRVFTPTANCNFSM4NVHPHIA .

-- Julia Kent I/O & Workflow Applications (IOWA) Software Engineer

Phone: x2488 Office: ML494B

Yes. I'll take the lead. The schedule just has us on at 12:30p PT. I wouldn't treat the order of panelists listed on the schedule as gold here. I plan to attend for the full hour.

Thanks again for the help here @jukent!

pangeo-data / scikit-downscale

Comments on ECAHM notebook #23