Closed Zeitsperre closed 1 year ago
Nice! I didn't read everything but indeed, back in the day, @tlvu warned us of using another git repo for the testing data. On one hand, I'm not convinced we need to be able to run tests on older versions. In theory, at the time of release, all tests were passing, no? On the other one, I realize we use the testing data in the notebooks and not being able to reproduce those seems more problematic to me.
Thus indeed, I guess that tagging a testdata version would help solve this! (Long, I tagged you in cas you had any advice? This comment refers to the first box of section Functionality above)
My previous worries about splitting the testdata with the code https://github.com/Ouranosinc/xclim-testdata/pull/1#issuecomment-704281014
So tagging the testdata should solve this reproducibility issue.
But to ensure smooth dev workflow, the code should allow overriding the tag with a branch name. During dev cycle, both the testdata and the code would most probably move together. Without the override capability, we will have to continuously tag the testdata so it can be used with the code and this can get tedious.
However, with this tag override capability, we must not forget to tag the final version of the testdata and bump that tag on the code side before merging. The tag should be the default value when no override is used.
@tlvu
Thanks for the suggestion on how best to proceed for this. We now have a testing data tagging scheme and some GitHub Actions to prevent us from accidentally breaking it during some development branch. It's nothing fancy, but we should be able to more easily test older versions of xclim
going forward.
Except for #1342, we managed to address all major comments in under a work-week. Nicely done, team!
Originally posted by @jmunroe in https://github.com/pyOpenSci/software-submission/issues/73#issuecomment-1484520663
This issue is a tickbox summary of comments from the reviewer that seemed addressable in the near-term.
The foreword to the review:
Documentation
TJS: This is addressed in #1338
TJS: This is addressed in #1338
TJS: This is addressed in #1338
Functionality
TJS: The reason why this is occurring is because we made significant changes to our
xclim-testdata
repository in recent versions. I realize now that this is breaking because we aren't tagging explicit versions/commits of the testdata that are guaranteed to work. I'm thinking that we might want to start doing that from now on, rather than always point atmaster
. @aulemahal, what do you think? Update: This is addressed in #1339TJS:
pylint
is configured but we do not currently pass those compliance checks (run with allowed failure). If the amount of effort to get us passing is reasonable, I'll attempt to get this working.Review Comments
Installation notes
xclim
documentation under Installation, I created a separate conda environment to install the required dependencies:And there I hit my first issue:
The fix (at least for conda 22.11.1) is that
--file
is an option to pass toconda env create
and notconda create
. This needs to be fixed in the install instructions.TJS: This is addressed in #1338
flox
,SBCK
,eigen
,eofs
,pybind
. Since I used conda'senvironment.yml
the extraseofs
,eigen
,pybind11
were already included.I confess I tend to get confused when there is the option of using either
environment.yml
andrequirements_*.txt
files. So, I skipped the instructions following 'Extra Dependencies' in the documentation. I assume there must be situtations when I should and should not install these extra dependencies but as a new user of the package, I don't what those situations are yet. Since theses installation instructions are right near the top of the documenation, perhaps it would be better for the maintainers to make those choices for me? For example, I am now wondering "should I be installingflox
?". Since it is 'highly recommended', would it not make more sense to have it as part of the default instructions?TJS: This is addressed in #1338
Basic Usage
My initial reading of this code made me think that this ERA5 dataset was something I need to first download locally (I did not distinguish between
xr.open_dataset
andopen_dataset
in my very first glance at the code). After some review, I see now that there companion GitHub repo that was available that had testing data and thexclim.testing
API automatically makes a locally cached copy of this file. I think it would be clearer if this very first example was written out asso that it was clear that the
open_dataset
was utility method of the testing framework for xclim.TJS: This is addressed in #1338
In the example of Health checks and metadata attributes there is a typo:
should be
TJS: This is addressed in #1338
While in-code comments are generally fine, these last few examples on graphics feel tacked on given the strong narrative text established in the beginning of the Basic Usage section of the documentation.
TJS: This is addressed in #1338
Examples
Workflow Examples
Minor spelling error in the docs:
xarray
is tightly in(t)egrated withdask
, a package that can automatically parallelize operations.TJS: This is addressed in #1338
This is confusing because, as the first example workflow, the user has not yet been shown to use the clisops package. Should there be a subsub-section immediately before such as Subsetting and selecting data with cliops to demonstrate that recommended workflow?
TJS: This is addressed in #1338
that leads to the warning
I think the offending line should be changed to
(and elsewhere in the documenation where seaborn styles are used)
TJS: This is addressed in #1338
TJS: This is addressed in #1338
don't appear to match the values used in the code. I assume the code comments just need to be updated.
PB: This is addressed in #1338
Ensemble-Reductinon Techniques
Is this "criteria" array effectively the equivalent of creating a feature matrix used in data science?
TJS: This is addressed in #1341
Ensemble-Reductinon Techniques
Statistical Downscaling and Bias-Adjustment
TJS: Many typos and grammatical errors have been addressed in #1338
TJS: Many typos and grammatical errors have been addressed in #1338