Closed Zeitsperre closed 1 year ago
To whom it may concern, our paper is available on a separate branch within xclim. The Pull Request containing the paper can be found here: https://github.com/Ouranosinc/xclim/pull/250
hi @Zeitsperre welcome to pyOpenSci!! are you a part of the pangeo community by chance? i think i saw xclim discussed in a few pangeo threads.
Hi @lwasser, nice to meet you!
We are not formally part of the Pangeo community (we do not receive compensation from them), however xclim is mentioned in their list of xarray-based projects (https://docs.xarray.dev/en/latest/ecosystem.html#ecosystem). A handful of xclim's developers have also made significant contributions to packages in the Pangeo ecosystem as well (xarray, xesmf, flox, cf-xarray, etc.) and some key developers of xarray have contributed to xclim, through issue raising or code contributions. I think it's safe to say that the objectives of xclim are in line with those of Pangeo.
awesome! nice to meet you as well @Zeitsperre ! I asked because we are working on a partnership with pangeo to curate that list via peer review. as a part of that curation, we have a small set of checks that are specific to the pangeo community including:
So we'd want to include you in that given you're in the list of tools. i haven't done a deep dive into your package but these are universal pangeo standards. If the review goes smoothly, we'd then tag your package as being pangeo community related / vetted.
does this package adhere to the above pangeo standards? Many thanks - i'm just trying to figure out the best path here to support our partnership :) i just started working on this and have an open pr here to update our review submission template!
I believe we tick all the boxes you listed:

- `dask`
- `xarray` (left to handle all file loading or writing)

I know that the core developers of xclim would gladly welcome feedback on how to better organize/structure our code base, and xclim being considered high enough in quality to be formally Pangeo-endorsed would be great!
To add some clarification: I just remembered that we have some functionality to build indicator catalogues using YAML configurations (https://xclim.readthedocs.io/en/latest/notebooks/extendxclim.html#YAML-file).
We also built a translation engine for providing multi-language climate indicator metadata based on JSON descriptions (currently we have translations to French, but any other language can be supported; https://xclim.readthedocs.io/en/latest/internationalization.html).
These files are loaded on import or can be supplied by the user explicitly and are solely used for configuration. Hope this helps!
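As a rough illustration of how JSON-based metadata translation can work (the schema, catalogue, and function names here are hypothetical, not xclim's actual internationalization API):

```python
import json

# Hypothetical catalogue -- xclim's actual JSON schema differs; see the
# internationalization documentation linked above.
catalogue = json.loads("""
{
  "fr": {
    "tg_mean": {"long_name": "Moyenne de la température journalière"}
  }
}
""")

def translated_attrs(indicator: str, locale: str, default: dict) -> dict:
    """Overlay any translations available for `locale` onto an indicator's
    default (English) metadata attributes."""
    overrides = catalogue.get(locale, {}).get(indicator, {})
    return {**default, **overrides}

attrs = translated_attrs(
    "tg_mean", "fr", {"long_name": "Mean daily temperature", "units": "K"}
)
```

The point of the overlay pattern is that untranslated fields (here, `units`) fall back to the default language rather than disappearing.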
perfect! Thank you @Zeitsperre for the speedy replies! I am going to follow up and also get our editor in chief involved, @NickleDave. I stepped in only because i'm actively working with pangeo now, so i just wanted to check in to see if this review could be supported via that collaboration as well. And it does sound like it could. more soon!
Welcome @Zeitsperre!
At first glance it looks like you have almost everything in line with our initial checks.
I will add the completed checks by end of day tomorrow.
Hi again @Zeitsperre -- happy to report that `xclim` passed initial editor-in-chief checks (below) with flying colors.
I have one suggestion, which we strongly recommend, but do not require:
We expect to know how we will go about selecting editors by the end of day Monday (Jan. 23rd).
@lwasser needs to talk a little more with people about the Pangeo collaboration.
We will report back to you then with a timeline for the review.
Please check our Python packaging guide for more information on the elements below.

- The package imports properly: `import package-name`.
- Suggestion: open the docs with a short explanation of what `xclim` is and does. Getting hit with a long table of contents can be daunting.
- `README.md` file with clear explanation of what the package does, instructions on how to install it, and a link to development instructions.
- `CONTRIBUTING.md` file that details how to install and contribute to the package.
- Code of Conduct file.
- `YAML` header of the issue (located at the top of the issue template).

Thanks, @NickleDave! I'm glad to see that all the hours spent reading PEPs and adopting good practices have been helpful here!
I'll be opening a PR later today to address the PEP 517 and 621 compliance comment. We don't do anything too fancy in our `setup.py`, so I don't imagine it should be difficult to adopt `pyproject.toml`. I'll link that here once it's underway.
@NickleDave I just merged changes to address some of your comments, primarily for `pyproject.toml` and the README suggestion. I opted to adopt `flit` as it seemed to be a very stable and no-nonsense setup engine. It took a bit of time to migrate some configurations around, but it looks/works much better.
Agreed on your point about Basic Usage. I'll be opening a ticket to address that down the line. There are a fair number of Pull Requests currently open, but we tend to address things in a reasonable amount of time.
Let me know if you have any other suggestions, thanks!
Hi @Zeitsperre!
I just merged changes to address some of your comments
Awesome! Glad to hear it.
Agreed on your point about Basic Usage.
Thank you for even making note of that. It was just a nitpick and really more of a list to give an editor things they might want to revisit. Your docs look really great over all and I'm sure they'll look better after the review.
I'm sorry for not getting back to you by the end of the day yesterday; totally my fault.
I want to let you know we do have an editor, @Batalex! 🎉🎉🎉 who will introduce themselves and officially start the review in the next couple of days. We are actively recruiting reviewers right now as well.
Hi @Zeitsperre, I am pleased to meet you! I will be the editor for this review, and I will get up to speed during the next few days with the pre-submission discussions, the editorial process and the package itself before recruiting reviewers. I appreciate your patience; I am looking forward to working with you 🫡
Hello @Batalex, nice to meet you as well!
That all sounds great! I'm looking forward to hearing how we can improve xclim. Feel free to ping me if you have any questions or see anything obvious that could be improved in advance of the formal peer-review process.
Thanks for your time and effort towards this!
:wave: Hi @jmunroe and @aguspesce! Thank you for volunteering to review Xclim for pyOpenSci!
Meet @Zeitsperre, our lovely submission author, and feel free to introduce yourselves 🥰
The following resources will help you complete your review:
Please get in touch with any questions or concerns! I just wanted to let you know that your review is due: Feb 26th.
Also, I'd like to point out that the following PR should be part of your review: https://github.com/Ouranosinc/xclim/pull/250.
PS: I'm sorry for the information dump. Please make sure that you review the submitted version; some issues you point out might have already been solved in the meantime, but that's OK.
Hey all,
I just wanted to keep people in the loop, but our next version (v0.41) is slated for release on February 24, 2023. Assuming that the degree of changes suggested/recommended is reasonable to address, I feel that the xclim devs can likely prioritize adjustments for the subsequent release (v0.42).
One question that we had concerning the JOSS paper: is it required that the paper branch be merged into the main development branch, or can it be hosted solely in its own branch? What is typical practice concerning the JOSS requirements?
https://joss.readthedocs.io/en/latest/submitting.html#submission-requirements
Your paper (paper.md and BibTeX files, plus any figures) must be hosted in a Git-based repository together with your software (although they may be in a short-lived branch which is never merged with the default).
Hello @Zeitsperre, You need to put the paper in the main branch, as @NickleDave told you.
I might be reading their docs wrong, but I understood it to say that the paper files can be on some other branch that never gets merged with main.
See for example this published paper that has the files in a joss-rev branch: https://github.com/parmoo/parmoo/commits/joss-rev
I might be reading their docs wrong, but I understood it to say that the paper files can be on some other branch that never gets merged with main.
That's what I read as well, but I wanted to confirm. In any case, we can cross that bridge when we get there. The paper is still up-to-date with the current aims and structure of the package, but I would like to add a few comments and clarifications here and there; I anticipate that we'll be modifying it some more in the coming weeks.
Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide
The package includes all the following forms of documentation:
`pyproject.toml` file or elsewhere.

Readme file requirements. The package meets the readme requirements below:

The README should include, from top to bottom:
NOTE: If the README has many more badges, you might want to consider using a table for badges: see this example. Such a table should be wider than it is high. (Note that the badge for pyOpenSci peer-review will be provided upon acceptance.)
Reviewers are encouraged to submit suggestions (or pull requests) that will improve the usability of the package as a whole. Package structure should follow general community best-practices. In general please consider whether:
Note: Be sure to check this carefully, as JOSS's submission requirements and scope differ from pyOpenSci's in terms of what types of packages are accepted.
The package contains a `paper.md` matching JOSS's requirements with:
Estimated hours spent reviewing: 6
Thank you for the opportunity to review this code. The package looks very good and a lot of effort has been put into its development. I only have minor suggestions and observations, so I will enumerate them in order of priority.
About the documentation
I found some issues on the documentation website that don't look right.
Type annotations are a great tool, but sometimes they might confuse users reading the API documentation.
For example, when a function has a lot of parameters, the type annotations can clutter its documentation.
So I suggest using the `sphinx_autodoc_typehints` extension in `docs/conf.py`, which hides the type annotations when Sphinx builds the documentation.
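For concreteness, enabling the extension would look roughly like this in `docs/conf.py` (a sketch, not xclim's actual configuration; check the extension's documentation for the exact options):

```python
# docs/conf.py (sketch, not xclim's actual configuration)
extensions = [
    "sphinx.ext.autodoc",
    "sphinx_autodoc_typehints",  # moves type hints out of signatures into descriptions
]
# Alternatively, plain autodoc can suppress type hints entirely:
# autodoc_typehints = "none"
```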
The documentation for indicators and indices has a formatting problem in the Returns section. For example, `xclim.indicators.atmos.cffwis_indices` returns multiple values, and this makes the Returns section difficult to understand:
Returns

- `dc` (DataArray) – Drought Code (`drought_code`), with additional attributes: description: Numeric rating of the average moisture content of deep, compact organic layers.
- `dmc` (DataArray) – Duff Moisture Code (`duff_moisture_code`), with additional attributes: description: Numeric rating of the average moisture content of loosely compacted organic layers of moderate depth.
- `ffmc` (DataArray) – Fine Fuel Moisture Code (`fine_fuel_moisture_code`), with additional attributes: description: Numeric rating of the average moisture content of litter and other cured fine fuels.
- `isi` (DataArray) – Initial Spread Index (`initial_spread_index`), with additional attributes: description: Numeric rating of the expected rate of fire spread.
- `bui` (DataArray) – Buildup Index (`buildup_index`), with additional attributes: description: Numeric rating of the total amount of fuel available for combustion.
- `fwi` (DataArray) – Fire Weather Index (`fire_weather_index`), with additional attributes: description: Numeric rating of fire intensity.
The API documentation is a very long page, and it is difficult to navigate and find definitions. This can be improved by adding a table of contents.
Why does the `xclim.sdba` API have a separate section instead of being in the USER API section? As a user who wants a description of any functionality of this subpackage, I would go looking for it in that section. I think it would be better if all the APIs were in the USER API section.
About the repository
In my opinion, I like to see the licence type in the README file, like:
## License
This is free software: you can redistribute it and/or modify it under the terms
of the Apache License 2.0.
A copy of this license is provided in LICENSE.txt.
The `CONTRIBUTING.rst` file is in the `.github/` directory. I suggest moving it to the main directory (`./`) because it would be easier to find.
When I ran the tests using `make test`, only one test failed:
FAILED xclim/testing/tests/test_atmos.py::TestMeanRadiantTemperature::test_mean_radiant_temperature
===== 1 failed, 1398 passed, 66 skipped, 2 xfailed, 1 xpassed, 95 warnings in 315.63s (0:05:15) ======
make: *** [Makefile:68: test] Error 1
The JOSS paper

"xclim started at Ouranos in 2018 out of the need to deliver data for a pan-Canadian atlas of climate indicators."

What is Ouranos? Maybe add a line about this or a link to the website.

The code

About the code, I don't have any objections. I have only noted that the code is sometimes difficult to follow, so I just have a few observations that might help improve this in future developments.
For example, `xclim.ensembles.create_ensemble` has an argument called `mf_flag`. This argument is a bool by definition, so the user can understand that it's a flag. Therefore, this argument could be called `multifile` to be more readable.
Another example is `xclim.core.convert_calendar`. It has two private functions (`_yearly_interp_doy`, `_yearly_random_doy`) that could be moved out of the main function. Since the function contains a very large if statement, this would make the branching easier to follow.
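As a generic illustration of that refactor (hypothetical helpers, not xclim's actual `convert_calendar` code), hoisting the nested private functions to module level turns the large if statement into a short dispatch:

```python
# Illustrative sketch only (hypothetical functions, not xclim's convert_calendar):
# hoisting nested private helpers to module level keeps the main function's
# branching short and easy to scan.

def _interp_doy(doy: int, src_len: int, tgt_len: int) -> int:
    """Map a day-of-year proportionally between calendars of different lengths."""
    return round(doy * tgt_len / src_len)

def _clamp_doy(doy: int, tgt_len: int) -> int:
    """Clamp a day-of-year to the target calendar's length."""
    return min(doy, tgt_len)

def convert_doy(doy: int, src_len: int, tgt_len: int, method: str = "interp") -> int:
    # The if statement is now a small dispatch over named helpers.
    if method == "interp":
        return _interp_doy(doy, src_len, tgt_len)
    if method == "clamp":
        return _clamp_doy(doy, tgt_len)
    raise ValueError(f"Unknown method: {method}")
```

Each branch now has a name and a docstring, so the dispatch reads as a table of strategies rather than one long block.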
Hi @jmunroe
We're planning to address all comments from the pyOpenSci review process for the next release (around 1~1.5 months away). There's no rush, but I just want to confirm that nothing in @aguspesce's comments runs counter to your preliminary evaluation/thoughts. Thanks!
I don't see anything in that review that I would disagree with. I'm aiming to have my own review submitted within the week.
Thanks, @aguspesce, for your effort!
@Zeitsperre I was thinking of reviewing Xclim a bit after our reviewers. The more, the merrier 😄
Much appreciated!
Except for one or two more challenging fixes, we should be done addressing @aguspesce's comments in the next few days.
Hey there,
Just letting people know that the comments to date have all been addressed. The main development branch (and `latest` documentation) reflect these changes. Our next release is slated for the end of March, so if more comments arrive this week, we should be able to address them next week, in time for the release.
Thanks again!
Hello @Zeitsperre, Thanks for your responsiveness and communication on Xclim's release schedule. I am following up with @jmunroe to give you the best estimate possible as to when you will have the second review.
I really appreciate your patience on this submission.
Hey @Zeitsperre, Thanks again for your patience. Due to unforeseen personal issues, the second review was delayed but is still on the way. Cheers!
Thanks so much for your patience with me. I have now gone through the package carefully, following the pyOpenSci review instructions. This package is an incredibly thorough treatment of climate-based metrics and I found it a real pleasure to review. A few issues I identified are presented below, but none of them are serious, and I recommend this package be accepted by pyOpenSci.
I hope that with acceptance by pyOpenSci (and hopefully by JOSS as well), xclim will continue to be extended and adopted by a wide international community of climate scientists and professionals. This is an important body of work that should empower those trying to understand, mitigate, and adapt to our changing climate.
Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide
The package includes all the following forms of documentation:
However, please see the comments regarding installation documentation below.
Nice documentation! I really appreciated that the documentation was not just about the API for each function but included significant detail on what the actual calculation or metric did (with the associated reference).
Thanks for already moving `CONTRIBUTING.rst` to the top-level directory.
`pyproject.toml` file or elsewhere.

Readme file requirements. The package meets the readme requirements below:

- `README.rst` file in the root directory.

The README should include, from top to bottom:
NOTE: If the README has many more badges, you might want to consider using a table for badges: see this example. Such a table should be wider than it is high. (Note that a badge for pyOpenSci peer-review will be provided upon acceptance.)
The README does not include explicit installation instructions. It does refer to the package documentation which does contain installation instructions.
Not applicable for xclim
Descriptive links to all vignettes are not in the README, but they are easily findable in the included package documentation.
- [x] Brief demonstration of package usage (as it makes sense - links to vignettes could also suffice here if package description is clear)
I am not aware of another Python package in the scientific ecosystem that does exactly the same thing as xclim. There are potentially related packages (such as MetPy for analyzing weather data, or climpred for computing metrics for earth system forecasts) in the broader ecosystem that could be referenced in the documentation, in case a potential user would be better served by a different project.
Another potential Python package to reference is NCAR's GeoCAT (which is part of NCAR's overall strategy to migrate away from NCL to Python). Historically, my understanding is that non-Python tools like NCL were more widely used to do things similar to xclim.
All three of these packages (MetPy, climpred, and GeoCAT) are built on the same xarray/dask foundations as xclim.
Reviewers are encouraged to submit suggestions (or pull requests) that will improve the usability of the package as a whole. Package structure should follow general community best-practices. In general please consider whether:
[x] Installation: Installation succeeds as documented.
[x] Functionality: Any functional claims of the software been confirmed.
[x] Performance: Any performance claims of the software been confirmed.
I focused on the worked examples provided in the documentation (consisting of ~10 `.ipynb` notebooks). xclim appears to work as described, and the indicators and indices that I examined all looked reasonable and are very well documented.
I am unable to independently verify the correct implementation of every indicator/index included with this software. However, the significant amount of detail in the documentation and the substantial references to peer-reviewed literature give me a high degree of confidence in the implementation.
[ ] Automated tests: Tests cover essential functions of the package and a reasonable range of inputs and conditions. All tests pass on the local machine.
I tried to verify locally with `pytest` on Python 3.10. There were a few failed tests in a local dev environment (33 failed, 1359 passed); I have not attempted to debug any of these failing tests. I also note that on GitHub CI, with more recent fixes than the tag 0.40.0 I was reviewing, it appears the entire testing suite passes without failures.

I am more suspicious of my own development environment than of xclim itself, and I am satisfied if the testing suite passes on GitHub Actions, as it appears to do.
$ pytest xclim --numprocesses=logical --durations=10 --cov=xclim --cov-report=term-missing
=============================================== short test summary info ================================================
FAILED xclim/testing/tests/test_atmos.py::TestWaterBudget::test_convert_units - AttributeError: 'Dataset' object has no attribute 'rsus'. Did you mean: 'rsds'?
FAILED xclim/testing/tests/test_atmos.py::TestWaterBudget::test_nan_values - AttributeError: 'Dataset' object has no attribute 'rsus'. Did you mean: 'rsds'?
FAILED xclim/testing/tests/test_atmos.py::TestUTCI::test_universal_thermal_climate_index - AttributeError: 'Dataset' object has no attribute 'rsus'. Did you mean: 'rsds'?
FAILED xclim/testing/tests/test_atmos.py::TestPotentialEvapotranspiration::test_convert_units - AttributeError: 'Dataset' object has no attribute 'rsus'. Did you mean: 'rsds'?
FAILED xclim/testing/tests/test_atmos.py::TestPotentialEvapotranspiration::test_nan_values - AttributeError: 'Dataset' object has no attribute 'rsus'. Did you mean: 'rsds'?
FAILED xclim/testing/tests/test_atmos.py::test_wind_chill_index - AssertionError:
FAILED xclim/testing/tests/test_cffwis.py::TestCFFWIS::test_fire_weather_ufunc_overwintering - AssertionError:
FAILED xclim/testing/tests/test_ffdi.py::TestFFDI::test_ffdi_indicators[xlim-True] - KeyError: 'rh'
FAILED xclim/testing/tests/test_ffdi.py::TestFFDI::test_ffdi_indicators[xlim-False] - KeyError: 'rh'
FAILED xclim/testing/tests/test_ffdi.py::TestFFDI::test_ffdi_indicators[discrete-True] - KeyError: 'rh'
FAILED xclim/testing/tests/test_ffdi.py::TestFFDI::test_ffdi_indicators[discrete-False] - KeyError: 'rh'
FAILED xclim/testing/tests/test_indices.py::TestTG::test_simple[tg_mean-283.1391] - AssertionError:
FAILED xclim/testing/tests/test_indices.py::TestTG::test_simple[tg_min-266.1117] - AssertionError:
FAILED xclim/testing/tests/test_indices.py::TestTG::test_simple[tg_max-292.125] - AssertionError:
FAILED xclim/testing/tests/test_precip.py::TestDaysWithSnow::test_simple - AssertionError:
FAILED xclim/testing/tests/test_precip.py::test_days_over_precip_doy_thresh - AssertionError:
FAILED xclim/testing/tests/test_precip.py::test_days_over_precip_thresh - AssertionError:
FAILED xclim/testing/tests/test_precip.py::test_days_over_precip_thresh__seasonal_indexer - AssertionError:
FAILED xclim/testing/tests/test_precip.py::test_fraction_over_precip_doy_thresh - AssertionError:
FAILED xclim/testing/tests/test_precip.py::test_fraction_over_precip_thresh - AssertionError:
FAILED xclim/testing/tests/test_precip.py::test_dry_spell - AssertionError:
FAILED xclim/testing/tests/test_precip.py::test_dry_spell_frequency_op - RuntimeError: NetCDF: HDF error
FAILED xclim/testing/tests/test_run_length.py::test_run_bounds_data[True] - AssertionError:
FAILED xclim/testing/tests/test_run_length.py::test_keep_longest_run_data[True] - AssertionError:
FAILED xclim/testing/tests/test_run_length.py::test_run_bounds_data[False] - AssertionError:
FAILED xclim/testing/tests/test_run_length.py::test_keep_longest_run_data[False] - AssertionError:
FAILED xclim/testing/tests/test_snow.py::TestSndMaxDoy::test_no_snow - AssertionError:
FAILED xclim/testing/tests/test_temperature.py::TestWarmSpellDurationIndex::test_warm_spell_duration_index - AssertionError:
FAILED xclim/testing/tests/test_temperature.py::TestFreezeThawSpell::test_freezethaw_spell_frequency - AssertionError:
FAILED xclim/testing/tests/test_temperature.py::TestFreezeThawSpell::test_freezethaw_spell_mean_length - AssertionError:
FAILED xclim/testing/tests/test_temperature.py::TestFreezeThawSpell::test_freezethaw_spell_max_length - AssertionError:
FAILED xclim/testing/tests/test_temperature.py::test_corn_heat_units - AssertionError:
FAILED xclim/testing/tests/test_sdba/test_properties.py::TestProperties::test_spatial_correlogram - AssertionError:
=================== 33 failed, 1359 passed, 67 skipped, 2 xfailed, 96 warnings in 341.70s (0:05:41) ====================
[x] Continuous Integration: Has continuous integration setup (We suggest using Github actions but any CI platform is acceptable for review)
[x] Packaging guidelines: The package conforms to the pyOpenSci packaging guidelines. A few notable highlights to look at:
Both pylint and black are configured through GitHub actions for CI.
Note: Be sure to check this carefully, as JOSS's submission requirements and scope differ from pyOpenSci's in terms of what types of packages are accepted.
The package contains a `paper.md` matching JOSS's requirements with:
For the JOSS paper information, I am using the `joss_submission` tag from GitHub.
Estimated hours spent reviewing: 8 hours
The reviewer instructions appear to leave it up to the reviewer how to install the package. I cloned the repo:
git clone https://github.com/Ouranosinc/xclim.git
And then, to ensure I was looking at the same version as was submitted for review, I switched to the `v0.40.0` tag:
git checkout tags/v0.40.0
Following the instructions in the `xclim` documentation under Installation, I created a separate conda environment to install the required dependencies:
conda create -n my_xclim_env python=3.8 --file=environment.yml
conda activate my_xclim_env
pip install ".[dev]"
And there I hit my first issue:
CondaValueError: could not parse 'name: xclim' in: environment.yml
The fix (at least for conda 22.11.1) is that `--file` is an option to pass to `conda env create`, not `conda create`. This needs to be fixed in the install instructions.
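For reference, the corrected sequence would look like this (a sketch assuming recent conda versions; the environment name is arbitrary):

```shell
# `conda env create` (not `conda create`) is the subcommand that reads
# an environment.yml file:
conda env create -n my_xclim_env --file=environment.yml
conda activate my_xclim_env
pip install ".[dev]"
```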
The instructions refer to 'Extra dependencies' such as `flox`, `SBCK`, `eigen`, `eofs`, `pybind`. Since I used conda's `environment.yml`, the extras `eofs`, `eigen`, and `pybind11` were already included.
I confess I tend to get confused when there is the option of using either `environment.yml` or `requirements_*.txt` files, so I skipped the instructions following 'Extra Dependencies' in the documentation.
I assume there must be situations when I should and should not install these extra dependencies, but as a new user of the package, I don't know what those situations are yet.
Since these installation instructions are right near the top of the documentation, perhaps it would be better for the maintainers to make those choices for me? For example, I am now wondering, "should I be installing `flox`?" Since it is 'highly recommended', would it not make more sense to have it as part of the default instructions?
To evaluate the package, I installed JupyterLab and created a notebook to test the instructions given under the Basic Usage section of the documentation. The very first example,
# ds = xr.open_dataset("your_file.nc")
ds = open_dataset("ERA5/daily_surface_cancities_1990-1993.nc")
ds.tas
made me think on first reading that this ERA5 dataset was something I needed to download locally first (I did not distinguish between `xr.open_dataset` and `open_dataset` in my very first glance at the code).
After some review, I see now that there is a companion GitHub repo with testing data, and that the `xclim.testing` API automatically makes a locally cached copy of this file. I think it would be clearer if this very first example were written out as
# ds = xr.open_dataset("your_file.nc")
ds = xclim.testing.open_dataset("ERA5/daily_surface_cancities_1990-1993.nc")
ds.tas
so that it is clear that `open_dataset` is a utility method of the testing framework for xclim.
I liked how this initial documentation quickly oriented the user to the differences between 'indicators' and 'indices', which, especially for the new user, may be something that could be confusing.
In the example of Health checks and metadata attributes there is a typo:
gdd = xclim.atmos.growing_degree_days(tas=ds6h.tas, thresh="10.0 degC", freq="MS")
should be
gdd = xclim.atmos.growing_degree_days(tas=ds6h.air, thresh="10.0 degC", freq="MS")
The final basic usage examples on Graphics could be improved by adding some descriptions in the text of what those three visualizations are actually showing. This could be as brief as taking the comment from each code snippet and using it as a text description.
While in-code comments are generally fine, these last few examples on graphics feel tacked on, given the strong narrative text established at the beginning of the Basic Usage section of the documentation.
Minor spelling error in the docs: "Finally, `xarray` is tightly in(t)egrated with `dask`, a package that can automatically parallelize operations."
Under Subsetting and selecting data with xarray, the documentation reads:
Usually, xclim users are encouraged to use the subsetting utilities of the clisops package. Here, we will reduce the size of our data using the methods implemented in xarray
This is confusing because, as the first example workflow, the user has not yet been shown how to use the clisops package. Should there be a sub-subsection immediately before, such as Subsetting and selecting data with clisops, to demonstrate that recommended workflow?
Under Different ways of resampling, there is code setting a matplotlib style,
# import plotting stuff
import matplotlib.pyplot as plt
%matplotlib inline
plt.style.use("seaborn")
plt.rcParams["figure.figsize"] = (11, 5)
that leads to the warning
/tmp/ipykernel_7039/887583071.py:5: MatplotlibDeprecationWarning: The seaborn styles shipped by Matplotlib are deprecated since 3.6, as they no longer correspond to the styles shipped by seaborn. However, they will remain available as 'seaborn-v0_8-<style>'. Alternatively, directly use the seaborn API instead.
plt.style.use("seaborn")
I think the offending line should be changed to
plt.style.use("seaborn-v0_8")
(and elsewhere in the documentation where seaborn styles are used).
Since the example is supposed to contrast the differences between the two sampling methods, I think they should be on the same colorbar scale. There may be a more general solution for this with matplotlib, but explicitly setting the range is a quick fix:
hw_before.sel(time="2010-07-01").plot(vmin=0, vmax=7)
plt.title("Resample, then run length")
plt.figure()
hw_after.sel(time="2010-07-01").plot(vmin=0, vmax=7)
plt.title("Run length, then resample")
Under Spatially varying thresholds, the code comments showing the thresholds,
# The tasmin threshold is 15°C for the northern half of the domain and 20°C for the southern half.
# (notice that the lat coordinate is in decreasing order : from north to south)
thresh_tasmin = xr.DataArray(
[7] * 24 + [11] * 24, dims=("lat",), coords={"lat": ds5.lat}, attrs={"units": "°C"}
)
# The tasmax threshold is 16°C for the western half of the domain and 19°C for the eastern half.
thresh_tasmax = xr.DataArray(
[17] * 24 + [21] * 24, dims=("lon",), coords={"lon": ds5.lon}, attrs={"units": "°C"}
)
don't appear to match the values used in the code. I assume the code comments just need to be updated.
Only because the utility of the rest of the package is so high, I was a bit confused by the need to manually create a 2D array of criteria (values) and realizations (runs/simulations). While I think I understand why all dimensions/variables, other than the realization id, need to be flattened into a 2D array to apply an ensemble reduction technique, would it be possible to create a helper function that iterates through all non-realization dimensions, and all variables, to generalize the code snippet given below?
# Create 2d xr.DataArray containing criteria values
crit = None
for h in ds_crit.horizon:
for v in ds_crit.data_vars:
if crit is None:
crit = ds_crit[v].sel(horizon=h)
else:
crit = xr.concat((crit, ds_crit[v].sel(horizon=h)), dim="criteria")
crit.name = "criteria"
Is this "criteria" array effectively the equivalent of creating a feature matrix used in data science?
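Conceptually, yes: the loop builds an (n_criteria × n_realizations) matrix, much like a feature matrix in data science. A NumPy sketch of the same flattening, with hypothetical variable and horizon names (not xclim's API):

```python
import numpy as np

# Hypothetical ensemble statistics: 3 realizations, 2 time horizons,
# 2 climate variables (names are illustrative only).
n_real = 3
horizons = ["2050", "2100"]
variables = ["tg_mean", "prcptot"]
rng = np.random.default_rng(0)
data = {v: rng.random((len(horizons), n_real)) for v in variables}

# Flatten every (horizon, variable) pair into one "criteria" row,
# mirroring the nested loop in the notebook snippet above.
crit = np.stack([data[v][h] for h in range(len(horizons)) for v in variables])

# One row per criterion, one column per realization -- i.e. a feature matrix
# transposed relative to the usual (samples, features) convention.
assert crit.shape == (len(horizons) * len(variables), n_real)
```

A helper along these lines could take a dataset and a realization dimension name and do the stacking generically.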
Since, at the moment, "all frequency analysis functions are hard-coded to operate along the `time` dimension", perhaps a brief reference to `xr.DataArray.rename`, showing how to rename a dimension in xarray, would be useful?
I really liked how the worked example for `xclim.indices.stats.frequency_analysis` shows both how to use it and fully explains the steps that are bundled together in that convenience function.
This level of "developer documentation", well explained in the "user manual" section of the examples, clearly shows that xclim has been written in a way that is extensible. In my previous experience with other packages that are designed to be extended or have some sort of "plug-in" architecture, the documentation of how to actually use that extensibility is buried in the internal developer-focused documentation (if it exists at all). The authors of xclim should be commended for their clarity of presentation here, which will allow others to extend the indicator/index framework introduced by xclim.
The content is clear and well explained. Just a couple of very small typos that could be fixed:
> A more complex example could have bias distribution varying strongly across months. To perform the adjustment with different factors for each months, one can pass group='time.month'. Moreover, to reduce the risk of sharp change in the adjustment at the interface of the months, interp='linear' can be passed to adjust and the adjustment factors will be interpolated linearly. Ex: the factors for the 1st of May will be the average of those for April and those for May.
and in the following notebook on advanced SDBA:
> The previous notebook covered the most common utilities of xclim.sdba for conventional cases
Again, kudos for the depth and richness of examples presented here.
It's really neat to see this functionality made so accessible. Soon after completing this review, I definitely have to run this kind of computation on places that I have personal connections with. And with xclim, I think that will be straightforward to do!
Hello @Batalex and @jmunroe
I hope you both are well! We managed to hammer out as many of the smaller review comments as we could, and have marked the comments concerning the behaviour of `frequency_analysis` as something that necessitates some deliberation in the coming months.
This has all been merged to our main development branch in time for our latest release (version 0.42.0).
If you could please confirm that these changes meet your expectations, we would greatly appreciate it. Please let us know if there are any other modifications required.
Best,
Hello @Zeitsperre, I am impressed by Xclim's responsiveness!
In addition to Agustina and James' remarks, I'd like to add a little comment. We recommend separating unit tests from source code or, at least, excluding them from the final build.
As for the review, @aguspesce, @jmunroe, could you please make a final check as to whether you are satisfied with how your comments were dealt with? You can go through the changelog; the Xclim team was nice enough to highlight the changes brought by this submission.
Once you are done, you may check the "The author has responded to my review and made changes to my satisfaction. I recommend approving this package." item in your review.
> In addition to Agustina and James' remarks, I'd like to add a little comment. We recommend separating unit tests from source code or, at least, excluding them from the final build.
Addressing this comment would be a bit tricky; as a second layer of caution, we currently run tests on the sdist when building our binary packages on conda-forge. Previously, using `setuptools`, we could more easily exclude the `xclim/testing/tests` folder when building the wheel, but it seems like the new backend we are using (`flit`) doesn't easily allow that level of control.
I'll open an issue at https://github.com/pypa/flit to see if that feature is maybe just not documented (i.e. optional exclusion of files/folders depending on build distribution (sdist|bdist_wheel)). If this is blocking acceptance to pyOpenSci, I can try getting this done for the next release.
Thanks again!
Hi everyone! 👋 this has been a wonderful review to watch unfold!
A note about tests - for reference, here is our language around testing in our packaging guide.
For many scientific packages, it is important to have tests included in distribution archives. This is particularly important if you have complex wheel builds supporting multiple architectures, etc. In this case it sounds like it's important to the conda-forge build. While we suggest that unit tests live outside of the package directory, we do not require it.
We suggest tests live outside of the package directory to prevent users from running tests on the package file tree rather than in the installed version. And also it makes it easy to exclude tests from distributions that do not need to have tests included.
i wanted to add this note only because it is NOT a hard requirement for pyOpenSci acceptance. We do understand that in some cases tests may need to be a part of the distribution and moving them might be a significant burden to the maintainers.
it would be interesting to see if there is a bit more flexibility with hatchling or pdm-build as a backend. i'm not 100% sure as i haven't tried to exclude specific parts of the package dir with them. And, I don't want to cause anyone here more / unnecessary work! So if this request is a pain point for xclim, then it is ok to move forward without this specific change.
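For what it's worth, if hatchling were adopted, excluding the folder from wheels only looks like it would be a short config fragment (a sketch, not tested against xclim's layout; the path comes from the discussion above):

```toml
[tool.hatch.build.targets.wheel]
# Ship the tests in the sdist (so conda-forge can still run them),
# but keep them out of the wheel.
exclude = ["xclim/testing/tests"]
```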
Just for information gathering purposes, i am going to ask the hatchling dev about that specific feature as well. :)
in case this is helpful - i asked the hatchling dev about excluding a directory here feel free to chime in IF you have questions or need additional support in making this switch (if you decide to make it).
Do not worry, this is but a minor comment. This is definitely not blocking the submission, especially if it means reworking the build system just to remove 500 kB from the sdist.
@lwasser
Thanks for looking into things with the `hatch` backend. Ultimately, I really like the simplicity of `flit`, so I've gone with the approach of splitting the `tests` folder out of the package. After factoring the functions that were mixed into the tests themselves back into the package, the wheel is significantly smaller (~140 KB!).
Assuming these changes are accepted, this will be part of the next scheduled release (v0.43.0) in around a month.
Thanks again!
ok thank you for making that structure change @Zeitsperre !! i totally get it. a lot of folks love the simplicity of flit. it's good for us to know what each tool can and can't do so that is why i thought i'd dig just a bit to understand more :)
and yay for a smaller wheel! PyPI / warehouse maintainers will appreciate that! ✨
Many thanks, @Zeitsperre, I am once more impressed with the team 😍
I've reviewed the changes made by the authors.
Final approval (post-review)
Thanks James, how about you @aguspesce? Are you satisfied with the changes the team made?
Sorry @Batalex for the delay! I am happy with the changes and have added the tick to my review.
🎉 Xclim has been approved by pyOpenSci! Thank you @Zeitsperre for submitting Xclim and many thanks to @aguspesce and @jmunroe for reviewing this package! 😸
There are a few things left to do to wrap up this submission:
[![pyOpenSci](https://tinyurl.com/y22nb8up)](https://github.com/pyOpenSci/software-review/issues/73)
It looks like you would like to submit this package to JOSS. Here are the next steps:
🎉 Congratulations! You are now published with both JOSS and pyOpenSci! 🎉
All -- if you have any feedback for us about the review process please feel free to share it here. We are always looking to improve our process and documentation in the contributing-guide. We have also been updating our documentation to improve the process, so all feedback is appreciated!
On a more personal note, I was delighted to be part of this process and learned a lot as an editor. I wish I had more time to give a more in-depth review, but I could not be happier with the quality of Agustina and James' work.
I have one final task for you @aguspesce, @jmunroe. We would appreciate it if you could take the post-review survey to improve our process. I will send you the link by DM. Thank you all 🤗
This is excellent news! Thanks to all the reviewers and to @Zeitsperre for preparing and coordinating the xclim submission. Great job.
wow!! so awesome!! congrats @Zeitsperre, @tlogan2000, @aulemahal !! we also welcome any feedback from anyone here on this process!
@Batalex incredible work leading this review!! 🙌 and many thanks @aguspesce @jmunroe for the excellent reviews!
NOTE: i also posted about this on the pangeo discourse!
This is such fantastic news! Thanks so much to @lwasser, @NickleDave, @Batalex, @aguspesce, and @jmunroe for all of your effort and time guiding us to build a better package!
I was scheduled to deliver a brief internal presentation on news concerning xclim today, so I'll be sharing this news as well :)
We should be opening pull requests soon and adding all associated information and badges to xclim's README and documentation either today or tomorrow, and following up with JOSS by the end of the week. Again, thanks so much for all your work!
@all-contributors please add @Zeitsperre for code please add @tlogan2000 for code please add @aulemahal for code please add @aguspesce for review please add @jmunroe for review
@Zeitsperre, @tlogan2000, @aulemahal
We would love to have you on our Slack if you'd like!
Give me a shout, and we will do so
@Batalex Sure! Feel free to add me! (email(s) are in https://github.com/Ouranosinc/xclim/blob/master/AUTHORS.rst)
@all-contributors please add @aulemahal for code please add @aguspesce for review please add @jmunroe for review
Plz, be a good bot so I can stop spamming people
Submitting Author: Trevor James Smith (@Zeitsperre)
All current maintainers: (@Zeitsperre, @tlogan2000, @aulemahal)
Package Name: xclim
One-Line Description of Package: Climate indices computation package based on Xarray
Repository Link: https://github.com/Ouranosinc/xclim
Version submitted: v0.40.0
Editor: @Batalex
Reviewer 1: @jmunroe
Reviewer 2: @aguspesce
Archive:
JOSS DOI:
Version accepted: v0.42.0
Date accepted (month/day/year): 04/11/2023
Description
`xclim` is an operational Python library for climate services, providing numerous climate-related indicator tools with an extensible framework for constructing custom climate indicators, statistical downscaling and bias adjustment of climate model simulations, as well as climate model ensemble analysis tools. xclim is built using `xarray` and can seamlessly benefit from the parallelization handling provided by `dask`. Its objective is to make it as simple as possible for users to perform typical climate services data treatment workflows. Leveraging xarray and dask, users can easily bias-adjust climate simulations over large spatial domains or compute indices from large climate datasets.

Scope
- Who is the target audience, and what are scientific applications of this package?
`xclim` aims to position itself as a climate services tool for any researchers interested in using Climate and Forecast Conventions compliant datasets to perform climate analyses. This tool is optimized for working with Big Data in the climate science domain and can function as an independent library for one-off analyses in Jupyter notebooks or as a backend engine for performing climate data analyses over PyWPS (e.g. Finch). It was primarily developed targeting earth and environmental science audiences and researchers, originally for calculating climate indicators for the Canadian government web service ClimateData.ca.

The primary domains that xclim is built for are calculating climate indicators, performing statistical correction / bias adjustment of climate model output variables/simulations, and performing climate model simulation ensemble statistics.
- Are there other Python packages that accomplish the same thing? If so, how does yours differ?
icclim is another library for the computation of climate indices. Starting with version 5.0 of icclim, some of the core computations rely on xclim. See explanations about differences between xclim and icclim.
scikit-downscale is a library offering algorithms for statistical downscaling. `xclim` drew inspiration from it for its fit-predict architecture. The suites of downscaling algorithms offered differ.

Technical checks
For details about the pyOpenSci packaging requirements, see our packaging guide. Confirm each of the following by checking the box. This package:
Publication options
JOSS Checks
- [x] The package has an **obvious research application** according to JOSS's definition in their [submission requirements][JossSubmissionRequirements]. Be aware that completing the pyOpenSci review process **does not** guarantee acceptance to JOSS. Be sure to read their submission requirements (linked above) if you are interested in submitting to JOSS.
- [x] The package is not a "minor utility" as defined by JOSS's [submission requirements][JossSubmissionRequirements]: "Minor 'utility' packages, including 'thin' API clients, are not acceptable." pyOpenSci welcomes these packages under "Data Retrieval", but JOSS has slightly different criteria.
- [x] The package contains a `paper.md` matching [JOSS's requirements][JossPaperRequirements] with a high-level description in the package root or in `inst/`.
- [x] The package is deposited in a long-term repository with the DOI: https://doi.org/10.5281/zenodo.2795043

*Note: Do not submit your package separately to JOSS*

Are you OK with Reviewers Submitting Issues and/or pull requests to your Repo Directly?
This option will allow reviewers to open smaller issues that can then be linked to PRs rather than submitting a denser text-based review. It will also allow you to demonstrate addressing the issue via PR links.
Code of conduct
Please fill out our survey
P.S. Have feedback/comments about our review process? Leave a comment here
Editor and Review Templates
Editor and review templates can be found here