Open coxipi opened 8 months ago
How breaking of a change would it be to the existing SDBA tests? Would you be willing to update all associated tests?
As far as tests are concerned, I think only test_stack_variables
would need to be changed.
Also, sdba{_advanced}.ipynb
assume we only work with pr, tasmax
, in this case I would propose to simply drop tasmin
from the start and keep the same structure.
It's not really related to tests, so maybe I should avoid complicating the structure. I was just using this to test methods for R&D projects, so it's maybe not a real need for xclim_testdata.
I ended up downloading a few lon/lat points from datasets on the server. I would be happy to add tasmin
datasets in the test datasets if we judge we should have this, but I'm not sure if this is necessary.
When checking if adding tasmin
would pose problems with xclim, I realized that the datasets in xclim-testdata are intended for tests (duh), so my use case of trying methods for new bias corrections is not what we intend to cover with xclim-testdata (correct?). For instance, it would also have been useful for me to have access to radiation datasets, but I don't think we want this if we are to keep xclim_testdata minimal.
So, if you guys agree, feel free to close the issue, if not I can add the tasmin
datasets.
Indeed, the point of this repo is mainly for testing. But I use it too for development as it gives an easy access to well-formatted data.
As for adding variables to the SDBA test datasets, I think it would make sense once we have multivariate methods to test. Currently, tasmin
would be no use because in an univariate context you could just do tasmin = ds.tasmax.rename('tasmin')
. But in a mutlivariate context, it would be interesting to have "real-life" data with all the noisy correlation and other perks.
I see this repo as something that serves both the unit tests and the performance tests of xclim (e.g. xclim-benchmark
). If your changes to the testing data can lead to more rigorous testing in either or both of those aspects, I see no issues with you moving forward.
If you see the additional variables as only useful for benchmarks, you could always add a new folder here and add your modified test data into there. By default, xclim's testing code only downloads the files it needs to run its tests (set by a list of file locations in xclim/testing/helpers.py
).
Unless you're adding a significant amount of data (>3 MiB would be significant), my vote is to go forwards with your idea.
I think it would make sense once we have multivariate methods to test.
We do some multivariate tests in sdba.ipynb
, but for now tasmax / pr
is sufficient for our needs.
serves both the unit tests and the performance tests of xclim
This would be neither of those currently.
Unless you're adding a significant amount of data (>3 MiB would be significant)
Ok, yes I was thinking something the same size as the current sdba
datasets. I'll go forward with this when I have a down time.
It's been a couple times I would have liked to have datasets (at least two simulations) with
tasmax, tasmin
variables. Something like datasets in sdba (which havetasmax, pr
).Should I add new datasets with
tasmax/tasmin
, or update existing datasets in sdba with a 3rd variable?