Closed santisoler closed 5 years ago
@leouieda would you mind checking this code?
I have one minor doubt: Should import the DATASETS
list in the __init__.py
in order to make it accesible to users?
So, if one want to load every dataset could do something like:
from rockhound import fetch_bedmap2, BEDMAP2_DATASETS
grid = fetch_bedmap2(BEDMAP2_DATASETS)
Note: loading every dataset could eat a lot of memory!
@santisoler looks great! Just a few suggestions for making the code a bit cleaner.
I don't think we should expose the
DATASETS
variable. That's a very narrow use case. What we could do is allowdatasets="all"
to be a valid input.
I've been messing around with the datasets="all"
argument with the idea of making it the default value, but due to the high memory requirement I dropped it. But leaving the datasets="all"
as an option would be good. I'll add it.
But we can't load some of the grids into the same
Dataset
because they have different coordinates and shapes, right?
Thanks to xarray
we could get grids with different shapes on the same Dataset
with the xarray.merge()
function. The coordinates are spanned to include the coordinates of every DataArray
and the values are filled with nans if no data is available.
Don't know if we should only merge DataArray
s with the same shape or leave the code as it is now.
@leouieda I'll apply the requested changes tomorrow. Thanks for the review!
Don't know if we should only merge
DataArray
s with the same shape or leave the code as it is now.
That's a tough one. Maybe we can leave as is but include a warning the docstring that you'll get different things if you ask for only the uncertainty grids or for all of them.
due to the high memory requirement I dropped it
This should definitely be a warning in the docstring. We don't want people crashing their computers. And it should definitely not be the default :smiley:
@leouieda I think the code and its docstring look better now.
I've noticed that the only the dataset thickness_uncertainty_5km
is defined on a different set of points than the rest, so we don't have to bother with the lakemask_vostok
one.
That's why the warning only reference the former as source of problems.
Do you like the hard-coded metadata as it is?
:+1: I really like the changes and the docstring! Good job!
What do you think about removing the all
option? It feels like something that will be abused and cause system crashes often. Plus, we can't test it on the CIs.
@leouieda Thanks for the reviews.
I think I've added and removed the all
option a hundred times, what leads me to think we are not entirely sure if it worth the risk of crashing local systems and/or Travis (is it possible?).
We definitely should remove it but leave the option to pass the datasets are strings (instead of putting a single dataset inside a list). In fact I don't imagine a case in which someone would need the entire set.
I'll remove it from the code, the docstring and the test functions. Then if no CI fails we could merge.
:+1: thanks, Santi!
We still need a gallery plot for this. Given the size of the grids, probably only plot a subsection, like we did for the ETOPO1 grid.
Thanks for the reviews @leouieda! I can get into the gallery plot this week. I'll open a PR when ready.
Bedmap2 is a suite of gridded products describing surface elevation, ice-thickness, the sea floor and subglacial bed elevation of the Antarctic south of 60°S. The Bedmap2 datasets are downloaded from the British Antarctic Survey and loaded as
xarray.Dataset
objects.Reminders
make format
andmake check
to make sure the code follows the style guide.doc/api/index.rst
.