holoviz-topics / EarthML

Tools for working with machine learning in earth science
https://earthml.holoviz.org
BSD 3-Clause "New" or "Revised" License
94 stars 21 forks source link

Added intake to Walker Lake example. #11

Closed mmccarty closed 6 years ago

mmccarty commented 6 years ago

This PR adds Intake to the Walker Lake example. This should not be merged until the PR to add caching to the intake-xarray plugin is merged.

jlstevens commented 6 years ago

Looks good to me!

mmccarty commented 6 years ago

Thanks @jlstevens The required changes to intake-xarray have be merged. We can merge this one now as well.

jlstevens commented 6 years ago

@mmccarty I've checked out the branch to take a closer look..

It generally looks good although I would recommend making the following three changes before merging:

  1. The line l5.cache[0].get_metadata(urlpath) prints out quite a bit of info that I don't think is needed elsewhere in the notebook and doesn't help readability. Is there something in particular you wish to draw the reader's attention to here? If not, I would either comment out this line or remove it entirely.

  2. It would be good to add some instructions on how to install the necessary version of intake (including the rasterio plugin). Right now we are using the standard pyviz environment to specify our dependencies and I suspect we will want to specify intake and useful plugins there. What do you think @jbednar ? Otherwise we can start declaring these dependencies in a way that is specific to this project.

  3. Could you clear the notebook of data and metadata? This helps when viewing notebook diffs on github and helps keep the repo size down.

That's all I can think of for now!

jbednar commented 6 years ago
  1. I agree; by default there should either be no output from the file-reading commands or at most a single line that indicates either that the file is being downloaded from a URI or loaded from a cached path. A comment or Markdown note can explain how to get more info.
  2. Yes, we need to add intake and intake-xarray to the pyviz envt. We'll need a new pyviz envt soon, but the earliest it could arrive is probably a week from now, and realistically two weeks. In the meantime, there needs to be a new Installation page in getting started, and that can just list the packages we need (conda install -c pyviz pyviz intake intake-xarray?).
  3. Agreed.
mmccarty commented 6 years ago

@jlstevens I can remove 1. 2. I believe Jim answered. 3. No problem. @jbednar With the default log level in intake, you won't get any output. To get output, users can change the log level. I'll add that to the notebook.

The instructions to get the notebook dependencies would be conda install -c intake intake intake-xarray pip install rasterio

Where should these go?

jbednar commented 6 years ago

This project should mirror EarthSim's website, so it would be in a new notebook examples/getting-started/index.ipynb. (EarthSim still has index.rst instead of a notebook, but now that index.ipynb is supported we need to update that as I want all docs to be notebooks so that users can explore them in their local copy (and to avoid us having to context-switch between Markdown and .rst).

mmccarty commented 6 years ago

@jbednar @jlstevens PR updated.

jlstevens commented 6 years ago

All looks fine to me.

mmccarty commented 6 years ago

@jbednar @jlstevens Anything else?

jlstevens commented 6 years ago

You could also strip the metadata from index.ipynb :-)

mmccarty commented 6 years ago

@jlstevens done

ebo commented 6 years ago

Hmmm... just broke away a few minutes to play with the intake version updates and got several warnings along the lines of:

/home/jldavid3/anaconda3_2/lib/python3.6/importlib/_bootstrap.py:219: RuntimeWarning: numpy.dtype size changed, may indicate binary incompatibility. Expected 96, got 88 return f(*args, **kwds)

and errors:intekes

TypeError: init() got an unexpected keyword argument 'concat_dim'

One of my packages are likely out of sync with what is expected, but I never would have suspected that numpy.dtype would have changed...

mmccarty commented 6 years ago

@ebo intake-xarray needs to be updated. I'm going to tag and build a new version so you don't have to install with dev.

ebo commented 6 years ago

On Aug 20 2018 11:53 AM, Mike McCarty wrote:

@ebo intake-xarray needs to be update. I'm going to tag and build a new version so you don't have to install with dev.

Ahhh... that makes sense. Sorry I missed the dev instructions. That said it was probably a decent reminder that it needed to be updated ;-)
I hope to get caught up enough with the current insanity so I can get back to this.

jlstevens commented 6 years ago

@mmccarty Given the slow download (at least the first time you run the notebook), it might be worth enabling logging by default until intake offers a download progress bar.

mmccarty commented 6 years ago

@jlstevens done.

mmccarty commented 6 years ago

@ebo you will want an updated conda environment.

conda install -c pyviz pyviz
conda install -c intake intake intake-xarray
conda install -c conda-forge rasterio s3fs zarr
mmccarty commented 6 years ago

Also, make sure you're on xarray 0.10.8.

jbednar commented 6 years ago

What happens if you omit "-c conda-forge"?

ebo commented 6 years ago

On Aug 20 2018 1:36 PM, James A. Bednar wrote:

What happens if you omit "-c conda-forge"?

are you talking to me or Mike?

mmccarty commented 6 years ago

@jbednar Omitted "-c conda-forge" where I could (only zarr requires it). I've asked the distribution team to add zarr to the default channel.

@ebo Jim was talking to me. :-)

jbednar commented 6 years ago

Right; that was for Mike. Thanks, Mike!

jlstevens commented 6 years ago

I've tested this notebook and it works with the current set of instructions. Once zarr is on defaults, we can update the instructions accordingly.

As everything looks good to me now, I'll go ahead and merge.

ebo commented 6 years ago

On Aug 20 2018 2:17 PM, Mike McCarty wrote:

@jbednar Omitted "-c conda-forge" where I could (only zarr requires it). I've asked the distribution team to add zarr to the default channel.

@ebo Jim was talking to me. :-)

ahhh... Just checking.