cormorack / yodapy

Your Ocean Data Access in Python
MIT License

auto install latest yodapy in JHub notebook #120

Open robfatland opened 4 years ago

robfatland commented 4 years ago

See the question I placed on "Implications" on the yodapy main README after the "how to install" sections: Is it worth putting an install-latest command in a .cshrc of a Jupyter notebook pod?

lsetiawan commented 4 years ago

I don't quite get what you are trying to do here. Do you want to install yodapy in a JupyterHub user space and make sure it stays around when that user space is shut down?

robfatland commented 4 years ago

Yeah; which it won't, unless you re-install it the next time you start up :)

lsetiawan commented 4 years ago

If you have access to the underlying environment yaml for the JHub, you can always install it there if you need this package all the time.

robfatland commented 4 years ago

Sure; I have in mind the scientist who is a "customer" of the JHub: someone who does not necessarily want to get caught up in the mechanics of the service, but who does want access to the latest and greatest yodapy without having to "remember to install" every time they log in. This could also be an issue for the pangeo gang; i.e. don't feel obliged to solve it :)
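
A minimal sketch of what such an "install latest on startup" hook could look like in Python. It assumes the user's home directory persists between sessions and that the file is placed somewhere that runs at startup (for example IPython's profile startup directory). The GitHub URL and the helper names build_install_command and install_latest are assumptions for illustration, not part of yodapy.

```python
import subprocess
import sys

# Assumed install target: the default branch of the GitHub repository.
YODAPY_SOURCE = "git+https://github.com/cormorack/yodapy.git"


def build_install_command(source=YODAPY_SOURCE):
    """Return the pip command that upgrades the package in the
    environment of the interpreter that is currently running."""
    return [sys.executable, "-m", "pip", "install", "--quiet", "--upgrade", source]


def install_latest(source=YODAPY_SOURCE):
    """Run the install. Return True on success and False otherwise,
    so that a failed install never aborts the session startup."""
    try:
        result = subprocess.run(build_install_command(source), check=False)
    except OSError:
        return False
    return result.returncode == 0
```

Calling install_latest() at the end of the startup file would trigger the upgrade on each new session. The cost is a slower start and a network dependency, which is the trade-off against baking the package into the hub's environment file.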

robfatland commented 4 years ago

I will do a PR on the yodapy documentation to follow up on this :) We have the essential framework for how to install and use yodapy so the narrative has to do a bit more. For example we need to establish what the credential file is, where it lives, why it is not in a repo, and how to tell if it is created properly. In my case I have installed yodapy with no errors on my Windows PC.

The first line of Python code to run is:

from yodapy.utils.creds import set_credentials_file

This gives a huge error message on the first try, like the one given below. There is no error on the second try. Next I run:

set_credentials_file(data_source='ooi', username='OOIAPI-XXXXXXXXXXXXXX', token='XXXXXXXXXXX')

This completes with no error but there is no evidence of a .yodapy directory that would hold the credentials. Perhaps this is a write permission problem on Windows.
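
One way to look for that directory from Python is sketched below. It assumes the .yodapy directory would sit directly under the user's home directory, which the thread does not confirm; list_credential_files is a hypothetical helper name.

```python
from pathlib import Path


def list_credential_files(home=None, dirname=".yodapy"):
    """Return the sorted file names found in <home>/<dirname>,
    or an empty list if that directory does not exist."""
    base = Path(home) if home is not None else Path.home()
    cred_dir = base / dirname
    if not cred_dir.is_dir():
        return []
    return sorted(p.name for p in cred_dir.iterdir() if p.is_file())


if __name__ == "__main__":
    print("Home directory:", Path.home())
    print("Files in .yodapy:", list_credential_files())
```

An empty list means either that nothing was written or that the credentials live somewhere else, so this only narrows the question down.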

Finally we are to run

from yodapy.datasources import OOI

This produces an error:

ImportError                               Traceback (most recent call last)
<ipython-input-1-f64d6c3c5dd1> in <module>
----> 1 import yodapy

~\Anaconda3\lib\site-packages\yodapy\ in <module>
      6 )
----> 8 from yodapy import datasources
     10 from ._version import get_versions

~\Anaconda3\lib\site-packages\yodapy\datasources\ in <module>
      8 )
---> 10 from yodapy.datasources.ooi.CAVA import CAVA
     11 from yodapy.datasources.ooi.OOI import OOI  # noqa

~\Anaconda3\lib\site-packages\yodapy\datasources\ooi\ in <module>
----> 1 from yodapy.datasources.ooi.OOI import OOI

~\Anaconda3\lib\site-packages\yodapy\datasources\ooi\ in <module>
     25 from yodapy.datasources.ooi.CAVA import CAVA
---> 26 from yodapy.datasources.ooi.helpers import set_thread
     27 from yodapy.utils.conn import (
     28     download_url,

~\Anaconda3\lib\site-packages\yodapy\datasources\ooi\ in <module>
     18 import xarray as xr
---> 20 from yodapy.utils.conn import requests_retry_session
     21 from yodapy.utils.parser import get_nc_urls

~\Anaconda3\lib\site-packages\yodapy\utils\ in <module>
     33     from echopype.model import EchoData
     34 else:
---> 35     from echopype.model import EchoDataEK60 as EchoData
     37 logger = logging.getLogger(__name__)

ImportError: cannot import name 'EchoDataEK60' from 'echopype.model' (C:\Users\kilro\Anaconda3\lib\site-packages\echopype\model\ 

lsetiawan commented 4 years ago

Did you install from the master build?

lsetiawan commented 4 years ago

From your command above it appears that you didn't. Echopype was updated recently, and I recently updated this dependency but haven't made a release.
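
A quick way to see which versions are actually installed, as a sketch using only the standard library (Python 3.8 or later). The helper names installed_version and report are hypothetical.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package,
    or None if the package is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def report(packages=("yodapy", "echopype")):
    """Map each package name to its installed version (or None)."""
    return {name: installed_version(name) for name in packages}


if __name__ == "__main__":
    for name, version in report().items():
        print(f"{name}: {version or 'not installed'}")
```

Comparing the reported versions against the latest release and the master branch shows whether a mismatch like the one in the traceback above is likely.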

robfatland commented 4 years ago


Thanks Don :) This "almost" works now, with one Deprecation Warning along the way and a "save to NetCDF" problem at the very end. As a side note: I do not see how to confirm that credentials are stored outside the repo directory. The tail-end issue is that writing the xarray Dataset to a local NetCDF file conks out. See the note at the bottom for more on this.


Direct install: Success.

First import and set_credentials_file(): No errors! Importing OOI also gives no error. Got one instrument stream. So far so good.

ooi.view_instruments() got a Deprecation Warning.

ooi.data_availability() defaulted to 8 threads, produced a small table for output.

ooi.request_data() seems to work; after a short interval (1 minute) the .check_status() returns completed; plus a nice URL where I find an 8MB NetCDF result file and some ancillary content.

ds=ooi.to_xarray() completes with 86398 data points, two less than a complete day but who's counting?

Note: There is one issue with this data. ds is a list with one element, so I take that element:

fred = ds[0]

The type of fred is xarray.core.dataset.Dataset. When I then try to write it out with fred.to_netcdf(), I get a serialization warning and a long error message:

C:\Users\kilro\Anaconda3\lib\site-packages\xarray\ SerializationWarning: variable preferred_timestamp has data in the form of a dask array with dtype=object, which means it is being loaded into memory to determine a data type that can be safely stored on disk. To avoid this, coerce this variable to a fixed-size dtype with astype() before saving it.

AttributeError                            Traceback (most recent call last)
<ipython-input-43-7cbc3e553457> in <module>
----> 1 fred.to_netcdf('')

~\Anaconda3\lib\site-packages\xarray\core\ in to_netcdf(self, path, mode, format, group, engine, encoding, unlimited_dims, compute, invalid_netcdf)
   1534             unlimited_dims=unlimited_dims,
   1535             compute=compute,
-> 1536             invalid_netcdf=invalid_netcdf,
   1537         )

~\Anaconda3\lib\site-packages\xarray\backends\ in to_netcdf(dataset, path_or_file, mode, format, group, engine, encoding, unlimited_dims, compute, multifile, invalid_netcdf)
   1069         # to be parallelized with dask
   1070         dump_to_store(
-> 1071             dataset, store, writer, encoding=encoding, unlimited_dims=unlimited_dims
   1072         )
   1073         if autoclose:

~\Anaconda3\lib\site-packages\xarray\backends\ in dump_to_store(dataset, store, writer, encoder, encoding, unlimited_dims)
   1115         variables, attrs = encoder(variables, attrs)
-> 1117     store.store(variables, attrs, check_encoding, writer, unlimited_dims=unlimited_dims)

~\Anaconda3\lib\site-packages\xarray\backends\ in store(self, variables, attributes, check_encoding_set, writer, unlimited_dims)
    293         variables, attributes = self.encode(variables, attributes)
--> 295         self.set_attributes(attributes)
    296         self.set_dimensions(variables, unlimited_dims=unlimited_dims)
    297         self.set_variables(

~\Anaconda3\lib\site-packages\xarray\backends\ in set_attributes(self, attributes)
    310         """
    311         for k, v in attributes.items():
--> 312             self.set_attribute(k, v)
    314     def set_variables(self, variables, check_encoding_set, writer, unlimited_dims=None):

~\Anaconda3\lib\site-packages\xarray\backends\ in set_attribute(self, key, value)
    427             self.ds.setncattr_string(key, value)
    428         else:
--> 429             self.ds.setncattr(key, value)
    431     def encode_variable(self, variable):

netCDF4\_netCDF4.pyx in netCDF4._netCDF4.Dataset.setncattr()

netCDF4\_netCDF4.pyx in netCDF4._netCDF4._set_att()

netCDF4\_netCDF4.pyx in netCDF4._netCDF4._ensure_nc_success()

AttributeError: NetCDF: String match to name in use
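
The SerializationWarning above suggests coercing object-dtype variables to a fixed-size dtype with astype() before saving. A minimal sketch of that step follows. The helper name coerce_object_vars is hypothetical, it only touches data variables (not coordinates), and nothing in this thread establishes that it also resolves the final AttributeError.

```python
def coerce_object_vars(ds, dtype=str):
    """Cast every object-dtype data variable of ds to a fixed-size dtype.

    ds is expected to behave like an xarray.Dataset: it exposes
    data_vars, supports ds[name] lookup and assignment, and its
    variables have .dtype and .astype(). The dataset is modified in
    place and the names of the converted variables are returned.
    """
    converted = []
    # Copy the names first so the dataset is not mutated while iterating.
    for name in list(ds.data_vars):
        if ds[name].dtype == object:
            ds[name] = ds[name].astype(dtype)
            converted.append(name)
    return converted
```

With the dataset from this thread, that would mean calling coerce_object_vars(fred) before fred.to_netcdf(). Casting to str is only one choice; a variable such as preferred_timestamp may be better served by a different fixed-size dtype, depending on what it holds.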