eurec4a / how_to_eurec4a

Code examples to get you started with EUREC⁴A data.
https://howto.eurec4a.eu
MIT License

requirements cleanup #27

Closed MeraX closed 3 years ago

MeraX commented 3 years ago

From searching through the code, I found that only the following packages are first-order requirements:

Maybe one could add intake, but that is commented out. At the moment, I don't even see any code importing NumPy, so there is no need for that.

d70-t commented 3 years ago

I agree in general that the requirements should be as minimal as possible, and we should revisit the requirements files to check whether we can get rid of some more.

One thing that makes requirements a bit more complicated in this particular case is that the number of libraries required to handle an intake catalog is small in principle (e.g. eurec4a pulls in the general intake package to be able to return a catalog). However, to actually access a catalog that is served via HTTP, intake additionally requires aiohttp and requests. Then, if you want to use datasets that are served via OPeNDAP, you'll need either pydap or netCDF, and that decision is made by the catalog authors. If you use datasets stored as zarr, you'll need the zarr library as well. If the zarr store is served via an S3-compatible API, you'll additionally need s3fs, or, if it comes via IPFS, ipfsspec.

None of those additional dependencies is imported directly by user code, and I'd argue that it is a good choice by intake not to require every library that might be needed. However, as a result of this choice, the requirements actually depend on what's inside the catalog and on which catalog items you use.
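To make these hidden dependencies visible, one could check up front which of the optional packages can be found in the current environment. The following is only a minimal sketch: the mapping is hypothetical and just restates the packages named above, and the import names are assumptions (for example, the netCDF library is assumed to be importable as `netCDF4`).

```python
from importlib.util import find_spec

# Hypothetical mapping from use case to importable module names.
# It only restates the packages mentioned in this comment; the
# import names (e.g. "netCDF4") are assumptions.
OPTIONAL_BACKENDS = {
    "catalog served via HTTP": ["aiohttp", "requests"],
    "datasets served via OPeNDAP": ["pydap", "netCDF4"],
    "datasets stored as zarr": ["zarr"],
    "zarr served via S3-compatible API": ["s3fs"],
    "zarr served via IPFS": ["ipfsspec"],
}


def missing_backends(mapping=OPTIONAL_BACKENDS):
    """Return {use case: [modules that cannot be found]}.

    Use cases for which every listed module is found are left out.
    """
    report = {}
    for use_case, modules in mapping.items():
        missing = [name for name in modules if find_spec(name) is None]
        if missing:
            report[use_case] = missing
    return report


if __name__ == "__main__":
    for use_case, modules in missing_backends().items():
        print(f"{use_case}: missing {', '.join(modules)}")
```

Note that for OPeNDAP only one of the two listed libraries is needed, so a single entry in that report line is not necessarily a problem. Which one is actually required depends on the catalog entry.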

MeraX commented 3 years ago

That is a good point. I think it would then make sense to mention these hidden dependencies in the general introduction and to note again, at the top of each example, the exact packages it needs.

Perhaps you/we should postpone this issue until the content of the notebooks stabilizes. First finish the content and then keep track of the dependencies while testing each notebook individually on a vanilla system trying to find minimum requirements for each.

d70-t commented 3 years ago

Yes, there'll be some fluctuations for a while. At the moment, I am trying to finalize #25 and then plan to move from tmieslinger/how_to_eurec4a to eurec4a/how_to_eurec4a in order to provide a more stable location and to open up the reviewing process to a larger community.

MeraX commented 3 years ago

intake.gui at the end of show_intake_catalog tells me that it requires panel>=0.8.0
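A notebook could test for this requirement before touching intake.gui. Below is a minimal sketch using only the standard library. The helper names are hypothetical, and the version comparison is deliberately naive: it only looks at the leading numeric release part, so it is not a full replacement for a proper version parser.

```python
import re
from importlib.metadata import PackageNotFoundError, version


def release_tuple(version_string, width=3):
    """Turn the leading numeric part of a version into a padded int tuple.

    Suffixes such as "rc1" are ignored, which is a simplification.
    """
    match = re.match(r"\d+(?:\.\d+)*", version_string)
    if match is None:
        return (0,) * width
    parts = [int(part) for part in match.group().split(".")]
    parts += [0] * (width - len(parts))  # pad short versions like "0.10"
    return tuple(parts)


def has_minimum(package, minimum):
    """Return True if `package` is installed with at least version `minimum`."""
    try:
        installed = version(package)
    except PackageNotFoundError:
        return False
    return release_tuple(installed) >= release_tuple(minimum)


if __name__ == "__main__":
    if has_minimum("panel", "0.8.0"):
        print("panel is recent enough for intake.gui")
    else:
        print("intake.gui needs panel>=0.8.0")
```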

d70-t commented 3 years ago

Good catch. I was wondering whether it is a good idea to include it in the book. The GUI is kind of useless in a static book, as it requires interactive communication with a running Python kernel in the background. However, it might be good to have the requirements installed and to write more about this limitation, such that one can just run it within a Binder environment. @MeraX what do you think about this?

MeraX commented 3 years ago

I tried the GUI. As far as I could see, it is not much more useful than the tree function (at the moment). If it stays like this, I would mention this requirement directly at the example and keep the example as an advertisement of advanced Intake magic.

However, the GUI has this interesting 📊 plot button. This would require some additional metadata in intake to define some default plots. If we supported this feature, then the GUI might be a good thing. Having the metadata could perhaps also enable some .plot() magic on Intake catalog items. In addition to adding the metadata, the plotting features of intake would require HvPlot, which is an interesting plotting alternative in itself.
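As a rough illustration of what such metadata might look like: as far as I understand the intake documentation, predefined plots live under a `plots` key in a catalog entry's `metadata`, with one named entry per plot holding HvPlot keyword arguments. The sketch below only builds that nested structure as a plain Python dict. The helper, the plot name, and the variable names are hypothetical placeholders, not entries from the eurec4a-intake catalog.

```python
import json


def default_plot_entry(plot_name, kind, x, y):
    """Build the nested metadata dict for one predefined plot.

    The layout (metadata -> plots -> <plot name> -> plot arguments) is
    an assumption based on intake's predefined-plots convention.
    """
    return {"plots": {plot_name: {"kind": kind, "x": x, "y": y}}}


# Hypothetical radiometer quicklook; "time" and "brightness_temperature"
# are placeholder variable names.
metadata = default_plot_entry(
    "quicklook", "line", x="time", y="brightness_temperature"
)

if __name__ == "__main__":
    # Print the structure as it would be nested inside a catalog entry.
    print(json.dumps({"metadata": metadata}, indent=2))
```

In a catalog file, the same structure would be written as YAML below the corresponding source.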

d70-t commented 3 years ago

:+1: thus I'll just add the >=0.8.0 to the text before the commented-out intake.gui.

I also really like the idea of automatic plots, and we should really look into this more. I just haven't spent enough time on HvPlot yet to fully figure this out. If you already have some ideas, it would be great if you could introduce some first plots into eurec4a-intake.

MeraX commented 3 years ago

This is what this GUI feature could look like. If you define a custom plot, you can get very simple quick plots. However, I didn't find an option to select a time slice, so you will always load the full time series, which can be slow depending on the dataset. Actually, I was expecting all radiometer channels in different colors in my example plot, but due to testing different things and data getting stuck in caches, we can only see one channel.

[Screenshot from 2021-02-24 17-50-54]

In principle, I like our slightly customized examples in this book more. Intake plots are probably a feature that we might explore in more detail later.