Closed alaws-USGS closed 1 year ago
@gzt5142 these are great ideas, and I think using the `%run` magic would be the more effective measure. The idea of a setup notebook was an earlier idea we had for getting users into the tutorials. However, I'm questioning whether we still need a setup notebook, given the other notebooks by @rsignell-usgs, the ARC HPC staff, or even Project Pythia.
I think it would be helpful if we sorted out a single system that we all can get behind. It might even rise to ADR level.
It seems that we do want to consolidate some functions (which I have been calling 'helpers') into a single source that all other notebooks can use. That has advantages for code re-use and maintenance, and it also helps notebook readability.
The main helpers that I've been using are `cluster_config()` and `stop_running_cluster()` -- both on the esip/qhub node.
In terms of the usage pattern, options are:
- Import the helpers from a module. Either the user sets `sys.path` in the notebook, or we set it for them by manipulating their Python environment (e.g. edits to `.bashrc`). Usage would then look like `from HyTEST.helpers import configure_cluster`. Note that `sys.path` manipulation, if not done correctly, can throw cryptic exceptions.
- Use `%run` or `%load` as described above to execute the code block rather than an import.
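As a rough illustration of the module option, the sketch below writes a throwaway helper module to a temporary directory, shows the `ModuleNotFoundError` raised while that directory is missing from `sys.path`, then adds the directory and imports the helper. The module name `hytest_helpers_demo`, the function body, and the returned dict are hypothetical stand-ins, not the project's actual helpers.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical stand-in for a shared helpers module (not the real HyTEST code).
HELPER_SOURCE = '''
def configure_cluster(n_workers=2):
    """Pretend to configure a cluster; just echo the settings back."""
    return {"n_workers": n_workers}
'''

helper_dir = Path(tempfile.mkdtemp())
(helper_dir / "hytest_helpers_demo.py").write_text(HELPER_SOURCE)

# While the directory is not on sys.path, the import fails.
try:
    importlib.import_module("hytest_helpers_demo")
    import_failed = False
except ModuleNotFoundError:
    import_failed = True

# Put the directory on sys.path so the import resolves.
sys.path.insert(0, str(helper_dir))
importlib.invalidate_caches()
from hytest_helpers_demo import configure_cluster

settings = configure_cluster(n_workers=4)
print(import_failed, settings)
```

The failure in the first step is the kind of error a user would hit if the path setup were skipped or pointed at the wrong directory, which is the main cost of this option.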
Thoughts? I think we need to pick a path and use it across the board.
Gang, I think these challenges must be widespread throughout the community (using different Dask clusters, working on-prem and in the cloud). @gzt5142, would you be willing to ask about this on the Pangeo Discourse? Along with your ideas of how to resolve it, of course!
I can ask, sure... In my mind, this is actually two questions: 1) How do we automate cluster or other environment configuration programmatically? 2) How do we want to include such code for all notebooks and users (module vs. magic)?
I am still in the throes of debugging and configuring my new hardware and credentials... will look at the pangeo forum in the morning, likely.
Seems like we sorted this out with the `%run` magic and calling the `Start_Dask_Cluster_*.ipynb` notebooks.
Closing.
I think I know what you mean by 'setup notebook' -- a common document with infrastructure and other environment-setting code which is useful to execute early on in other notebooks. If that's not what you mean, then I need to come up to speed on the intent.
If that is the intent... here are some options to consider:
- `%run setup.ipynb` -- run executes the contents of the named notebook as if it were copy-pasted into the current notebook. That's the way I had been handling setup for the books I have been working on. This has an interesting advantage in that it can re-use previous tutorials. That is, a document that needs to use D-Score (for example) could conceivably just `%run` the notebook that describes what D-Score is and how it's built.
- `%load setup.ipynb` -- load fetches the named file and puts it in the current cell of the notebook. This option is a little 'uglier' (my opinion), but it has some real advantages... `%load` can load from a URL; it isn't limited to the file system. This means that a change to the loaded file will automatically be picked up in future notebooks.
- Import from a module, with `sys.path` set to point to a localized copy of the module code.

What I had been doing for set-up was a combination of `%run` and a localized module namespace. Certainly not tied to that... but would welcome a discussion to find a suitable strategy to use across the whole project.
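For anyone weighing `%run` against an import, the behaviour can be approximated in plain Python with the standard library's `runpy.run_path`, which executes a file and returns the names it defined. That is roughly what `%run` does for a `.py` script before loading those names into the interactive namespace. The file name `setup_demo.py` and its contents are made up for this sketch; inside a notebook the equivalent would simply be a cell containing `%run setup.ipynb`.

```python
import runpy
import tempfile
from pathlib import Path

# Hypothetical setup script standing in for a shared setup notebook.
SETUP_SOURCE = '''
PROJECT = "demo"

def stop_running_cluster():
    """Placeholder: a real helper would shut down the cluster."""
    return "stopped"
'''

setup_path = Path(tempfile.mkdtemp()) / "setup_demo.py"
setup_path.write_text(SETUP_SOURCE)

# Execute the file and collect everything it defined.
namespace = runpy.run_path(str(setup_path))

# The caller can now use the names the setup file created.
project = namespace["PROJECT"]
result = namespace["stop_running_cluster"]()
print(project, result)
```

Unlike an import, nothing here depends on `sys.path`: the file is located by its path alone, which is part of why the `%run` route is less fragile for users.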