Closed royzawadzki closed 6 years ago
Heyo! I am running home from doctor but will be able to help you when I am back... so glad you are jumping in to using the tool and stay tuned! 🐶
@vsoch No worries, someone's gotta ask all the dumb questions, right?
These are not dumb questions at all! I'm back at my computer and testing this out now. First, a few comments:
Let's just make sure we are working from the same thing. Make sure your forward repository is up to date with the latest on GitHub, and that you have run setup.sh so that there is a CONTAINERSHARE and a RESOURCE variable in your params.sh:
USERNAME="vsochat"
PORT="43453"
PARTITION="russpold"
RESOURCE="sherlock"
MEM="20G"
TIME="8:00:00"
CONTAINERSHARE="/scratch/users/vsochat/share"
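If you want a quick sanity check on that file, here is a minimal sketch. The helper name check_params is made up for this example and is not part of forward; the variable list is just the one shown in the params.sh above.

```shell
# Hypothetical helper (not part of forward): report which of the variables
# shown above are missing from a params file.
# The body runs in a subshell, so sourcing the file cannot leak into your shell.
check_params() (
    params_file="${1:-./params.sh}"
    # "." looks up bare names on PATH, so force a path with a slash in it.
    case "$params_file" in
        */*) ;;
        *) params_file="./$params_file" ;;
    esac
    if [ ! -f "$params_file" ]; then
        echo "missing file: $params_file"
        exit 1
    fi
    # Clear inherited values first so only the file's definitions count.
    unset USERNAME PORT PARTITION RESOURCE MEM TIME CONTAINERSHARE
    . "$params_file"
    missing=0
    for name in USERNAME PORT PARTITION RESOURCE MEM TIME CONTAINERSHARE; do
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "not set: $name"
            missing=1
        fi
    done
    exit "$missing"
)
```

Running check_params params.sh from inside the forward folder prints one line per missing variable and returns non-zero if anything is missing.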
And you also should have run the hosts/ssh_sherlock.sh script so that you have your ssh configuration in ~/.ssh/config like:
Host sherlock
    User vsochat
    Hostname sh-ln06.stanford.edu
    GSSAPIDelegateCredentials yes
    GSSAPIAuthentication yes
    ControlMaster auto
    ControlPersist yes
    ControlPath ~/.ssh/%l%r@%h:%p
And this would mean that if you type ssh sherlock you can issue a command after it! E.g.:
ssh sherlock squeue -u vsochat
The first time you do that in a terminal, you will have to authenticate. The times after that you won't :)
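The reason you only authenticate once is the ControlMaster / ControlPersist pair in the config above: the first connection stays open in the background and later ones reuse it. Here is a small sketch for checking whether that shared connection is currently open, using ssh's -O check option. The helper name connection_status and the SSH_BIN override are made up for this example (the override is only there so the function can be tried without a cluster).

```shell
# Hypothetical helper: say whether a shared (ControlMaster) connection exists.
# SSH_BIN is an assumed override for the ssh executable; it defaults to ssh.
connection_status() (
    host="${1:-sherlock}"
    # "ssh -O check <host>" asks the master process, if any, whether it is alive.
    if "${SSH_BIN:-ssh}" -O check "$host" >/dev/null 2>&1; then
        echo "shared connection open for $host"
    else
        echo "no shared connection for $host"
    fi
)
```

If it reports no shared connection, expect to authenticate again on the next ssh sherlock command.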
Okay let's step back for a second and talk about your use case.
If you are creating an environment and notebooks that you want to move around, publish, or otherwise share, then you would want to use a jupyter template and build your own container (with the notebooks already inside). This is done by copying repo2docker-julia, adding your notebooks, building, and then running the command pointed at your build, e.g.:
bash start.sh sherlock/singularity-notebook docker://<username>/<repository>
And actually, you might still want to do this when your notebook is done and shiny and ready to submit to a paper, but I intuit from your post that you want more of a working environment, brought up on the fly, without much work in advance. So let's talk about this use case (and we can get back to the first when you are ready to publish!)
This second use case is what I think you want. It's only non-reproducible because we aren't going to use a container (we will use modules on sherlock, which may not always be there, might change, etc.), and because you also want to specify the notebook files sort of "on the fly." The bug I see in what you are describing (and this is also a bug in my documentation, which doesn't make it clear) is that the folder BPA would need to already be on the cluster somewhere: the path that you provide is a path on the cluster, not on the local machine. BUT we can add a quick command to make this easy. Let's write up an example for how we would get this from the host.
Let's say I have a folder at /tmp/analysis with an analysis of interest! This already isn't reproducible, because theoretically I've created this notebook with some jupyter notebook on my host that might have a mismatch in kernel with the one on the cluster. Let's assume that it's the same. Here is the folder:
tree /tmp
numpy_notebook.ipynb
And in this numpy notebook I have a Python 3 kernel that is pretty simple and useless, but will run something:
import numpy
stuffy = numpy.zeros((4,4))
print(stuffy)
Okay, so now I want to use this on sherlock, using forward. The first thing I want to do is copy the entire directory somewhere on the cluster, and I can use scp for that:
# Here is how we can make a directory to move our stuff to!
ssh sherlock mkdir -p /scratch/users/vsochat/my-analysis
# Now let's copy everything from the local folder there
scp /tmp/analysis/* vsochat@login.sherlock.stanford.edu:/scratch/users/vsochat/my-analysis
numpy_notebook.ipynb 100% 846 0.8KB/s 00:00
If you want you can do another ssh sherlock command to check that it worked!
$ ssh sherlock ls /scratch/users/vsochat/my-analysis
numpy_notebook.ipynb
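The three steps above (make the folder, copy, check) can be wrapped into one function. This is only a sketch: upload_dir and the RUN prefix are made up for this example, and it assumes the sherlock alias from ~/.ssh/config also works for scp, which it normally does since scp reads the same config.

```shell
# Hypothetical helper: copy the contents of a local folder to the cluster,
# then list the remote folder. Set RUN=echo to print the commands instead of
# running them. Paths with spaces are not handled.
upload_dir() (
    local_dir="$1"          # e.g. /tmp/analysis
    remote_dir="$2"         # a path on the cluster, not on your machine
    host="${3:-sherlock}"   # ssh alias, assumed to exist in ~/.ssh/config
    run="${RUN:-}"
    if [ ! -d "$local_dir" ]; then
        echo "local directory not found: $local_dir" >&2
        exit 1
    fi
    $run ssh "$host" mkdir -p "$remote_dir" &&
    $run scp -r "$local_dir"/* "$host:$remote_dir" &&
    $run ssh "$host" ls "$remote_dir"
)
```

For example, RUN=echo upload_dir /tmp/analysis /scratch/users/vsochat/my-analysis shows what would be run, and dropping RUN=echo actually does it.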
Okay cool! Now we want to start the notebook there!
bash start.sh sherlock/py3-jupyter /scratch/users/vsochat/my-analysis
Here is the output. If you don't see exactly something like this, you probably have an older version, and should pull from master (I need to do tags / versions proper, developing pretty quickly and haven't yet!)
== Finding Script ==
Looking for sbatches/sherlock/sherlock/py3-jupyter.sbatch
Looking for sbatches/sherlock/py3-jupyter.sbatch
Script sbatches/sherlock/py3-jupyter.sbatch
== Checking for previous notebook ==
No existing sherlock/py3-jupyter jobs found, continuing...
== Getting destination directory ==
== Uploading sbatch script ==
py3-jupyter.sbatch 100% 146 0.1KB/s 00:00
== Submitting sbatch ==
sbatch --job-name=sherlock/py3-jupyter --partition=russpold --output=/home/users/vsochat/forward-util/py3-jupyter.sbatch.out --error=/home/users/vsochat/forward-util/py3-jupyter.sbatch.err --mem=20G --time=8:00:00 /home/users/vsochat/forward-util/py3-jupyter.sbatch 43453 "/scratch/users/vsochat/my-analysis"
Submitted batch job 23423516
== View logs in separate terminal ==
ssh sherlock cat /home/users/vsochat/forward-util/py3-jupyter.sbatch.out
ssh sherlock cat /home/users/vsochat/forward-util/py3-jupyter.sbatch.err
== Waiting for job to start, using exponential backoff ==
Attempt 0: not ready yet... retrying in 1..
Attempt 1: not ready yet... retrying in 2..
Attempt 2: resources allocated to sh-01-31!..
sh-01-31
sh-01-31
notebook running on sh-01-31
== Setting up port forwarding ==
ssh -L 43453:localhost:43453 sherlock ssh -L 43453:localhost:43453 -N sh-01-31 &
== Connecting to notebook ==
== View logs in separate terminal ==
ssh sherlock cat /home/users/vsochat/forward-util/py3-jupyter.sbatch.out
ssh sherlock cat /home/users/vsochat/forward-util/py3-jupyter.sbatch.err
== Instructions ==
1. Password, output, and error printed to this terminal? Look at logs (see instruction above)
2. Browser: http://sh-02-21.int:43453/ -> http://localhost:43453/...
3. To end session: bash end.sh sherlock/py3-jupyter
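The "exponential backoff" in that output just means the wait doubles after every failed check (1, 2, 4, ...). Here is a generic sketch of the idea. It is not the code forward uses; retry_with_backoff and BACKOFF_START are names made up for this example, and the check you pass in is up to you.

```shell
# Hypothetical helper: run a check until it succeeds, doubling the wait
# between attempts. BACKOFF_START (assumed name) is the first wait in seconds.
retry_with_backoff() (
    max_attempts="$1"; shift
    delay="${BACKOFF_START:-1}"
    attempt=0
    while [ "$attempt" -lt "$max_attempts" ]; do
        # "$@" is the check command; exit status 0 means "ready".
        if "$@"; then
            echo "Attempt $attempt: ready"
            exit 0
        fi
        echo "Attempt $attempt: not ready yet... retrying in $delay.."
        sleep "$delay"
        delay=$((delay * 2))
        attempt=$((attempt + 1))
    done
    exit 1
)
```

Usage would be something like retry_with_backoff 10 my_check, where my_check is whatever command tells you the job has been given a node.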
Since this isn't a container, the password is the one that I set up in advance for jupyter notebook, by loading the same module on sherlock and setting the password. Let me know if you haven't done this and need the instructions again (I believe they're in the README).
$ ssh sherlock cat /home/users/vsochat/forward-util/py3-jupyter.sbatch.err
[I 15:20:15.124 NotebookApp] Writing notebook server cookie secret to /tmp/jupyter/notebook_cookie_secret
[I 15:20:29.269 NotebookApp] Serving notebooks from local directory: /scratch/users/vsochat/my-analysis
[I 15:20:29.270 NotebookApp] 0 active kernels
[I 15:20:29.270 NotebookApp] The Jupyter Notebook is running at: http://localhost:43453/
[I 15:20:29.270 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
Then when I open the browser (and enter my password) I get the web interface, and there is my little notebook <3
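In case the port forwarding line in the output looks like magic: it is two tunnels chained together. The first -L connects the port on your machine to the same port on the login node, and the second ssh (run on the login node) connects that port to the compute node where the notebook is listening. A tiny sketch that just builds the command string, with a made-up function name:

```shell
# Hypothetical helper: print the two-hop tunnel command for a port, a login
# host alias, and a compute node. It only prints; it does not connect.
tunnel_command() (
    port="$1"    # e.g. 43453, the PORT from params.sh
    login="$2"   # e.g. sherlock, the alias from ~/.ssh/config
    node="$3"    # e.g. sh-01-31, the node the job landed on
    echo "ssh -L $port:localhost:$port $login ssh -L $port:localhost:$port -N $node"
)
```

With the values from the output above, tunnel_command 43453 sherlock sh-01-31 prints the same ssh command that start.sh ran (minus the trailing & that puts it in the background).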
That should be the complete instructions to get the functionality that you need. When your notebook is done, you would want to make a container (and avoid potential errors with versioning, etc.).
Let me know if you have more questions!
@vsoch thank you for the detailed response! I wasn't sure if it was on your local machine or not, but it seems like an scp/sftp pipeline is the way to go for using notebooks stored locally. On an unrelated note, do you ever plan to roll out sherlock functionality with Jupyter Labs?
You mean beyond just the Jupyter container notebook? (e.g., from the sherlock/containershare-notebook script, using repo2docker-jupyter).
Ohh I see! This guy! --> https://github.com/jupyterlab Oh yes, this would be amazing! Let me look into this, I'll open an issue for further notes.
Also, in case you didn't see, our little discussion here is now a "tiny tutorial!" --> https://gist.github.com/vsoch/f2034e2ff768de7eb14d42fef92cc43e meaning it is adorable and preserved forever. Thank you!
From here, the way to launch jupyter notebooks would be to run something like this on your local computer:
bash start.sh <software> <path>
I've been attempting to play around with the <path> argument, but all my permutations lead me into a page with only one folder, forward-util, with three files: py3-jupyter.sbatch, py3-jupyter.sbatch.err, and py3-jupyter.sbatch.out.
My situation is that I have a directory on my local computer with two subdirectories: the cloned directory (forward) and another directory called BPA with my .ipynb files. I cd into forward to start up sherlock, with the following commands and outcomes:
bash start.sh sherlock/py3-jupyter ../BPA
This gives the message "directory not found" and a page with the forward-util folder. The same thing also happens when I put the absolute path.
After moving the BPA directory into the forward directory, I ran:
bash start.sh sherlock/py3-jupyter BPA
There is no error about the directory not being found this time, but it still launches into the ...
What is the proper syntax to get the files I want onto the jupyter notebooks page? Is it that the path is the path on the actual server?