Closed KrisThielemans closed 3 years ago
should also not download to ~/data which is often destroyed... Ideally should be able to specify download to e.g. /devel/data
in docker containers
I think they could be downloaded to a directory within this repo that we .gitignore?
Also needs to be updated for BrainWeb in the synergistic folder.
I'm happy to take this, but I have limited time this week and next due to a disrupted schedule. Does SIRF-Exercises 3.0 need to be released at the same time as SIRF?
(i.e., if there is some flexibility please assign me)
@ashgillman I think @evgueni-ovtchinnikov hasn't had time for this yet. If this is something you could tackle today or so, maybe it'll help him.
Sure, I'll have a look now. But I'm still not sure we've agreed on an ideal location? I'm no longer convinced my suggestion to load into the SIRF-exercises is a good idea.
I don't like the idea of where it is now, being the install location. It should be a user-accessible location by default.
A pretty common pattern would be ~/.<appname>/...
, so ~/.sirf-exercises/share/
or ~/.sirf/share/
, or perhaps like ~/.local/share/sirf-exercises/
and then have this overridable by some variable like $SIRF_EXERCISES_PATH
that is set in the VM or Docker to whatever else is wanted.
Note that there are actually 2 places that we are talking about
I've changed the title of the issue to make this clearer.
The trouble is that ~
is not ideal on our docker
containers. It will be local to the container, which means that whenever people upgrade, they will have tno longer have access to the data , unless they copy it etc. At the moment, the container is set-up such that it mounts a directory from the user in /devel
(not `~/devel' by the way).
Some other difficulties
~
doesn't usually work very well in Windows...$SIRF_EXERCISES_PATH
means more set-up, more code to read it and handle it if it doesn't exist etc.I think for a novice user, it makes most sense if the data are extracted "in" SIRF-exercises
somewhere. It also means that the notebooks "know" where it is (as they can find out where they are presumably). I think SIRF-exercises/data
makes most sense. An advanced user can link that to somewhere else I guess
Defaulting the download itself to something like /tmp
might make sense.
The small problem with SIRF-exercises/data
is its a little bit more delicate. In notebooks, would we use ../../data/...
? Then if they are moved they have to change - not a terrible sacrifice though.
An advanced user can link that to somewhere else I guess
But we don't supply that mechanism right? You want simple - no environment variable lookup?
In a Notebook using the Docker container, from the Files menu, the user only sees the contents of the container's /devel
folder, so the data needs to be there or below. e.g. in /devel/data
or /devel/SIRF-Exercises/data
It is possible to open a terminal from Jupyter and run unix commands like ln
(say if you wanted to have a symbolic link to /tmp
), but that might be overcomplicated.
If you pull an updated Docker container, you do not loose downloaded data because the /devel
in the container is mapped to your local folder: …/SIRF-SuperBuild/docker/devel
which persists.
I think it is fine if the user has to, let's say, uncomment one Notebook line if they are using it via Docker, and a different if they are using a Notebook via VM or whatever. Might be preferable to too much magic in the background.
Oh, one other point, the terminal from Jupyter has a horrible command prompt that appears after every command:
/bin/sh: 1: __git_ps1: not found
(base) sirf:\w$
Defaulting the download itself to something like
/tmp
might make sense.
could be ok, but I'm not too bothered about that. We can say easily to the user that it default to ~/data
but it's up to them to decide (could recommend /devel/data
if on Docker)
The small problem with
SIRF-exercises/data
is its a little bit more delicate. In notebooks, would we use../../data/...
? Then if they are moved they have to change - not a terrible sacrifice though.
I think that's ok for now. Simplicity is better
An advanced user can link that to somewhere else I guess
But we don't supply that mechanism right? You want simple - no environment variable lookup?
yes please.
I'm looking into this:
~/data
that can be changedI was looking into having the location of the data available just to the exercises with a syntaxis like
from common import exercises_data_path
where common
is a module in the SIRF-Exercise/notebooks/common
. However, the current directory structure is:
- notebooks
-> MR
-> PET
-> common
-> __init__.py
...
Now, to make whatever is in common
is accessible to the notebooks I see a few options:
common
to PYTHONPATH
before starting jupyter
import sys, os sys.path.append(os.path.abspath(os.path.join(os.getcwd(), '..')))
3. The download scripts will create a `common.py` file that is actually a symlink to the `__init__.py` file in *each* subdirectory of `notebooks` so that one could use the wanted syntax
None is great:
1. requires the user to set `PYTHONPATH`, could work with Docker and VM
2. would work out of the box, but it's ugly
3. would work in any system where you download the data with the download scripts. It also has the good feature of being able to update with calling the download scripts again.
the download scripts are bash scripts that can't be executed from Windows
Do they need to be ported, e.g., to Python?
the download scripts currently create symlinks to the data in the SIRF source directory (in the build)
I don't like this - it seems to me that that download directory is temporary. I think they should be downloaded directly to the location. There can be a cleanup step if needed to remove zip and md5 files.
I was looking into having the location of the data available just to the exercises with a syntaxis like
I wondered about this, but I'm not sure its worth the additional infrastructure considering each is 3 lines of code - see https://github.com/SyneRBI/SIRF-Exercises/pull/82/files#diff-12bc80ea6d909fbe4bc94174271b7835d3f83b23d5a9d240160e707d49051d23R164Worthwhile I we end up moving a lot of boilerplate code here it could be useful though.
We could just have a symlink in each notebook directory to the common directory, rather than creating one with the download script? Will this work with Windows? I think this is really a separate issue though
Yes, we could create symlinks and store them in github. This poses a problem with filesystems where symbolic links aren't available.
But then, there's not a big difference with the bash scripts.
this seems to be escalating a bit too much. Please stick to a simple solution now (no symlinks, no makor redesign).
we could have a common
folder, but I suggest to add it explicitly to the path for notebooks that need it.
good luck!
@ashgillman and I are working together at a solution on #82 and https://github.com/ashgillman/SIRF-Exercises/pull/1
Running
download_*sh
puts symbolic links inSIRF/data/examples
resulting then ingit
saying thedata
folder has changed (always tricky with submodules).It would be better to just put this somewhere else, and then adapt the relevant demos.