SyneRBI / SIRF-Exercises

SIRF Training and demonstration material
http://www.ccpsynerbi.ac.uk
Apache License 2.0
18 stars 21 forks source link

should not extract downloaded data into SIRF/data/examples #22

Closed KrisThielemans closed 3 years ago

KrisThielemans commented 5 years ago

Running download_*sh puts symbolic links in SIRF/data/examples resulting then in git saying the data folder has changed (always tricky with submodules).

It would be better to just put this somewhere else, and then adapt the relevant demos.

casperdcl commented 4 years ago

should also not download to ~/data which is often destroyed... Ideally should be able to specify download to e.g. /devel/data in docker containers

ashgillman commented 3 years ago

I think they could be downloaded to a directory within this repo that we .gitignore?

Also needs to be updated for BrainWeb in the synergistic folder.

I'm happy to take this, but I have limited time this week and next due to a disrupted schedule. Does SIRF-Exercises 3.0 need to be released at the same time as SIRF?

ashgillman commented 3 years ago

(i.e., if there is some flexibility please assign me)

KrisThielemans commented 3 years ago

@ashgillman I think @evgueni-ovtchinnikov hasn't had time for this yet. If this is something you could tackle today or so, maybe it'll help him.

ashgillman commented 3 years ago

Sure, I'll have a look now. But I'm still not sure we've agreed on an ideal location? I'm no longer convinced my suggestion to load into the SIRF-exercises is a good idea.

I don't like the idea of where it is now, being the install location. It should be a user-accessible location by default.

A pretty common pattern would be ~/.<appname>/..., so ~/.sirf-exercises/share/ or ~/.sirf/share/, or perhaps like ~/.local/share/sirf-exercises/ and then have this overridable by some variable like $SIRF_EXERCISES_PATH that is set in the VM or Docker to whatever else is wanted.

KrisThielemans commented 3 years ago

Note that there are actually 2 places that we are talking about

I've changed the title of the issue to make this clearer.

The trouble is that ~ is not ideal on our docker containers. It will be local to the container, which means that whenever people upgrade, they will have tno longer have access to the data , unless they copy it etc. At the moment, the container is set-up such that it mounts a directory from the user in /devel (not `~/devel' by the way).

Some other difficulties

I think for a novice user, it makes most sense if the data are extracted "in" SIRF-exercises somewhere. It also means that the notebooks "know" where it is (as they can find out where they are presumably). I think SIRF-exercises/data makes most sense. An advanced user can link that to somewhere else I guess

ashgillman commented 3 years ago

Defaulting the download itself to something like /tmp might make sense.

The small problem with SIRF-exercises/data is its a little bit more delicate. In notebooks, would we use ../../data/...? Then if they are moved they have to change - not a terrible sacrifice though.

ashgillman commented 3 years ago

An advanced user can link that to somewhere else I guess

But we don't supply that mechanism right? You want simple - no environment variable lookup?

DANAJK commented 3 years ago

In a Notebook using the Docker container, from the Files menu, the user only sees the contents of the container's /devel folder, so the data needs to be there or below. e.g. in /devel/data or /devel/SIRF-Exercises/data It is possible to open a terminal from Jupyter and run unix commands like ln (say if you wanted to have a symbolic link to /tmp), but that might be overcomplicated.

If you pull an updated Docker container, you do not loose downloaded data because the /devel in the container is mapped to your local folder: …/SIRF-SuperBuild/docker/devel which persists.

I think it is fine if the user has to, let's say, uncomment one Notebook line if they are using it via Docker, and a different if they are using a Notebook via VM or whatever. Might be preferable to too much magic in the background.

DANAJK commented 3 years ago

Oh, one other point, the terminal from Jupyter has a horrible command prompt that appears after every command:

/bin/sh: 1: __git_ps1: not found
(base) sirf:\w$
KrisThielemans commented 3 years ago

Defaulting the download itself to something like /tmp might make sense.

could be ok, but I'm not too bothered about that. We can say easily to the user that it default to ~/data but it's up to them to decide (could recommend /devel/data if on Docker)

The small problem with SIRF-exercises/data is its a little bit more delicate. In notebooks, would we use ../../data/...? Then if they are moved they have to change - not a terrible sacrifice though.

I think that's ok for now. Simplicity is better

An advanced user can link that to somewhere else I guess

But we don't supply that mechanism right? You want simple - no environment variable lookup?

yes please.

paskino commented 3 years ago

I'm looking into this:

  1. the download scripts are bash scripts that can't be executed from Windows
  2. the download scripts have a default download location ~/data that can be changed
  3. the download scripts currently create symlinks to the data in the SIRF source directory (in the build)

I was looking into having the location of the data available just to the exercises with a syntaxis like


from common import exercises_data_path

where common is a module in the SIRF-Exercise/notebooks/common. However, the current directory structure is:

- notebooks
 -> MR
 -> PET
 -> common
   -> __init__.py
...

Now, to make whatever is in common is accessible to the notebooks I see a few options:

  1. we add the path to common to PYTHONPATH before starting jupyter
  2. In each notebook we add some horrible syntax like

import sys, os sys.path.append(os.path.abspath(os.path.join(os.getcwd(), '..')))


3. The download scripts will create a `common.py` file that is actually a symlink to the `__init__.py` file in *each* subdirectory of `notebooks` so that one could use the wanted syntax

None is great: 
1. requires the user to set `PYTHONPATH`, could work with Docker and VM
2. would work out of the box, but it's ugly
3. would work in any system where you download the data with the download scripts. It also has the good feature of being able to update with calling the download scripts again.
ashgillman commented 3 years ago

the download scripts are bash scripts that can't be executed from Windows

Do they need to be ported, e.g., to Python?

the download scripts currently create symlinks to the data in the SIRF source directory (in the build)

I don't like this - it seems to me that that download directory is temporary. I think they should be downloaded directly to the location. There can be a cleanup step if needed to remove zip and md5 files.

I was looking into having the location of the data available just to the exercises with a syntaxis like

I wondered about this, but I'm not sure its worth the additional infrastructure considering each is 3 lines of code - see https://github.com/SyneRBI/SIRF-Exercises/pull/82/files#diff-12bc80ea6d909fbe4bc94174271b7835d3f83b23d5a9d240160e707d49051d23R164Worthwhile I we end up moving a lot of boilerplate code here it could be useful though.

We could just have a symlink in each notebook directory to the common directory, rather than creating one with the download script? Will this work with Windows? I think this is really a separate issue though

paskino commented 3 years ago

Yes, we could create symlinks and store them in github. This poses a problem with filesystems where symbolic links aren't available.

But then, there's not a big difference with the bash scripts.

KrisThielemans commented 3 years ago

this seems to be escalating a bit too much. Please stick to a simple solution now (no symlinks, no makor redesign).

we could have a common folder, but I suggest to add it explicitly to the path for notebooks that need it.

good luck!

paskino commented 3 years ago

@ashgillman and I are working together at a solution on #82 and https://github.com/ashgillman/SIRF-Exercises/pull/1