remaining issues with exercises_data_path

KrisThielemans commented 3 years ago

PET notebooks generally do the following

os.chdir(examples_data_path('PET'))
# Copy files to a working folder and change directory to where these files are.
# We do this to avoid cluttering your SIRF files. This way, you can delete 
# working_folder and start from scratch.
shutil.rmtree('working_folder/brain',True)
shutil.copytree('brain','working_folder/brain')
os.chdir('working_folder/brain')

However, this means they are creating the working_folder in the examples_data_path as opposed to the exercises_data_path. Best to change to

os.chdir(exercises_data_path('PET'))
# Copy files to a working folder and change directory to where these files are.
# We do this to avoid cluttering your SIRF files. This way, you can delete 
# working_folder and start from scratch.
shutil.rmtree('working_folder/brain',True)
shutil.copytree(os.path.join(examples_data_path('PET'),'brain'),'working_folder/brain')
os.chdir('working_folder/brain')

KrisThielemans commented 3 years ago

In addition, the chdir has the consequence that if the user executes the import statements again, python complains about exercises_data_path not being found.

I therefore now favour my original solution : at the start of the notebook, do

# add location of SIRF-exercises Python utilities to the path
# (execute this only once for this notebook as it is based on the current working directory)
common_path=os.path.join(os.getcwd(),'..','common')
sys.path.append(common_path)

We could later also add the current directory to the path, such that PET or MR could have their own scripts.

This approach also means that we don't need to rely on download_data.sh to overwrite the examples_data_path.py files.

KrisThielemans commented 3 years ago

Or possibly better would be set it up such that we should do

from sirf_exercises import exercises_data_path

KrisThielemans commented 3 years ago

Also confusing is that exercises_data_path is a string, while examples_data_path is a function (to be called with arg PET for instance), hence my suggested example code above is incorrect.

I think better to follow the latter's strategy (for uniformity)

KrisThielemans commented 3 years ago

This approach also means that we don't need to rely on download_data.sh to overwrite the examples_data_path.py files.

hmmm. we do allow download_data.sh to use a different location, so we need to tell Python where it is...

KrisThielemans commented 3 years ago

Another issue is that exercise_data_path.py files are flagged as changed by git after the download.

KrisThielemans commented 3 years ago

Another issue is that exercise_data_path.py files are flagged as changed by git after the download.

I think this is because they are tracked, so modifications won't be ignored.

Best seems to be to optionally add a file with the path used by download_data.sh, and let exercise_data_path.py see if that file exist, if not, revert to the default.

ashgillman commented 3 years ago

Taking these one-at-a-time:

PET notebooks generally do the following

os.chdir(examples_data_path('PET'))
# Copy files to a working folder and change directory to where these files are.
# We do this to avoid cluttering your SIRF files. This way, you can delete 
# working_folder and start from scratch.
shutil.rmtree('working_folder/brain',True)
shutil.copytree('brain','working_folder/brain')
os.chdir('working_folder/brain')

However, this means they are creating the working_folder in the examples_data_path as opposed to the exercises_data_path. Best to change to

os.chdir(exercises_data_path('PET'))
# Copy files to a working folder and change directory to where these files are.
# We do this to avoid cluttering your SIRF files. This way, you can delete 
# working_folder and start from scratch.
shutil.rmtree('working_folder/brain',True)
shutil.copytree(os.path.join(examples_data_path('PET'),'brain'),'working_folder/brain')
os.chdir('working_folder/brain')

This would take us back to our original issue - that code in the exercises would be modifying files in the examples data path. I think it would be nicer to have each notebook create its own working directory in the notebook location?

ashgillman commented 3 years ago

In addition, the chdir has the consequence that if the user executes the import statements again, python complains about exercises_data_path not being found.

I therefore now favour my original solution : at the start of the notebook, do
# add location of SIRF-exercises Python utilities to the path
# (execute this only once for this notebook as it is based on the current working directory)
common_path=os.path.join(os.getcwd(),'..','common')
sys.path.append(common_path)
We could later also add the current directory to the path, such that PET or MR could have their own scripts.

This approach also means that we don't need to rely on download_data.sh to overwrite the examples_data_path.py files.

Yes, I agree that appending path seems less evil. Then we don't need many exercises_path_path.py files, just one in common. I think @paskino disagreed here?

ashgillman commented 3 years ago

Also confusing is that exercises_data_path is a string, while examples_data_path is a function (to be called with arg PET for instance), hence my suggested example code above is incorrect.

I think better to follow the latter's strategy (for uniformity)

Fair criticism, I agree

ashgillman commented 3 years ago

This approach also means that we don't need to rely on download_data.sh to overwrite the examples_data_path.py files.

hmmm. we do allow download_data.sh to use a different location, so we need to tell Python where it is...

As far as I can see, the options are to "configure" a file, like we currently do - or have both of these look up an env variable with a common default if its not set, per my original proposal. NB: If the former, we can still "configure" this file into a sirf_exercises subdirectory that operate as our library directory as you suggested.

ashgillman commented 3 years ago

Another issue is that exercise_data_path.py files are flagged as changed by git after the download.

This seems like a bug - I thought I'd .gitignore'ed these files - but I think I misspelled it

ashgillman commented 3 years ago

So perhaps we have a sirf_exercises directory below the project root. Or in common or whatever. download_data.sh writes a file here, sirf_exercises/exercises_data_path.py (or even just a text file...) with the location. I still would prefer an ENV var for this instead of writing a file, but I think I'm alone here 😄. We can then also put any other lib functions here, e.g., plotting.py. And it needs an __init__.py.

There are two options to make this accessible:

sys.path.append('../..') or its variants in every notebook.
Add a setup.py to the project root - it could be quite simple, and run pip install --user .

I think if we're going as far as having a library for the notebooks, (2) isn't too far a stretch. One problem I can see is that if we use the "configure" option for the exercises_data_path, then you need to run download_data.sh before pip install --user ..

KrisThielemans commented 3 years ago

Best to change to
os.chdir(exercises_data_path('PET'))
# Copy files to a working folder and change directory to where these files are.
# We do this to avoid cluttering your SIRF files. This way, you can delete 
# working_folder and start from scratch.
shutil.rmtree('working_folder/brain',True)
shutil.copytree(os.path.join(examples_data_path('PET'),'brain'),'working_folder/brain')
os.chdir('working_folder/brain')
This would take us back to our original issue - that code in the exercises would be modifying files in the examples data path. I think it would be nicer to have each notebook create its own working directory in the notebook location?

?? the above code (should have) copied files into exercises_data_path('PET')/working_folder/brain (after copying it from examples_data_path...)

ashgillman commented 3 years ago

Ah yes - sorry I misread. I guess also related to #84. Or at least they could have the same fix. Should we have discussion there about where working_folder should go?

ashgillman commented 3 years ago

Updates after #94:

[x] Fix remaining PET Notebooks
[x] chdir is absolute so can be run multiple times
[x] from sirf_exercises import exercises_data_path
[x] examples_data_path -> examples_data_path()
[x] exercise_data_path.py flagged by git - this was because .gitingore is only for new files, not modified - now we don't modify a file but create
[ ] Separate working directories from data_directories

ashgillman commented 3 years ago

Do we still want separate working directories now that we have DOWNLOAD_DIR? I guess its still nice to have

KrisThielemans commented 3 years ago

I think nice to have, but more important to clean up existing notebooks first (especially the PET ones!).

ashgillman commented 3 years ago

I think once #94 is done this can be closed, #84 covers the last item

SyneRBI / SIRF-Exercises

remaining issues with exercises_data_path #89