Jupyterhub: reconstruc_measured_data error

paskino commented 3 years ago

Running the convertSiemensInterfileToSTIR.sh script one needs to use the following instead of what is in the notebook (the first 2 commented lines)

# !convertSiemensInterfileToSTIR.sh $data_path/20170809_NEMA_UCL.n.hdr norm.n.hdr
# !convertSiemensInterfileToSTIR.sh $data_path/20170809_NEMA_MUMAP_UCL.v.hdr umap.v.hdr
!PATH=/opt/SIRF-SuperBuild/INSTALL/bin/:$PATH convertSiemensInterfileToSTIR.sh /mnt/materials/SIRF/Fully3D/SIRF/PET/mMR/NEMA_IQ/20170809_NEMA_UCL.n.hdr norm.n.hdr
!PATH=/opt/SIRF-SuperBuild/INSTALL/bin/:$PATH convertSiemensInterfileToSTIR.sh /mnt/materials/SIRF/Fully3D/SIRF/PET/mMR/NEMA_IQ/20170809_NEMA_MUMAP_UCL.v.hdr umap.v.hdr

notice that expanding $data_path (which is by the way correct) failed for me with Input file is not readable

paskino commented 3 years ago

This was originally posted in #123

In the PET reconstruct real data notebook we rely on the user to run the script convertSiemensInterfileToSTIR.sh to convert the Siemens to a STIR readable interfile file header. Now, on the jupyterhub the cell that is supposed to run that does not work for multiple reasons.

I have therefore pre-run the script and the output is in the same directory of the data (in the read-only filesystem).

There’s a cell which I copied here that sets the paths of the various files.

#%% set filenames 
# input files
list_file = os.path.join(data_path, '20170809_NEMA_60min_UCL.l.hdr')
norm_file = 'norm.n.hdr'
attn_file = 'umap.v.hdr'
# output filename prefixes
sino_file = 'sino'
%ls

This cell is presupposing that the users have run the script themselves and the result is in the current working directory. This is not the case in jupyterhub. So what needs doing is either:

give the full path of norm.n.hdr and umap.v.hdr (which are in the same data_path as the list_file)
copy the files in the current working directory

In either case the notebook requires some modifications. Which is the preferred solution?

I vote for 1.

KrisThielemans commented 3 years ago

see my reply for option 1 in https://github.com/SyneRBI/SIRF-Exercises/issues/123#issuecomment-866865458

ashgillman commented 3 years ago

I'll have a look at this tonight in greater detail.

But yes, the cells are a little unclear.

# input files
list_file = os.path.join(data_path, '20170809_NEMA_60min_UCL.l.hdr')
norm_file = 'norm.n.hdr'
attn_file = 'umap.v.hdr'

despite being labelled "input files", norm_file and attn_file are actually the outputs from the convertSiemensInterfileToSTIR.sh - which is why they are in the working_folder.

These file names aren't actually used until the cell is run, so the notebook should run fine? It did for me.

isn't okay, because those files need to be converted? I thought we had a preference to include the convertSiemensInterfileToSTIR.sh in the notebook (not pre-run it) so that users know they would have to run it themselves if they have their own data?

ashgillman commented 3 years ago

It should, more explicitly, be like:

# input files
list_file = os.path.join(data_path, '20170809_NEMA_60min_UCL.l.hdr')
norm_file = os.path.join(data_path, '20170809_NEMA_UCL.n.hdr'
attn_file = os.path.join(data_path, '20170809_NEMA_MUMAP_UCL.v.hdr'

# converted to STIR format files
norm_conv_file = 'norm.n.hdr'
attn_conv_file = 'umap.v.hdr'

and then:

!PATH=/opt/SIRF-SuperBuild/INSTALL/bin/:$PATH convertSiemensInterfileToSTIR.sh $norm_file $norm_conv_filr
!PATH=/opt/SIRF-SuperBuild/INSTALL/bin/:$PATH convertSiemensInterfileToSTIR.sh $attn_file $attn_conv_file

KrisThielemans commented 3 years ago

ok. that could work.

obviously a pity of the PATH (as it looks very weird, but probably would work anywhere, as you can have non-existent directories in a path)

By the way, there is no need to run convertSiemensInterfileToSTIR on the .n.hdr anymore as far as I recall. but still need to do the sed thing.

And by the way, the sed line will fail on MacOS. You need sed -i.bak ... (see https://stackoverflow.com/a/7573438/15030207)

paskino commented 3 years ago

The problem is that the convert script on the jupyterhub instance is not in /opt/SIRF-SuperBuild/INSTALL/bin/ rather in ${SIRF_PATH}/../STIR/scripts/IO, i.e. /opt/SIRF-SuperBuild/sources/STIR/scripts/IO. On the contrary in the VM it is installed in ~/devel/install/bin.

But, how does it get installed? Apparently one should need to set BUILD_EXECUTABLES=ON in STIR to install it. This can be set by STIR_BUILD_EXECUTABLES in the SuperBuild.

In none of docker or VM we set STIR_BUILD_EXECUTABLES=ON.

In the VM update_to_full_STIR.sh will do it but it is not run by default.

KrisThielemans commented 3 years ago

https://github.com/UCL/STIR/issues/894. I can't see why it'd be installed on the VM.

However, I think the notebook says to try that, so I wouldn't try to fix this now.

I don't think we should set STIR_BUILD_EXECUTABLES=ON as the images become quite a lot larger.

ashgillman commented 3 years ago

Can we just add to PATH in the Dockerfile?

ashgillman commented 3 years ago

Now I'm confused why the PATH is needed - it is in the .bashrc.

I've tested the Notebook in Docker and for me it "just works" $^{TM}$

paskino commented 3 years ago

OK, so ${SIRF_PATH}/../STIR/scripts/IO should be universal.

paskino commented 3 years ago

Now I'm confused why the PATH is needed - it is in the .bashrc.

I've tested the Notebook in Docker and for me it "just works" $^{TM}$

Yes we could, however the kubernetes/jupyterhub configuration is particularly different and .bashrc isn't sourced. So even starting gadgetron requires us to specify PATH=/opt/SIRF-SuperBuild/INSTALL/bin:$PATH gadgetron

paskino commented 3 years ago

OK the last bit is that the norm.n.hdr and umap.v.hdr are in a different directory from where the data is.

Adding this at their creation time makes things work

sed_cmd = 's#\(!name of data file:=\)#\\1{}/#'.format(data_path)
os.system("cat umap.v.hdr | sed -e '{}' > tmp".format(sed_cmd))
! mv tmp umap.v.hdr
os.system("cat norm.n.hdr | sed -e '{}' > tmp".format(sed_cmd))
! mv tmp norm.n.hdr
! rm tmp

KrisThielemans commented 3 years ago

sigh. I guess we really should have this as part of the convert*.sh, but can't do that anymore. Can you create a STIR issue for that? (in your spare time).

minor simplification

sed_cmd = 's#\(!name of data file:=\)#\\1{}/#'.format(data_path)
os.system("cat umap.v.hdr | sed -i.bak -e '{}' > tmp".format(sed_cmd))
os.system("cat norm.n.hdr | sed -i.bak -e '{}' > tmp".format(sed_cmd))

Note that Johannes has some neat trick with using %%bash at the start of the cell and not needing Python at all, but whatever works really.

paskino commented 3 years ago

Sure! I'll do it in my spare time :D

ashgillman commented 3 years ago

I'd have just copied the data file, but this saves space I guess!

SyneRBI / SIRF-Exercises

Jupyterhub: reconstruc_measured_data error #120