NeuroDesk / transparent-singularity

Deploying a singularity container so that it behaves like one would have installed software natively
https://neurodesk.org
MIT License

Non-default TMPDIR is not handled #11

Closed: marcelzwiers closed this issue 2 months ago

marcelzwiers commented 4 months ago

I'm creating a new issue here, as a follow-up on my Zoom meeting with @stebo85. In that meeting (after solving the Azure issues) we successfully ran rstudio on our CentOS 7.9 HPC cluster, by using a temporary home (because it was thought that something in home was setting a wrong path to R).

At least that's what I thought, because I can no longer reproduce that result. :-(

I sat down with our sysadmin @hurngchunlee, and this time we could only get rstudio to work by removing the --cleanenv option. We finally narrowed the problem down to $TMP, which on our HPC points to a non-default location (the R library wants to write to the default /tmp, which is not writable, and then fails). On our side, this non-default location is normally passed to apptainer automatically, but this goes wrong with the neurocontainers.

So in summary: The R-path was not the problem and adding --env TMP=$TMP to the default rstudio call made it all work normally (i.e. with the normal --cleanenv and without using --home). Is this TMPDIR handling something that can or needs to be added to transparent singularity?

stebo85 commented 4 months ago

Dear @marcelzwiers, oh, interesting! Maybe we fixed part of this problem yesterday and this is now a separate problem? It doesn't quite explain the first problem we saw, with rstudio complaining about a missing R, but it does explain the second error we got, when it complained about writing to /tmp.

Yes, that should be a safe change to make for neurocommand and I can add this. Before I do this, I just want to double-check: Is it really --env TMP=$TMP ?

On our HPC it's TMPDIR and TMP is empty

[uqsbollm@bun115 ~]$ echo $TMP

[uqsbollm@bun115 ~]$ echo $TMPDIR
/scratch/temp/8851948

marcelzwiers commented 4 months ago

By default /tmp is mounted (/tmp:/tmp), so I suppose that on your HPC /tmp is writeable for all users? I think yesterday we identified the problem to be related to an R-path that was somewhere set in my homedir, and we saw that the problem was resolved when we used --home. However, I can't replicate that, i.e. if I use --home again, the same R installation problem persists. But if I remove --cleanenv then the problem is resolved again. I suspect when we used --home we accidentally removed --cleanenv in the process?

Anyhow, to come back to your question: yes, I used TMP as an example. A more rigorous handling would also include the other commonly used variable names:

apptainer run --cleanenv --env TMP=$TMP,TMPDIR=$TMPDIR,TEMP=$TEMP,TEMPDIR=$TEMPDIR etc
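One caveat with the line above is that it also forwards variables that are empty on the host (such as TMP in the output shown earlier). A minimal sketch of a variant that forwards only the variables that are actually set is given below. The function name `build_tmp_env_arg` is hypothetical and not part of transparent-singularity, and the comma-separated `--env` form is simply taken from the command above.

```shell
# Hypothetical helper (not part of transparent-singularity): print an
# "--env NAME=value,..." argument containing only the temp-dir variables
# that are set and non-empty on the host. Prints nothing if none are set.
build_tmp_env_arg() {
    pairs=""
    for name in TMP TMPDIR TEMP TEMPDIR; do
        # Indirect lookup of the variable called $name (POSIX sh compatible).
        eval "value=\${$name:-}"
        if [ -n "$value" ]; then
            # Append NAME=value, comma-separated after the first entry.
            pairs="${pairs:+$pairs,}$name=$value"
        fi
    done
    if [ -n "$pairs" ]; then
        printf '%s\n' "--env $pairs"
    fi
    return 0
}
```

It could then be spliced into the call as `apptainer run --cleanenv $(build_tmp_env_arg) etc`. Note that the unquoted substitution is an assumption of convenience: it would break on paths that contain spaces or commas.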

There is, however, an alternative and more robust solution (some badly written apps just write to /tmp and ignore the env variables) that I use at our centre, which is to mount the non-default tmpdir inside the container as /tmp:

apptainer run --cleanenv --bind $TMPDIR:/tmp etc

However, though more robust, I see this latter approach is less portable to other environments...
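To limit the portability problem, the bind could be made conditional. The sketch below assumes the bind is only wanted when $TMPDIR is set, exists as a directory on the host, and differs from /tmp. The function name `tmp_bind_arg` is hypothetical and not part of transparent-singularity.

```shell
# Hypothetical helper (not part of transparent-singularity): print a
# "--bind <dir>:/tmp" argument only when $TMPDIR points to an existing
# directory other than /tmp. Otherwise print nothing, so that the
# default /tmp handling is left untouched.
tmp_bind_arg() {
    dir="${TMPDIR:-}"
    if [ -n "$dir" ] && [ -d "$dir" ] && [ "$dir" != "/tmp" ]; then
        printf '%s\n' "--bind $dir:/tmp"
    fi
}
```

On a system where $TMPDIR is unset or already /tmp, the function prints nothing and `apptainer run --cleanenv $(tmp_bind_arg) etc` reduces to the plain call. As with any unquoted substitution, this assumes the path contains no spaces.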

hurngchunlee commented 4 months ago

> Yes, that should be a safe change to make for neurocommand and I can add this. Before I do this, I just want to double-check: Is it really --env TMP=$TMP ?

Hi @stebo85, just to answer your question: yes, --env TMP=$TMPDIR will work on DCCN's HPC cluster, and I think it is better than TMP=$TMP because on the Slurm compute node $TMP seems to be unset. Here is the output on our compute node:

517 $ echo $TMPDIR
/scratch/honlee/slurm_job_4463

518 $ echo $TMP