Integration test `test_short_read_recovery` works for me without any weird files popping up (actually fails at the end due to the odd version of smafa that I have...). But `test_long_read_recovery` fails at the `spades_assembly` rule due to `/bin/bash: line 11: TMPDIR: unbound variable`. @rhysnewell can we move `spades_assembly` into a script?
Yeah, that would probably be better than the current shell command. However, `spades_assembly_coverage` also uses a shell command and looks like it is manually setting the `TMPDIR` variable. I imagine that this might cause some issues as well, so I think all of the shell commands that interact with `TMPDIR` in any way should be considered for porting to Python scripts.
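A minimal sketch of what the port could look like, assuming hypothetical input/output paths and script name (aviary's real rule is more involved):

```python
# Hypothetical sketch: swap the inline shell block for a script directive so
# tmpdir resolution happens in Python. Names and paths are illustrative.
rule spades_assembly:
    input:
        reads = "data/short_reads_interleaved.fastq.gz",
    output:
        directory("assembly/spades"),
    script:
        "scripts/run_spades.py"
```

The script can then resolve a usable tmpdir itself instead of assuming `$TMPDIR` is set in the shell:

```python
# scripts/run_spades.py (hypothetical): the `snakemake` object is injected
# by Snakemake's script directive, so no import is needed for it.
import os
import subprocess
import tempfile

# Fall back gracefully rather than failing on an unbound TMPDIR.
tmpdir = os.environ.get("TMPDIR") or tempfile.gettempdir()
subprocess.run(
    ["spades.py", "--12", snakemake.input.reads,
     "-o", snakemake.output[0], "--tmp-dir", tmpdir],
    check=True,
)
```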
So @YibiChen noted some oddities with how tmpdir was being set, just want to verify that this was intended behaviour before merging into dev
@YibiChen https://github.com/rhysnewell/aviary/pull/201#issuecomment-2073597723 for which rule?
Actually all rules submitted jobs to remote nodes, according to the qsub_logs from PBS. In the test case `test_short_read_recovery_queue_submission`, I noticed all the jobs running on the remote nodes were still using `/tmp`, not the bespoke `TMPDIR` on each node. I wonder if this is something to be fixed here in aviary.
Ok, it seems to be a known issue with snakemake. See https://github.com/snakemake/snakemake/issues/522. One way to work around it is to use `shadow_prefix`.

Or something like this: https://github.com/snakemake/snakemake/issues/522#issuecomment-1481050447?

Or get it to set `resources.tmpdir` within the job. Maybe as an addition to mqsub-snakemake?
Similar issue: https://github.com/snakemake/snakemake/issues/2423
Adding `if workflow.run_local` should work for us. This is only removed after snakemake 8.0 and we are using 7.
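A minimal sketch of that guard, assuming it sits near the top of the Snakefile where `workflow` and `config` are in scope; the config key is illustrative:

```python
import os

# Hypothetical sketch (Snakemake 7.x): only pin TMPDIR when running locally,
# so cluster jobs keep the per-job TMPDIR exported by the scheduler.
# Note `workflow.run_local` is removed in Snakemake 8.0.
if workflow.run_local:
    os.environ["TMPDIR"] = config["tmpdir"]
```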
I'd prefer not to use something that has been deprecated in future versions of snakemake if possible
It might be due to default-resources. In another snakemake workflow, the command that is run within the job is a snakemake command with `--default-resources 'tmpdir=system_tmpdir'`, even though I haven't set that anywhere.
@YibiChen can you try setting tmpdir in default resources to none? e.g. https://snakemake.readthedocs.io/en/stable/snakefiles/rules.html#default-resources
@AroneyS I have tried setting tmpdir in default resources to none, and got an error immediately. I think there is some built-in check to make sure tmpdir is given. That makes sense because the tmpdir in default resources is critical: it is used to overwrite the TMPDIR in the environment once the job is submitted to the cluster. That's why our python scripts did not work. The tmpdir value given by PBS is never accessible, as it is overwritten by snakemake immediately. It is designed to do so according to https://snakemake.readthedocs.io/en/stable/snakefiles/rules.html#standard-resources.
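For concreteness, a sketch of the behaviour being described here, not snakemake's actual implementation:

```python
import os
import subprocess

# Hypothetical illustration: before running a job's shell command, the job
# runner exports resources.tmpdir into the environment, so whatever TMPDIR
# PBS set for the job is clobbered.
def run_job_shell(command, resources):
    env = os.environ.copy()
    env["TMPDIR"] = resources["tmpdir"]  # the PBS-provided TMPDIR is lost here
    subprocess.run(command, shell=True, env=env, check=True)
```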
I believe the snakemake team is aware of this problem and a fix may come out. For now, we may just check whether there is a `PBS_JOBID` env variable and test whether `/data1/pbs.$PBS_JOBID.pbs` is writable. If it is, just use it as tmpdir.
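A minimal sketch of that fallback; note the `/data1/pbs.$PBS_JOBID.pbs` path layout is site-specific to this PBS setup:

```python
import os

# Hypothetical sketch: fall back to the PBS per-job scratch directory as
# tmpdir when it exists and is writable.
pbs_jobid = os.environ.get("PBS_JOBID")
if pbs_jobid:
    candidate = f"/data1/pbs.{pbs_jobid}.pbs"
    if os.path.isdir(candidate) and os.access(candidate, os.W_OK):
        os.environ["TMPDIR"] = candidate
```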
What do you think "In cluster or cloud setups, its evaluation is delayed until the actual execution of the job." means from your link? What is it evaluating? Can we set `--default-resources tmpdir="$TMPDIR"` to load the `TMPDIR` var from the job env?
@YibiChen actually maybe you can try the same setup with updated snakemake? Though there are heaps of breaking changes...
Edit: https://github.com/snakemake/snakemake/pull/1860 hmm...
This is what I understand: the evaluation refers to "setting the $TMPDIR variable for shell commands" (i.e. parsing `resources.tmpdir` into `$TMPDIR` in the env). It is delayed so it can overwrite the `$TMPDIR` given by PBS. `--default-resources tmpdir="$TMPDIR"` is just the same as the default (i.e. `'tmpdir=system_tmpdir'`), because that part is executed in the main snakemake cmd, so we would still get "/tmp". I will try the updated snakemake and see how it goes.
Problem "sloved". It turns out the python scripts worked but I had "export TMPDIR=/tmp" in my bashrc... and I totally forgot about it.
I have tested it again after removed that line. It works good now.
Problem "sloved". It turns out the python scripts worked but I had "export TMPDIR=/tmp" in my bashrc... and I totally forgot about it.
I have tested it again after removed that line. It works good now.
that's pretty funny, but did you put this in your `.bashrc` or has the aviary configuration gone awry? It can write export variables to your `.bashrc` if you don't install it within an environment and run `aviary configure`, so just want to make sure it hasn't gone crazy
Problem "sloved". It turns out the python scripts worked but I had "export TMPDIR=/tmp" in my bashrc... and I totally forgot about it. I have tested it again after removed that line. It works good now.
that's pretty funny, but did you put this in your
.bashrc
or has theaviary
configuration gone awry? It can write export variables to your.bashrc
if you don't install it within an environment and runaviary configure
, so just want to make sure it hasn't gone crazy
No worries. It's nothing to do with `aviary configure`. It was just me being silly. I got complaints from the HPC team about not using the correct tmpdir, so I manually added that to my bashrc and totally forgot about it.
@rhysnewell new release?
yeah, i'll spin it up
Prior to this pull-request, `TMPDIR` was set using `tempfile.gettempdir()` at the time of invocation, using the current env of the Aviary command being run. However, when jobs are submitted to a HPC cluster, they often have a bespoke `TMPDIR` set for each job (e.g. to clean up tmp files after the job is finished). So, essentially, the PR allows jobs to fall back to their own `TMPDIR` env variable if one was not forced using `--tmpdir`.
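A minimal sketch of the fallback behaviour described above; the function name and signature are hypothetical, not aviary's actual code:

```python
import os
import tempfile

# Hypothetical sketch: prefer an explicitly forced --tmpdir, then the job's
# own $TMPDIR (e.g. set per-job by the HPC scheduler), then the system
# default reported by tempfile.gettempdir().
def resolve_tmpdir(forced_tmpdir=None):
    if forced_tmpdir:
        return forced_tmpdir
    return os.environ.get("TMPDIR") or tempfile.gettempdir()
```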