Closed antoniojbt closed 5 years ago
Hi Antonio,
We have moved cgatcore to bioconda: http://bioconda.github.io/recipes/cgat-core/README.html
and therefore we now recommend the installation using the standard statement:
conda install cgatcore
If python -c "import cgatcore"
works, that's a good sign and there is no need to run the tests.
On the other hand, the permission denied error you get is because the node where you are running the tests doesn't allow you to create files/folders in /tmp (which might be the case in some clusters). However, the sysadmin should provide you with a path to a different location with the same functionality (in the CGAT cluster we used /scratch), so please ask.
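A quick way to see which locations a given node lets you write to is a small check like the one below. This is only a sketch: the `is_writable` helper and the list of candidate directories are illustrative and not part of cgatcore. Run it on both the submission and the execution hosts.

```python
import os
import tempfile


def is_writable(directory):
    """Return True if a temporary directory can be created (and removed) inside `directory`."""
    try:
        path = tempfile.mkdtemp(dir=directory)
    except OSError:
        # Covers a missing directory as well as "Permission denied".
        return False
    os.rmdir(path)
    return True


if __name__ == "__main__":
    # Candidate locations mentioned in this thread; adjust for your site.
    candidates = ["/tmp", "/var/tmp", "/scratch", os.environ.get("TMPDIR", "")]
    for candidate in candidates:
        if candidate:
            status = "writable" if is_writable(candidate) else "not writable"
            print("{}: {}".format(candidate, status))
```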
I hope it helps.
Best regards, Sebastian
Hi Sebastian,
Thanks, I get the same errors with conda install cgatcore
but it otherwise installs and works without cluster submission. Simple tests with python -c "import cgatcore"
and eg python xxx/pipeline_xxx.py make full --local
work for instance.
The problem may be that here the nodes use $TMPDIR, which points to /var/tmp/ and would create eg /var/tmp/pbs.job_id. Creating directories from the head node or using qsub scripts with $TMPDIR works.
The issue may be that my cluster's settings do not allow mktemp, for example, and only allow /var/tmp/pbs.job_id, with job_id set dynamically? See these files as quick tests for a better idea:
ctm_test.qsub.e2331954.txt
ctm_test.qsub.o2331954.txt
ctm_test.qsub.txt
ctm_test.sh.txt
Directory creation works and commands complete except for cleaning up. I'll check with the local HPC admin but if you have any ideas let me know. Best, Antonio
Hi Antonio,
Thanks for sending those files; that clarifies the issue a little.
Presumably you have manually edited ctm_test.sh
yourself, specifically these lines:
#TMPDIR=`mktemp -d -p /tmp/`
#TMPDIR=`mktemp -d -p /var/tmp/`
export TMPDIR
Before making any more comments, I am now interested in seeing the original bash script created by cgat-core. Could you please reproduce the error and attach here the ctmp***.sh
script created when the tests fail?
Best regards, Sebastian
Hi Sebastian, Thanks, yes, I edited those files to test what might work. Here is what the original would have looked like: ctmp6635a9a1.sh.txt
I made a tentative pull request (#74); the changes solve the temporary directory writing issue for version 0.5.6. The issue may be site specific though: I don't have access to other clusters and I'm unaware of anyone else using pbspro. I'm getting an error on travis for the latest devel though, I'll check tomorrow what it might be. Best, Antonio
Hi Antonio,
Thanks for the pull request. Let's first troubleshoot a bit further.
See the following lines in file ctmp6635a9a1.sh.txt:
TMPDIR=`mktemp -d -p /var/tmp/`
export TMPDIR
clean_temp() { rm -rf /var/tmp/; }
Assuming that you are working off the master branch, this is the code: https://github.com/cgat-developers/cgat-core/blob/master/cgatcore/pipeline/execution.py#L572-L586
cluster_tmpdir = get_params()["cluster_tmpdir"]
if self.run_on_cluster and cluster_tmpdir:
    tmpdir = cluster_tmpdir
    tmpfile.write("TMPDIR=`mktemp -d -p {}`\n".format(tmpdir))
    tmpfile.write("export TMPDIR\n")
else:
    tmpdir = get_temp_dir(dir=get_params()["tmpdir"],
                          clear=True)
    tmpfile.write("mkdir -p {}\n".format(tmpdir))
    tmpfile.write("export TMPDIR={}\n".format(tmpdir))
That means:
self.run_on_cluster and cluster_tmpdir is true.
The following lines of code are not working in your environment: https://github.com/cgat-developers/cgat-core/blob/master/cgatcore/pipeline/execution.py#L572-L576
cluster_tmpdir = get_params()["cluster_tmpdir"]
if self.run_on_cluster and cluster_tmpdir:
    tmpdir = cluster_tmpdir
    tmpfile.write("TMPDIR=`mktemp -d -p {}`\n".format(tmpdir))
    tmpfile.write("export TMPDIR\n")
The shell script created by cgatcore should look something like:
TMPDIR=`mktemp -d -p /var/tmp/pbs.job_id/tmpfolder`
export TMPDIR
clean_temp() { rm -rf /var/tmp/pbs.job_id/tmpfolder; }
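To make the two branches easier to compare, here is a minimal, standalone sketch of the if/else quoted above. It is a simplified stand-in, not the cgatcore code: `tmpdir_header` is a hypothetical helper, and the local directory name passed in the examples is made up for illustration.

```python
def tmpdir_header(run_on_cluster, cluster_tmpdir, local_tmpdir):
    """Return the shell lines that set TMPDIR, following the if/else quoted above.

    Hypothetical helper for illustration only; cgatcore writes these lines
    straight into the job script instead of returning them.
    """
    if run_on_cluster and cluster_tmpdir:
        # The directory is created by mktemp on the execution host at run time.
        return [
            "TMPDIR=`mktemp -d -p {}`".format(cluster_tmpdir),
            "export TMPDIR",
        ]
    # Fallback: a directory name chosen beforehand is written into the script.
    return [
        "mkdir -p {}".format(local_tmpdir),
        "export TMPDIR={}".format(local_tmpdir),
    ]


if __name__ == "__main__":
    # "/tmp/ctmp_example" is an illustrative name, not a real cgatcore path.
    for cluster_tmpdir in (False, "/var/tmp/", "$TMPDIR"):
        print("cluster_tmpdir = {!r}".format(cluster_tmpdir))
        for line in tmpdir_header(True, cluster_tmpdir, "/tmp/ctmp_example"):
            print("    " + line)
```

Note that a false or empty `cluster_tmpdir` falls through to the `mkdir -p` branch even when running on the cluster.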
The culprit must be specifically:
cluster_tmpdir = get_params()["cluster_tmpdir"]
Could you please share the output of the following commands:
mktemp --help
python -c "from cgatcore import pipeline as P; P.get_parameters(); print(P.get_params()['cluster_tmpdir'])"
On both the submission and the execution hosts?
Best regards, Sebastian
Also, are you using a pipeline.yml file? If so, please cd to the folder where you have it, and run the following command instead:
cd <working-dir-with-config-yaml>
python -c "import os; from cgatcore import pipeline as P ; P.get_parameters(['pipeline.yml']); print(P.get_params()['cluster_tmpdir'])"
Could you also please share your pipeline.yml?
Hi Sebastian,
Many thanks for looking into this. Hopefully I've completely overlooked something simple.
Here are the results:
mktemp --help
on the submission host:
Usage: mktemp [OPTION]... [TEMPLATE]
Create a temporary file or directory, safely, and print its name.
TEMPLATE must contain at least 3 consecutive 'X's in last component.
If TEMPLATE is not specified, use tmp.XXXXXXXXXX, and --tmpdir is implied.
Files are created u+rw, and directories u+rwx, minus umask restrictions.
  -d, --directory     create a directory, not a file
  -u, --dry-run       do not create anything; merely print a name (unsafe)
  -q, --quiet         suppress diagnostics about file/dir-creation failure
      --suffix=SUFF   append SUFF to TEMPLATE; SUFF must not contain a slash.
                        This option is implied if TEMPLATE does not end in X
  -p DIR, --tmpdir[=DIR]  interpret TEMPLATE relative to DIR; if DIR is not
                        specified, use $TMPDIR if set, else /tmp. With
                        this option, TEMPLATE must not be an absolute name;
                        unlike with -t, TEMPLATE may contain slashes, but
                        mktemp creates only the final component
  -t                  interpret TEMPLATE as a single file name component,
                        relative to a directory: $TMPDIR, if set; else the
                        directory specified via -p; else /tmp [deprecated]
      --help     display this help and exit
      --version  output version information and exit
GNU coreutils online help: <http://www.gnu.org/software/coreutils/>
For complete documentation, run: info coreutils 'mktemp invocation'
mktemp --version
gives:
mktemp (GNU coreutils) 8.30
This is on CentOS Linux 7.
With commit 32b45d575f5c0da954ccf37c65ecb7182fdf9987 on the submission host:
python -c "from cgatcore import pipeline as P; P.get_parameters(); print(P.get_params()['cluster_tmpdir'])"
False
On the execution host I get (without yml files):
Usage: mktemp [OPTION]... [TEMPLATE]
Create a temporary file or directory, safely, and print its name.
TEMPLATE must contain at least 3 consecutive 'X's in last component.
If TEMPLATE is not specified, use tmp.XXXXXXXXXX, and --tmpdir is implied.
Files are created u+rw, and directories u+rwx, minus umask restrictions.
  -d, --directory     create a directory, not a file
  -u, --dry-run       do not create anything; merely print a name (unsafe)
  -q, --quiet         suppress diagnostics about file/dir-creation failure
      --suffix=SUFF   append SUFF to TEMPLATE; SUFF must not contain a slash.
                        This option is implied if TEMPLATE does not end in X
  -p DIR, --tmpdir[=DIR]  interpret TEMPLATE relative to DIR; if DIR is not
                        specified, use $TMPDIR if set, else /tmp. With
                        this option, TEMPLATE must not be an absolute name;
                        unlike with -t, TEMPLATE may contain slashes, but
                        mktemp creates only the final component
  -t                  interpret TEMPLATE as a single file name component,
                        relative to a directory: $TMPDIR, if set; else the
                        directory specified via -p; else /tmp [deprecated]
      --help     display this help and exit
      --version  output version information and exit
GNU coreutils online help: <https://www.gnu.org/software/coreutils/>
Full documentation at: <https://www.gnu.org/software/coreutils/mktemp>
or available locally via: info '(coreutils) mktemp invocation'
For python -c "from cgatcore import pipeline as P; P.get_parameters(); print(P.get_params()['cluster_tmpdir'])"
I get:
False
See the full output and scripts if needed: test.qsub.e582543.txt test.qsub.o582543.txt test.qsub.txt test.sh.txt
When using pipeline.yml.txt
(Note the txt extension is to be able to drag and drop here)
python -c "import os; from cgatcore import pipeline as P ; P.get_parameters(['pipeline.yml']); print(P.get_params()['cluster_tmpdir'])"
False
This is from an old simple ruffus toy pipeline from CGAT (simple test repo).
Note I sometimes use a .cgat.yml file but here have not passed cluster_tmpdir. Adding e.g.
cluster:
    tmpdir: /somewhere/
in either .cgat.yml or pipeline.yml files will print the value.
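For anyone reading along, the mapping from the nested `cluster:` / `tmpdir:` entry to the `cluster_tmpdir` key can be pictured with the sketch below. It assumes that section and option names are simply joined with an underscore, which matches the behaviour seen in this thread; `flatten_sections` is a hypothetical helper, not cgatcore's actual implementation.

```python
def flatten_sections(config):
    """Flatten {'section': {'key': value}} into {'section_key': value}.

    Assumption for illustration: names are joined with a single underscore.
    """
    flat = {}
    for section, value in config.items():
        if isinstance(value, dict):
            for key, item in value.items():
                flat["{}_{}".format(section, key)] = item
        else:
            # Top-level scalars are kept under their own name.
            flat[section] = value
    return flat


# The dictionary a YAML parser would produce for the snippet above.
config = {"cluster": {"tmpdir": "/somewhere/"}}
params = flatten_sections(config)
print(params["cluster_tmpdir"])  # prints /somewhere/
```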
Best, Antonio
I forgot to add:
Setting cluster_tmpdir in one of the yml files will give a similar error:
mktemp: failed to create directory via template ‘/tmp/tmp.XXXXXXXXXX’: Permission denied
I haven't looked at this properly but can send more info. Best, Antonio
Hi Antonio,
When I place your pipeline.yml file in my working directory with:
######################################################
# pipeline_template.py configuration file
# Add pipeline specific options into separate sections
######################################################
# General options:
general:
    author_name:
    project_name:
    licence:
    version:
# Pipeline specific options:
pipeline_template:
    parameter_x:
    parameter_y:
cluster:
    tmpdir: /var/tmp/
Then, running the command
python -c "import os; from cgatcore import pipeline as P ; P.get_parameters(['pipeline.yml']); print(P.get_params()['cluster_tmpdir'])"
Prints out: /var/tmp/
Since the path to the temp folder in PBS is dynamically stored in TMPDIR, could you please try using a pipeline.yml with the following instead:
cluster:
    tmpdir: $TMPDIR
and let me know how it goes.
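The idea is that the string `$TMPDIR` is copied into the job script unexpanded, so the shell on the execution host resolves it when the job runs. The sketch below only previews that substitution in Python. `preview_tmpdir` is a hypothetical helper, and the node environment is an assumed example using the `/var/tmp/pbs.job_id` placeholder from this thread.

```python
from string import Template


def preview_tmpdir(value, environ):
    """Expand $VARS in `value` from the `environ` mapping, roughly as a shell would.

    Unknown variables are left untouched rather than raising an error.
    """
    return Template(value).safe_substitute(environ)


# What the job script line looks like when the config value is "$TMPDIR":
script_line = "TMPDIR=`mktemp -d -p {}`".format("$TMPDIR")
print(script_line)

# Assumed environment on an execution node; job_id is a placeholder.
node_env = {"TMPDIR": "/var/tmp/pbs.job_id"}
print(preview_tmpdir("$TMPDIR", node_env))
```

With a fixed value such as /var/tmp/ there is nothing to expand, so every job would target the same parent directory regardless of what the scheduler allows.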
Best regards, Sebastian
Hi Sebastian, I get:
python -c "import os; from cgatcore import pipeline as P ; P.get_parameters(['pipeline.yml']); print(P.get_params()['cluster_tmpdir'])"
$TMPDIR
Patience is a virtue.... Thanks and apologies! I had already tried this but got in a mix with other errors. Regardless, this works with the ctmXXX.sh scripts when qsub'ing them (the temp files are created and scripts do not error).
With pipelines on the cluster I'm still getting the second error though: ValueError: too many values to unpack (expected 2).
I'll close this now and report separately once I look a bit more into it. It is a problem on my side though, as I get it with different pipelines.
Thanks again,
Antonio
Hello, I installed cgatcore with:
bash install-CGAT-tools.sh --devel --location /rds/general/project/medbio-berlanga-group/live/apps/bin/cgat_installation_2
and ran the tests with:
./install-CGAT-tools.sh --test --location /rds/general/project/medbio-berlanga-group/live/apps/bin/cgat_installation_2
but got the following errors:
I've attached the tests output: tests.output.txt. Errors seem to be either:
mkdir: cannot create directory ‘/tmp/ctmpsu9heyez’: Permission denied
or
ValueError: too many values to unpack (expected 2)
The first is probably related to #30. I'm unsure how to fix this though. The cluster I'm on is PBSPro and submission nodes use $TMPDIR, which points to /var/tmp/ and would create eg /var/tmp/pbs.job_id.
Any help would be much appreciated! Best, Antonio