Closed by slsevilla 1 year ago
I looked into this issue further.
It appears that ${{LOCAL}} is defined according to the {HOST} variable, which is itself set in the config file. If the config defines host=biowulf.nih.gov, we do not have an issue: ${{LOCAL}} is placed in /lscratch/ (a temporary location local to the node where the job is running). If the config instead defines host=login01, we would have an issue, as ${{LOCAL}} is then defined as "/projects/scratch/ngs_pipeline_{SAMPLESHEET}_{NOW}_${{SLURM_JOB_ID}}/".
Will ask Xinyu to confirm that projects run more recently on non-/vf/ locations had the host variable set correctly in the config file. It appears that ${{LOCAL}} is only defined in this one location.
The code block defining ${{LOCAL}} is below.
shell.prefix("""
    set -e -o pipefail
    #module purge
    sleep 20s
    if [ {HOST} == 'biowulf.nih.gov' ]
    then
        MEM=`echo "${{SLURM_MEM_PER_NODE}} / 1024 "|bc`
        LOCAL="/lscratch/${{SLURM_JOBID}}/"
        THREADS=${{SLURM_CPUS_ON_NODE}}
    elif [ {HOST} == 'login01' ]
    then
        module load slurm
        module load gcc/4.8.1
        MEM=`scontrol show job ${{SLURM_JOB_ID}} | grep "MinMemoryNode"| perl -n -e'/MinMemoryNode=(\d*)G/ && print $1'`
        mkdir -p /projects/scratch/ngs_pipeline_{SAMPLESHEET}_{NOW}_${{SLURM_JOB_ID}}/
        LOCAL="/projects/scratch/ngs_pipeline_{SAMPLESHEET}_{NOW}_${{SLURM_JOB_ID}}/"
        THREADS=`scontrol show job ${{SLURM_JOB_ID}} | grep "MinCPUsNode" | perl -n -e'/MinCPUsNode=(\d*)/ && print $1'`
    fi
""")
Per Xinyu, the current host is set to 'biowulf' so this is a non-issue with the recently run data.
See the email below.
Hello Sam,
The host login01 is our TGen server, not related to biowulf. Our pipeline only defines two hosts, the other one is biowulf. On this end we are fine with the pipeline configuration. Please let me know if this is not clear to you.
Thanks,
Xinyu
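For reference, the host branching in the shell prefix can be mirrored in a few lines of Python, which makes it easier to see what ${{LOCAL}} resolves to for a given config. This is only a sketch: `resolve_local` and its parameter names are hypothetical and are not part of the pipeline. Note that the shell version has no `else` branch, so a host value matching neither string leaves LOCAL unset; the sketch raises an error in that case instead.

```python
def resolve_local(host, samplesheet, now, job_id):
    """Return the LOCAL path the shell prefix would set for a given host.

    Hypothetical helper that mirrors the two branches of the shell prefix.
    """
    if host == "biowulf.nih.gov":
        # Node-local scratch space allocated for the job
        return f"/lscratch/{job_id}/"
    if host == "login01":
        # Shared project scratch on the TGen server
        return f"/projects/scratch/ngs_pipeline_{samplesheet}_{now}_{job_id}/"
    # The shell prefix silently leaves LOCAL unset here; fail loudly instead
    raise ValueError(f"unrecognized host: {host!r}")


print(resolve_local("biowulf.nih.gov", "samplesheet", "20240101", "12345"))
print(resolve_local("login01", "samplesheet", "20240101", "12345"))
```

The sample sheet name, timestamp, and job ID in the usage lines are placeholder values.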
Currently the pipeline uses the ${{LOCAL}} variable within many of its rules, as the location for intermediate and temporary files. This is not the most efficient use of disk space, and the rules should be updated to use LSCRATCH when possible. More significantly, as related to issue #12, if this variable is set to a /vf/ directory, then intermediate files may still be affected by the truncation of text files, regardless of whether the output directory is a non-/vf/ location. This needs to be addressed immediately in all rules before re-runs or new runs can occur.
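One way to guard against this would be to check the resolved value of LOCAL before any rule writes to it. The sketch below is an assumption about how such a check could look, not existing pipeline code: `is_on_vf` is a hypothetical helper, and it assumes that /vf/ locations may also be reached through symlinks, which is why the path is resolved first by default.

```python
import os
from pathlib import PurePosixPath


def is_on_vf(path, resolve=True):
    """Return True if the path lies under the top-level /vf/ directory.

    Hypothetical helper. With resolve=True, symlinks are followed first so
    that an alias pointing into /vf/ is also detected.
    """
    if resolve:
        path = os.path.realpath(path)
    parts = PurePosixPath(path).parts
    # An absolute POSIX path splits into ('/', 'vf', ...) when it is under /vf/
    return len(parts) > 1 and parts[0] == "/" and parts[1] == "vf"


for candidate in ("/lscratch/12345/", "/vf/users/example/tmp/"):
    print(candidate, is_on_vf(candidate, resolve=False))
```

The two candidate paths are illustrative only. A check like this could run once at pipeline start and abort if LOCAL points at a /vf/ location.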