Open wkc1986 opened 1 year ago
Please open up hic.wdl and manually add the gpu attribute (not gpuCount) to the runtime block of the two hiccups tasks:
https://github.com/ENCODE-DCC/hic-pipeline/blob/d8e821daef5e9ec996a008e372d30a28c57c0008/hic.wdl#L1031
https://github.com/ENCODE-DCC/hic-pipeline/blob/d8e821daef5e9ec996a008e372d30a28c57c0008/hic.wdl#L1084
runtime {
...
gpu: 1
...
}
That looks like a Singularity issue. Please post your call-delta/execution/stderr, and stdout too if possible.
Hi Jin-wook, thanks for the quick reply. I edited hic.wdl to put gpu: 1 in both hiccups and hiccups_2, and indeed the sbatch command now has --gres=gpu:1; however, the task still fails the same way. Here's call-hiccups_input_hic/execution/stderr:
Picked up _JAVA_OPTIONS: -Djava.io.tmpdir=/mnt/gsfs0/shared-collab/gecollab/hic/encode_hic-pipeline/hic/085f84e0-0790-4387-af97-b74e34b74f2f/call-hiccups_input_hic/tmp.97dfeaae
Warning Hi-C map may be too sparse to find many loops via HiCCUPS.
jcuda.CudaException: Could not prepare PTX for source file '/mnt/gsfs0/shared-collab/gecollab/hic/encode_hic-pipeline/hic/085f84e0-0790-4387-af97-b74e34b74f2f/call-hiccups_input_hic/tmp.97dfeaae/temp_JCuda_3956590174754731503.cu'
at jcuda.utils.KernelLauncher.create(KernelLauncher.java:389)
at jcuda.utils.KernelLauncher.create(KernelLauncher.java:321)
at jcuda.utils.KernelLauncher.compile(KernelLauncher.java:270)
at juicebox.tools.utils.juicer.hiccups.GPUController.<init>(GPUController.java:72)
at juicebox.tools.clt.juicer.HiCCUPS.buildGPUController(HiCCUPS.java:558)
at juicebox.tools.clt.juicer.HiCCUPS.runCoreCodeForHiCCUPS(HiCCUPS.java:485)
at juicebox.tools.clt.juicer.HiCCUPS.access$200(HiCCUPS.java:158)
at juicebox.tools.clt.juicer.HiCCUPS$1.run(HiCCUPS.java:414)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: java.io.IOException: Cannot run program "nvcc": error=2, No such file or directory
at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1128)
at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1071)
at java.base/java.lang.Runtime.exec(Runtime.java:592)
at java.base/java.lang.Runtime.exec(Runtime.java:416)
at java.base/java.lang.Runtime.exec(Runtime.java:313)
at jcuda.utils.KernelLauncher.preparePtxFile(KernelLauncher.java:1113)
at jcuda.utils.KernelLauncher.create(KernelLauncher.java:385)
... 10 more
Caused by: java.io.IOException: error=2, No such file or directory
at java.base/java.lang.ProcessImpl.forkAndExec(Native Method)
at java.base/java.lang.ProcessImpl.<init>(ProcessImpl.java:340)
at java.base/java.lang.ProcessImpl.start(ProcessImpl.java:271)
at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1107)
... 16 more
GPU/CUDA Installation Not Detected
Exiting HiCCUPS
The call-delta/execution/stderr is just the line from my first post. The stdout is empty.
Looking more at this, I believe the issue is that on our HPC, CUDA needs to be loaded via the module system; otherwise it can't find nvcc. But neither module nor adding the CUDA directory to the path works in the container. Also, according to docker/hiccups/Dockerfile, shouldn't it be using an NVIDIA image that would already have nvcc?
How does one get nvcc in the container if it isn't already there?
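One way to narrow this down is to check whether nvcc is visible on the search path from inside the same environment the task runs in. The following is only a minimal sketch: the function names find_nvcc and report_nvcc are made up for illustration, and it assumes a Python 3 interpreter is available wherever it is run.

```python
import shutil


def find_nvcc(path=None):
    """Return the full path to nvcc if it is on the search path, else None.

    `path` is an optional PATH-style string; when omitted, the PATH of the
    current process is used. (Hypothetical helper, not part of the pipeline.)
    """
    return shutil.which("nvcc", path=path)


def report_nvcc(path=None):
    """Build a one-line status message about nvcc availability."""
    location = find_nvcc(path)
    if location is None:
        return "nvcc not found on PATH"
    return f"nvcc found at {location}"


if __name__ == "__main__":
    print(report_nvcc())
```

If this reports that nvcc is not found when run inside the container the task uses, the "Cannot run program "nvcc"" error in the stderr above would be consistent with the image itself lacking the CUDA toolkit rather than with the host's module setup.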
I ran into this issue too.
Possibly solved. The hiccups and delta tasks had their own docker values specified in hic.wdl, but their singularity values were set to the main Docker image, which does not have GPU stuff. So in hic.wdl I copied the line for hiccups_docker in workflow hic { input { to add this line:
String hiccups_singularity = "docker://encodedcc/hic-pipeline:1.15.1_hiccups"
and changed this in hiccups_runtime_environment:
"singularity" : hiccups_singularity
and successfully ran hiccups. I assume the same will work for delta.
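To see which other tasks might need the same treatment, one could scan hic.wdl for inputs named like hiccups_docker that have no matching *_singularity input. This is a rough sketch only: it assumes the declarations follow the String <task>_docker = "..." pattern described above, and the helper names are invented for illustration.

```python
import re

# Matches declarations such as: String hiccups_docker = "some/image:tag"
# Assumption: task-specific images are declared as <task>_docker / <task>_singularity.
DECLARATION = re.compile(r'String\s+(\w+)_(docker|singularity)\s*=\s*"([^"]*)"')


def images_by_task(wdl_text):
    """Map each task prefix to the image kinds declared for it."""
    images = {}
    for prefix, kind, image in DECLARATION.findall(wdl_text):
        images.setdefault(prefix, {})[kind] = image
    return images


def missing_singularity(wdl_text):
    """Return task prefixes that declare a docker image but no singularity image."""
    return sorted(
        prefix
        for prefix, kinds in images_by_task(wdl_text).items()
        if "docker" in kinds and "singularity" not in kinds
    )
```

Calling missing_singularity on the contents of hic.wdl would list the prefixes worth checking by hand. It only inspects the input declarations, so the runtime environment block still has to be updated to point at the new input, as described above.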
Describe the bug
call-hiccups_input_hic failed, apparently because GPU resources were not requested. Similar situation for call-delta.
OS/Platform
Caper configuration file
Input JSON file
call-hiccups_input_hic/execution/stderr ends with
Looking at call-hiccups_input_hic/execution/script.submit, the sbatch call doesn't have --gres=gpu:1, which I'm guessing would be necessary. Same with call-delta/execution/script.submit. The slurm-partition specified should in fact have GPUs.
In addition, call-delta/execution/stderr contains
/usr/bin/python: can't find '__main__' module in ''
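Checking each script.submit by eye gets tedious across several tasks, so the --gres check can be scripted. This is a minimal sketch: the helper names are hypothetical, and it assumes the GPU request appears as --gres=gpu:N (or --gres gpu:N) somewhere in the script text.

```python
import re
import sys

# Captures the value given to --gres, e.g. "gpu:1" from "--gres=gpu:1".
GRES_PATTERN = re.compile(r"--gres[=\s]+(\S+)")


def find_gres_requests(script_text):
    """Return every value passed to --gres in the submit script text."""
    return GRES_PATTERN.findall(script_text)


def requests_gpu(script_text):
    """True if any --gres value in the script asks for a gpu resource."""
    return any(value.startswith("gpu") for value in find_gres_requests(script_text))


if __name__ == "__main__":
    # Usage: python check_gres.py path/to/script.submit [more files...]
    for path in sys.argv[1:]:
        with open(path) as handle:
            status = "requests a GPU" if requests_gpu(handle.read()) else "no --gres=gpu found"
        print(f"{path}: {status}")
```

Run against call-hiccups_input_hic/execution/script.submit and call-delta/execution/script.submit, this would flag the tasks whose sbatch call lacks a GPU request.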