Closed — r0f1 closed this issue 4 years ago
The Google LS executor spawns a separate VM for each job, and therefore for each container, so I don't think it's related to that. Moreover, note that the `docker` and `containerOptions` settings are ignored by the LS executor.
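For reference, these are the kinds of settings in question. A minimal `nextflow.config` sketch (the `--gpus all` flag is just an illustrative Docker option, not something taken from this thread):

```groovy
// nextflow.config -- sketch only.
// Settings like these are honoured by the local Docker executor,
// but ignored when running under the Google Life Sciences executor.
docker {
    enabled    = true
    runOptions = '--gpus all'   // illustrative Docker run flag
}

process {
    containerOptions = '--gpus all'
}
```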
Thank you for your response. I think the reason for the error I am getting is related to the last sentence you wrote. I suspect that even though I am able to spawn a VM with a GPU with Google LS, the docker container is not able to use that GPU. I looked around but could not find an example of someone using a GPU with Google LS. Maybe I will write a short example to verify my hypothesis.
Or is there a way of passing the `docker` options and the `containerOptions` to Google LS? Or can you pass these arguments somehow inside the Dockerfile?
Maybe @moschetti @hnawar know more
Hi Florian, I've tried creating a simple process that uses a GPU from Nextflow; I can run nvidia-smi and can see the GPU. I tried to run a simple Python script but ran into some errors due to the missing cudatoolkit. I'll try to add that to my Dockerfile and see if I can get it to work. Here is my current Dockerfile:
```Dockerfile
FROM nvidia/cuda:10.2-base
RUN apt-get update && apt-get install -y --no-install-recommends \
        python3.5 \
        python3-pip \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*
RUN pip3 install numpy matplotlib
RUN pip3 install numba
```
Actually it now worked; I just had to use `nvidia/cuda:latest` instead of the `10.2-base` tag. Here is my output, which is a modification of an online mini benchmark of CPU vs GPU:

without GPU: 10.049912896000023
with GPU: 3.557580697999981
Thanks for your help. I really appreciate the fast response and the helpful comments. I got it to work now, but honestly I don't know what actually did the trick. I am listing a bunch of changes I made, for future googlers. My Dockerfile now uses `nvidia/cuda:10.0-cudnn7-devel`. I am using AlexeyAB's darknet, which I compile inside the Dockerfile, but the compilation now takes place in a separate stage of the same file and the results are then copied over. In my `nextflow.config` I defined a label and specified `machineType = "n1-standard-8"` under that label, and in my process definition I specified `cpus 8` so that Nextflow is forced to use separate machines. The `docker` options and `containerOptions` I left unchanged.
Thanks!
Just a quick note: I ran the above with the Google Life Sciences executor and not the Docker executor. With the Docker executor it will try to run multiple processes on the same machine to maximise CPU usage.
Hi, consider the following code:
There is actually another process before process `hello` that causes `hello` to be spawned several times in parallel. If this other process spawns only one instance of `hello`, and therefore only one `hello` process is executed at a time, everything runs fine. If I add the directive `maxForks 1` to `hello`, everything runs fine. However, if multiple `hello`s run in parallel, I get an error from my Python script that there is not enough memory available. How can I ensure that Nextflow schedules exactly one instance of `hello` per physical machine?

Version:
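For completeness, a minimal sketch of the `maxForks` workaround mentioned above; the process body is illustrative, not the actual script from this thread:

```groovy
process hello {
    maxForks 1   // at most one task instance runs at a time

    input:
    val x

    script:
    """
    echo processing $x
    """
}
```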