Open trsludwig opened 4 years ago
Hi @trsludwig,
Unfortunately, GPU options are not currently supported for the ParallelCluster-AWSBatch integration. I will mark this issue as a feature request. To run GPU jobs using the AWS Batch console or CLI directly, please see the official documentation here.
If you are still interested in using ParallelCluster for GPU workflows, please check out ParallelCluster's integration with Slurm, which does support specifying GPU options in job submission commands.
Hope that helps!
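For anyone who wants to try the direct AWS Batch route mentioned above, here is a minimal sketch of the request that asks Batch for a GPU. It only builds the keyword arguments for boto3's `submit_job` call. The helper name `gpu_job_request` and the job, queue, and job definition names are hypothetical placeholders, and it assumes a job queue backed by GPU instances already exists.

```python
def gpu_job_request(job_name, job_queue, job_definition, gpus=1):
    """Build keyword arguments for the boto3 Batch client's submit_job call.

    The GPU count is passed through containerOverrides.resourceRequirements,
    where AWS Batch expects the value as a string.
    """
    if gpus < 1:
        raise ValueError("gpus must be at least 1")
    return {
        "jobName": job_name,
        "jobQueue": job_queue,
        "jobDefinition": job_definition,
        "containerOverrides": {
            "resourceRequirements": [{"type": "GPU", "value": str(gpus)}],
        },
    }


# Usage (requires AWS credentials; all names below are placeholders):
#   import boto3
#   batch = boto3.client("batch")
#   response = batch.submit_job(
#       **gpu_job_request("gpu-test", "my-gpu-queue", "my-job-definition"))
#   print(response["jobId"])
```

This bypasses ParallelCluster entirely, so it is a workaround for the missing integration, not a replacement for it.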
Hi @rexcsn,
Thank you for your quick response.
Could you please state in the official documentation that the awsbatch scheduler cannot process jobs with GPU support, and include a reference to the Slurm scheduler?
I hope that you will implement this enhancement very soon, since AWS Batch itself has the ability to run GPU-based jobs.
Many thanks in advance, and stay healthy!
Best regards, Sebastian
Environment:
Bug description and how to reproduce: After creating an awsbatch-based GPU environment, the job cannot complete because no GPU can be found. AWS Batch creates an ECS container with a GPU-based AMI, but the job searches for a free GPU and does not find one.
Here is the log of the job:
As you can see in the last part, our script tries to find a free GPU and cannot find one. As a result of this error, the script echoes that it is using a CPU instead.
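A check of this kind can be sketched roughly as follows. This is not the script from the job above, only an illustration that assumes the container image ships `nvidia-smi`. The function name `visible_gpus` and its injectable `run`/`which` parameters are hypothetical.

```python
import shutil
import subprocess


def visible_gpus(run=subprocess.run, which=shutil.which):
    """Return the names of the GPUs visible to this process, or [] if none.

    An empty list means either nvidia-smi is not installed in the container
    or the driver reports no devices, in which case a job would fall back
    to the CPU.
    """
    if which("nvidia-smi") is None:
        return []
    result = run(
        ["nvidia-smi", "--query-gpu=name", "--format=csv,noheader"],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return []
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


# Usage inside the job container:
#   gpus = visible_gpus()
#   print("using GPU:", gpus[0]) if gpus else print("no GPU found, using CPU")
```

Running such a check at the start of the job shows whether the container was given a GPU at all, which helps separate a scheduling problem from a problem in the application itself.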
Do I have to set a special parameter to use aws-parallelcluster for an awsbatch CUDA-based GPU environment? I can't find any documentation or manuals on how to create a GPU-based cluster via aws-parallelcluster using awsbatch as the scheduler, or on the best configuration for it.
The pcluster config is the following: