Open GlassBil opened 1 month ago
Hi @GlassBil, we currently do not offer an ECS Optimized AL2023 GPU AMI. Have you tried using the ECS Optimized AL2 (or AL2 5.10) GPU AMI?
https://docs.aws.amazon.com/AmazonECS/latest/developerguide/ecs-optimized_AMI.html
Hello @GlassBil,
I tested this in my lab and found a workaround. It may be risky for critical workloads, but if you want to try it until an official fix comes out, let me know.
Summary
/var/lib/ecs/gpu/nvidia-gpu-info.json isn't automatically being generated since ECS v1.83.0.

Description
Running the Amazon Linux 2023 (ECS Optimized) AMI on an EC2 g4dn.xlarge instance. On version 2023.4.20240528 (Amazon ECS Agent - v1.82.4), /var/lib/ecs/gpu/nvidia-gpu-info.json is generated automatically when ECS GPU support is enabled. Since version 2023.4.20240611 (Amazon ECS Agent - v1.83.0), this no longer happens: the gpu directory is created inside /var/lib/ecs, but no file is generated inside it.

Expected Behavior
nvidia-gpu-info.json should be generated so that Docker can use the GPU. This leads to Docker being unable to

Observed Behavior
The gpu directory is created, but the file is not.

Environment Details
- g4dn.xlarge
- Amazon Linux 2023 2023.4.20240611 or newer
- GPU drivers: NVIDIA-Linux-x86_64-550.54.14.run
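
For anyone triaging their own instances, here is a minimal Python sketch that parses an agent version string of the form quoted in this issue ("Amazon ECS Agent - v1.83.0") and compares it with the first version reported as affected. The function names and the rule that every version at or above v1.83.0 may be affected are assumptions for illustration; the issue only confirms v1.82.4 as working and v1.83.0 as broken.

```python
import re

# First agent version reported as affected in this issue (v1.83.0).
FIRST_AFFECTED = (1, 83, 0)


def parse_agent_version(text):
    """Extract (major, minor, patch) from a string like 'Amazon ECS Agent - v1.83.0'."""
    match = re.search(r"v(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version found in {text!r}")
    return tuple(int(part) for part in match.groups())


def may_be_affected(text):
    """Assumption: any agent version at or above FIRST_AFFECTED may show the problem."""
    return parse_agent_version(text) >= FIRST_AFFECTED
```

Comparing tuples of integers avoids the string-comparison pitfall where "1.100.0" would sort before "1.83.0".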
Supporting Log Snippets
ecs-agent docker logs:
system /var/log/ecs/ecs-init.log
I can email the complete logs if needed.
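
To check whether an instance shows the observed behavior, a small script along these lines could distinguish the three states described above: file present, gpu directory only, or nothing at all. This is a hedged sketch; the function name `gpu_info_state` and its return labels are hypothetical, and only the paths come from this issue.

```python
from pathlib import Path


def gpu_info_state(ecs_dir="/var/lib/ecs"):
    """Report whether nvidia-gpu-info.json exists under <ecs_dir>/gpu.

    The return labels are illustrative, not part of any ECS tooling.
    """
    gpu_dir = Path(ecs_dir) / "gpu"
    info_file = gpu_dir / "nvidia-gpu-info.json"
    if info_file.is_file():
        return "file-present"    # behavior reported on agent v1.82.4
    if gpu_dir.is_dir():
        return "directory-only"  # behavior reported on agent v1.83.0
    return "missing"             # no gpu directory at all


if __name__ == "__main__":
    print(gpu_info_state())
```

The `ecs_dir` parameter exists only so the check can be pointed at a scratch directory when trying it outside an ECS instance.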