Closed: Lenny4 closed this issue 1 year ago
DFL should run on the GPU, and in your case the GPU reports VRAM: 0.02GB; this is why you see OOM errors.
Hi @zabique, please look at https://aws.amazon.com/ec2/instance-types/g4/#Product_Details. I have way more VRAM than 0.02GB on a g4dn.4xlarge instance.
I'm just saying what TensorFlow is reporting; you can see that too.
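To check directly which devices TensorFlow actually sees inside the container, a quick diagnostic like the following can help (a minimal sketch; it assumes `nvidia-smi` and TensorFlow are available inside the container):

```shell
# Run inside the container:
nvidia-smi   # should list the T4 GPU; if it errors, the container has no GPU access
python3 -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"
# An empty list (or a near-zero reported VRAM) means TensorFlow cannot
# see the host GPU, regardless of how much VRAM the host has.
```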
Yes, that was my conclusion too.
I get a Resource exhausted error even though the instance has 64 GB of memory. The Model Summary says VRAM: 0.02GB, but I have far more VRAM on the g4dn.4xlarge instance.
Do you know how I can fix that?
Sorry, I look stupid now; you clearly know that this is the problem, not RAM itself. I always install it as a conda env on a normal Linux instance, and I have never used Docker.
I launched a g4dn.4xlarge (Ubuntu) instance on AWS, hoping to use DeepFaceLab on it. I successfully installed the nvidia-container-toolkit and launched a Docker container (which contains all the dependencies to run DeepFaceLab) with the following command. The container launched successfully. I started extracting images and the faceset. Then comes the training phase; here is the output:
I get a Resource exhausted error even though I have 64 GB of memory and 16 GB of VRAM. The Model Summary says VRAM: 0.02GB, but I have way more VRAM on the g4dn.4xlarge instance. What's the problem here?
Why can't I use all the VRAM available from the host inside my Docker container?
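A common cause of this symptom is starting the container without requesting GPU access: installing nvidia-container-toolkit on the host is not enough on its own. A sketch of the launch command, assuming a hypothetical image name `deepfacelab-image` (substitute your actual image and run command):

```shell
# On the host: expose all GPUs to the container via nvidia-container-toolkit
# (requires Docker 19.03+ for the --gpus flag).
docker run --gpus all -it deepfacelab-image

# Older setups use the NVIDIA runtime flag instead:
# docker run --runtime=nvidia -it deepfacelab-image
```

If the container was started without one of these flags, TensorFlow inside it falls back to a device with essentially no usable VRAM, which matches the reported VRAM: 0.02GB.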