Open xalteropsx opened 2 weeks ago
Running the Gradio app takes only about 8 GB of VRAM in our tests. Your description of the issue is not detailed enough; can you provide more details so I can look into your problem further?
Sorry for the late reply, I had a slight fever yesterday. I will do it now and provide you a sample video plus the changes I made; give me 10-30 minutes.
As you can see from the screenshot, I cloned your current repo and changed this line:
default="runwayml/stable-diffusion-inpainting" -> default="benjamin-paine/stable-diffusion-v1-5-inpainting"
It is hard to record a video at such high GPU usage; it even freezes my display, so I will reduce the resolution to bring the VRAM down.
1024 × 768 takes nearly 42 GB. Let me check what resolution fits in about 20 GB of VRAM so I can show you proof.
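As a rough way to pick that resolution: in latent-diffusion models the latent grid is the image size divided by the VAE factor (8 for SD 1.5), and naive self-attention memory grows roughly with the square of the number of latent tokens. A minimal back-of-envelope sketch, scaling from the single 42 GB measurement above (the quadratic-scaling assumption and the constants are illustrative, not measured from this repo):

```python
def latent_tokens(width: int, height: int, vae_factor: int = 8) -> int:
    """Number of latent-space tokens for a given image resolution (SD 1.5 VAE)."""
    return (width // vae_factor) * (height // vae_factor)

def estimated_vram_gb(width: int, height: int,
                      measured_width: int = 1024, measured_height: int = 768,
                      measured_gb: float = 42.0) -> float:
    """Scale a measured VRAM figure by the quadratic attention term.

    Assumes attention dominates memory; real usage also has linear terms,
    so treat this as an upper-bound trend, not a prediction.
    """
    ref = latent_tokens(measured_width, measured_height)
    cur = latent_tokens(width, height)
    return measured_gb * (cur / ref) ** 2

# 1024x768 reproduces the measured 42 GB by construction:
print(round(estimated_vram_gb(1024, 768), 1))  # 42.0
# Halving both dimensions cuts the quadratic term by ~16x:
print(round(estimated_vram_gb(512, 384), 1))   # 2.6
```

By this crude estimate, something between 512 × 384 and 768 × 576 should land well under 20 GB if attention really is the dominant term.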
According to the information you provided, you are using an AMD GPU. Machine learning tasks usually require an Nvidia GPU to accelerate with CUDA; otherwise they run very slowly. In addition, your 46.5 GB of VRAM is likely partly occupied by other applications. Maybe you can check this in the process manager.
You can download the video and check it; it's impossible that the VRAM is being consumed by anything else. If you want proof, I can look for a remote video host or a Google Meet, or give me some way to share my screen with you. Don't worry about the slow speed; all I want is lower VRAM usage, if possible.
I have checked the video you sent, and there is indeed 40 GB of memory usage. I suspect the reason is that AMD GPUs do not support low-precision or mixed-precision inference, or that the current code cannot effectively utilize AMD GPUs, leading to increased memory usage and slow speed. However, I have too little experience with machine learning on AMD GPUs to suggest ways to reduce the high memory usage. Perhaps you can seek help from the relevant communities to see whether there are methods to reduce memory consumption and speed things up. I have modified the title of this issue so that others can see it and offer assistance.
I will try to investigate, and if I find anything I may also ask someone who uses AMD about this. I was using ZLUDA with torch on Windows; I will test on Linux with plain torch and let you know the results.
I am using bf16 and it still occupies more than 40 GB of VRAM and is very slow.
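For anyone hitting the same wall: diffusers exposes several standard memory-reduction knobs that are worth trying before blaming the backend. A minimal configuration sketch, assuming the app loads the pipeline roughly like this (it requires a GPU and a model download, so it is a sketch of the settings rather than a runnable test; note that if the ROCm/ZLUDA backend silently falls back from bf16 to fp32, memory roughly doubles, which would match the numbers above):

```python
import torch
from diffusers import StableDiffusionInpaintPipeline

# Model name taken from the diff earlier in this thread.
pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "benjamin-paine/stable-diffusion-v1-5-inpainting",
    torch_dtype=torch.bfloat16,  # may silently fall back to fp32 on some backends
)
pipe.enable_attention_slicing()    # compute attention in chunks instead of all at once
pipe.enable_vae_slicing()          # decode the VAE one slice at a time
pipe.enable_model_cpu_offload()    # keep idle submodules in system RAM (needs accelerate)
```

Whether `enable_model_cpu_offload` works correctly under ZLUDA on Windows is untested here; the attention/VAE slicing options are backend-agnostic and are the safest first step.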