Open greeksharifa opened 2 weeks ago
I use the image optimization to minimize GPU usage during inference. The code for the image optimization is as follows:
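from transformers import AutoProcessor

# Lower and upper bounds on the number of pixels per image after resizing
min_pixels = 256 * 28 * 28
max_pixels = 1280 * 28 * 28
processor = AutoProcessor.from_pretrained(
    "Qwen/Qwen2-VL-7B-Instruct", min_pixels=min_pixels, max_pixels=max_pixels
)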
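For reference, a rough sketch of what these two bounds work out to. It assumes, as the Qwen2-VL model card describes, that each 28 x 28 pixel region of the resized image becomes one visual token. The helper name `visual_token_budget` is hypothetical and only for illustration.

```python
from typing import Tuple

# Assumption: one visual token per 28x28 pixel region (per the Qwen2-VL model card).
PIXELS_PER_TOKEN = 28 * 28


def visual_token_budget(min_pixels: int, max_pixels: int) -> Tuple[int, int]:
    """Return the (min, max) visual tokens per image implied by the pixel bounds."""
    return min_pixels // PIXELS_PER_TOKEN, max_pixels // PIXELS_PER_TOKEN


min_pixels = 256 * 28 * 28   # 200,704 pixels
max_pixels = 1280 * 28 * 28  # 1,003,520 pixels
print(visual_token_budget(min_pixels, max_pixels))  # (256, 1280)
```

So with these settings each image should cost between 256 and 1280 visual tokens, which is nowhere near enough to explain an out-of-memory error by itself.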
If I run this code:
then an OutOfMemoryError occurs:
on 6 x A6000 GPUs.
The "Tried to allocate 538976288.13 GiB." figure in the error message is absurd.
P.S. The following code works on our GPU server (it runs on just 1 NVIDIA A6000).
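A side note on the allocation figure itself, separate from the working code mentioned in the P.S.: the number may not be random. The sketch below checks one guess, that the requested size in bytes is the repeating byte pattern 0x08 eight times over. If that guess is right, the size probably came from corrupted or uninitialized data rather than from a real tensor shape. This is only an arithmetic observation, not something the error log confirms, and the variable names are illustrative.

```python
# Hypothesis: the requested allocation size is eight identical 0x08 bytes.
suspect_bytes = int.from_bytes(b"\x08" * 8, "little")  # 0x0808080808080808

# Convert to GiB the way the error message reports it (two decimal places).
reported_gib = f"{suspect_bytes / 2**30:.2f}"

print(hex(suspect_bytes))  # 0x808080808080808
print(reported_gib)        # 538976288.13
```

The result matches the figure in the error message to two decimal places, which is consistent with a garbage size value being read somewhere in the multi-GPU path, though it does not prove where.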