Open DuckersMcQuack opened 1 year ago
Hey, sorry for the late reply. This is possible, although a bit unconventional due to the huge speed difference. You would want to have two models, one on the CPU and the other on the GPU, did I get that right?
If that's what it takes to get pooling on the CPU as well, sure! Got 64 GB of RAM, so plenty to take from :) I want to see for myself what "speed increase" that would achieve.
RAM wouldn't be the bottleneck here, though. What would happen is that you get the GPU model's output in a few seconds for, say, image_0, and then you're stuck waiting for the CPU model to finish computing image_1, image_2...
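A rough way to see why pooling barely helps: here is a back-of-envelope sketch (the throughput numbers are hypothetical placeholders, not benchmarks of any specific model or hardware). Even an ideal split proportional to device speed only adds the CPU's tiny share, while naively alternating images between the devices is paced by the slower one:

```python
# Back-of-envelope throughput estimates for pooling a fast GPU with a
# slow CPU. All rates are in images/second and are made-up examples.

def combined_throughput(gpu_ips: float, cpu_ips: float) -> float:
    """Best case: work is split proportionally to each device's speed,
    so both finish at the same time and throughputs simply add."""
    return gpu_ips + cpu_ips

def round_robin_throughput(gpu_ips: float, cpu_ips: float) -> float:
    """Naive case: alternate images between devices and wait for both.
    Each round produces 2 images but lasts as long as the slower device
    takes for its image, so the CPU paces the whole loop."""
    return 2.0 / max(1.0 / gpu_ips, 1.0 / cpu_ips)

# Hypothetical numbers: GPU at 10 img/s, CPU at 0.5 img/s.
gpu, cpu = 10.0, 0.5
print(f"GPU only:           {gpu:.2f} img/s")
print(f"ideal pooled:       {combined_throughput(gpu, cpu):.2f} img/s")   # 10.50
print(f"naive round-robin:  {round_robin_throughput(gpu, cpu):.2f} img/s") # 1.00
```

With these example rates the ideal pooled setup is only about 5% faster than the GPU alone, and the naive round-robin version is roughly 10x *slower*, which is exactly the "stuck waiting for the CPU" effect described above.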
What I would rather consider is adding an optimized model for CPU-only inference.
I plan to get another 3090, but if it's possible to "pool" CPU and GPU performance and allocate 24 GB of RAM, as well as a "shared same amount each", I'd love to try that! It might be dreadfully slow due to "potato CPU vs. GPU performance", but it's worth an experiment!
The CPU in question is a Ryzen 9 5900X.