Open FranckUltima opened 8 months ago
@FranckUltima thanks for raising this proposal. I think pixel capacity is a better measurement than max GPUs/sessions. Another thing we could consider is defining a "unit of work", like you have vCPU in cloud providers. I mention this because, with the Livepeer AI work in place, we may need to define some abstract unit of capacity.
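To illustrate the idea, here is a rough Go sketch of what such an abstract unit could look like. All of the type and method names are made up for this comment; nothing like this exists in go-livepeer today.

```go
package capacity

// WorkUnits is a hypothetical abstract capacity measure, analogous to vCPU
// in cloud providers: transcoding and AI jobs would both report their cost
// in the same unit.
type WorkUnits float64

// Job is anything the node can be asked to run (transcode segment, AI request, ...).
type Job interface {
	// Cost returns the estimated capacity this job consumes while active.
	Cost() WorkUnits
}

// Node tracks how much abstract capacity is available and in use.
type Node struct {
	Total WorkUnits
	used  WorkUnits
}

// TryAdmit reserves capacity for a job, or reports that the node is saturated.
func (n *Node) TryAdmit(j Job) bool {
	if n.used+j.Cost() > n.Total {
		return false
	}
	n.used += j.Cost()
	return true
}

// Release returns a finished job's capacity to the pool.
func (n *Node) Release(j Job) {
	n.used -= j.Cost()
}
```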
I think it should be configurable by decoded and encoded pixels as an OR rule, because GPUs have different numbers of encoder and decoder chips. If only a max encoded pixels flag is used, it might not be saturated before the decoded pixel capacity is reached, and your GPU crashes. By the way, decoded pixels could also be incorporated into the payment-for-work system.
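Roughly what I have in mind, as a Go sketch. The flag-like fields and type names here are placeholders, not actual go-livepeer flags; the point is only that decode and encode budgets are tracked separately and either one can reject a new stream.

```go
package capacity

// PixelBudget tracks decoded and encoded pixel rates separately, since NVDEC
// and NVENC capacity can saturate independently of each other.
type PixelBudget struct {
	MaxDecodedPixelsPerSec int64 // would come from a hypothetical max-decoded-pixels setting
	MaxEncodedPixelsPerSec int64 // would come from a hypothetical max-encoded-pixels setting

	curDecoded int64
	curEncoded int64
}

// CanAccept applies the OR rule: reject the stream if either the decode side
// or the encode side would go over its limit.
func (b *PixelBudget) CanAccept(decodedPPS, encodedPPS int64) bool {
	if b.curDecoded+decodedPPS > b.MaxDecodedPixelsPerSec {
		return false
	}
	if b.curEncoded+encodedPPS > b.MaxEncodedPixelsPerSec {
		return false
	}
	return true
}

// Add reserves the pixel rates for an accepted stream.
func (b *PixelBudget) Add(decodedPPS, encodedPPS int64) {
	b.curDecoded += decodedPPS
	b.curEncoded += encodedPPS
}

// Remove releases the pixel rates when the stream ends.
func (b *PixelBudget) Remove(decodedPPS, encodedPPS int64) {
	b.curDecoded -= decodedPPS
	b.curEncoded -= encodedPPS
}
```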
I think every O would really like to see an automated way to raise and lower maxSessions based on the available metrics. With the new client, GPUs are crashing because of the limitations of the maxSessions flag.
Currently, orchestrators must determine the encoding and decoding capabilities of their nodes themselves, based on the GPU type and a benchmark. However, it is extremely difficult to estimate how many streams an orchestrator can accept before falling out of real time.
Indeed, should we estimate the number of streams based on a 4K to 1080p/720p/480p/360p profile (30 or 60 fps), on a 1080p to 720p/480p profile, or even on a 720p to 480p profile? The number of streams a GPU can accept then varies considerably, and misjudging it can lead to a loss of real time and a drop in the quality of service for customers.
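To make the gap concrete, here is a back-of-the-envelope comparison of two job shapes in Go (the resolutions and frame rates are illustrative only):

```go
package main

import "fmt"

// Compare per-stream pixel rates for a heavy and a light transcoding job, to
// show why a single maxSessions value cannot fit both.
func main() {
	type rendition struct{ w, h, fps int }

	// 4K@60 source transcoded to a 1080p/720p/480p/360p ladder.
	heavyIn := rendition{3840, 2160, 60}
	heavyOut := []rendition{{1920, 1080, 60}, {1280, 720, 60}, {854, 480, 60}, {640, 360, 60}}

	// 720p@30 source transcoded to 480p only.
	lightIn := rendition{1280, 720, 30}
	lightOut := []rendition{{854, 480, 30}}

	pps := func(r rendition) int { return r.w * r.h * r.fps }
	sum := func(rs []rendition) int {
		t := 0
		for _, r := range rs {
			t += pps(r)
		}
		return t
	}

	fmt.Printf("heavy job: decode %d px/s, encode %d px/s\n", pps(heavyIn), sum(heavyOut))
	fmt.Printf("light job: decode %d px/s, encode %d px/s\n", pps(lightIn), sum(lightOut))
	// The heavy job moves roughly 18x more pixels per second than the light
	// one, on both the decode and the encode side, so the "right" maxSessions
	// for the same GPU differs by more than an order of magnitude.
}
```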
Could we instead consider defining GPU limits by the quantity of pixels processed? That way, if a new job would exceed the GPU's current capacity, it would be rejected.
I believe it would be more accurate to estimate the number of pixels a GPU can process (even if this also varies with the codec, etc.). This would let us make the best use of orchestrators' capabilities without risking falling out of real time and degrading the quality of service for customers.