Open yangcheng-dev opened 8 months ago
Has this problem already been solved?
After deploying the gpumanager project, there seems to be an issue with controlling GPU memory. I have set the pod quotas as follows, but the limits on GPU memory and computing power do not take effect. What's even more peculiar is that if I wait on the same pod for over 60 minutes, the restrictions are likely to take effect. There are no obvious errors in the gpumanager logs.

    resources:
      limits:
        tencent.com/vcuda-core: "50"
        tencent.com/vcuda-memory: "32"
      requests:
        tencent.com/vcuda-core: "50"
        tencent.com/vcuda-memory: "32"
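For anyone checking whether the limits are actually applied, it helps to know what the quota above should translate to. The sketch below assumes the unit sizes described in the gpu-manager README (one `tencent.com/vcuda-memory` unit is 256 MiB, and 100 `tencent.com/vcuda-core` units equal one full GPU); the constant and function names are made up for illustration, so please verify the units against the version you deployed.

```python
# Assumed unit sizes (from the gpu-manager README; verify for your version).
VCUDA_MEMORY_UNIT_MIB = 256      # one tencent.com/vcuda-memory unit, in MiB
VCUDA_CORE_UNITS_PER_GPU = 100   # 100 tencent.com/vcuda-core units = one full GPU


def expected_memory_mib(vcuda_memory: int) -> int:
    """Device memory (MiB) the container should be capped at."""
    return vcuda_memory * VCUDA_MEMORY_UNIT_MIB


def expected_core_fraction(vcuda_core: int) -> float:
    """Fraction of one GPU's compute the container should be capped at."""
    return vcuda_core / VCUDA_CORE_UNITS_PER_GPU


if __name__ == "__main__":
    # Values from the pod spec in this issue.
    print(expected_memory_mib(32))     # 8192 MiB, i.e. 8 GiB
    print(expected_core_fraction(50))  # 0.5, i.e. half a GPU
```

Under these assumptions, a process in the pod that can allocate well beyond 8 GiB of device memory would indicate the limit is not being enforced.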
Just remove the cuGetProcessAddress implementation; it causes this problem.