AliyunContainerService/gpushare-device-plugin
GPU Sharing Device Plugin for Kubernetes Cluster
Apache License 2.0
468 stars
144 forks
issues
ALIYUN_COM_GPU_MEM_IDX in the annotation is different than ALIYUN_COM_GPU_MEM_IDX inside the pod
#61
wokalski
closed
11 months ago
1
Can it take effect on the window node?
#60
RowgerGo
opened
1 year ago
0
When a node has multiple GPUs of different models (with different memory sizes), the first GPU detected is used as the reference
#59
SakuraAxy
opened
1 year ago
0
feat: upgrade go version && go mod
#58
swartz-k
opened
1 year ago
1
ResourceExhausted desc = grpc: received message larger than max (4986010 vs. 4194304)
#57
k0nstantinv
opened
1 year ago
0
trivy image scan lists critical and high vulnerability against latest image k8s-gpushare-plugin:v2-1.11-aff8a23
#56
carlwang87
opened
1 year ago
0
NVIDIA_VISIBLE_DEVICES wrong value in OCI spec
#55
k0nstantinv
opened
1 year ago
0
This program does not work on k8s 1.25
#54
blankxyz
opened
2 years ago
0
support for vgl
#52
umberto10
opened
2 years ago
0
containerd and nvidia-container-runtime instead of nvidia-docker2
#51
Frank-17
opened
2 years ago
2
set gpu index rather than gpu uuid for env NVIDIA_VISIBLE_DEVICES
#50
happy2048
closed
2 years ago
0
No Devices found. Waiting indefinitely.
#49
clennpillo
opened
2 years ago
1
exit the device plugin when creating failed
#48
happy2048
closed
2 years ago
0
The plugin can get the number of GPUs but not the GPU memory, so shared pods cannot be scheduled
#47
gxwangit
opened
3 years ago
2
skip to search pod when count of gpu devices is only one
#46
happy2048
closed
3 years ago
0
How to install on Mac?
#45
2811299
opened
3 years ago
0
fix kubelet pod annotation wrong
#44
RongbinZ
opened
3 years ago
0
Question: request gpushare on the same GPU
#43
xhejtman
opened
3 years ago
3
fix: modify the naming method to reduce the gRPC's request bytes
#42
qmloong
closed
3 years ago
2
fix #39: get pods from kubelet client rather than list cluster-scope pods from apiserver
#41
qmloong
closed
3 years ago
4
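Fix the defect of excessive list all-namespaces pods calls, and the name length under the MiB unit possibly causing gRPC calls to fail and resetting the node's gpumem resource to 0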
#40
qmloong
closed
3 years ago
2
Too many pods in the cluster may cause high cluster load and a cascading failure; in addition, the MiB unit may cause kubelet gRPC failures
#39
qmloong
closed
3 years ago
0
fix bug: update pod error when allocate
#38
tzzcfrank
closed
3 years ago
0
PLEASE GIVE US A BINARY
#37
mariusehr1
closed
3 years ago
0
GPU device not detected with nvidia driver > 430.XX
#36
ptonelli
closed
3 years ago
1
After a node restart, GPU memory was found to be oversubscribed
#35
zlingqu
opened
3 years ago
0
fix bug: display with multiple devices
#34
happy2048
closed
3 years ago
0
kubectl-inspect-gpushare supports gpushare2.0
#33
happy2048
closed
3 years ago
0
detect node label to disable cgpu
#32
happy2048
closed
4 years ago
0
Metrics
#31
dontmint
closed
4 years ago
0
GPU registered to Kubelet but not available in `kubectl inspect gpushare` and not schedulable when using --memory-unit=Mib
#30
dontmint
opened
4 years ago
0
Is getCandidatePods in Allocate redundant?
#29
malixian
opened
4 years ago
0
Unable to schedule pod with: Insufficient aliyun.com/gpu-mem
#28
k0nstantinv
closed
3 years ago
1
kubectl-inspect-gpushare: fatal with OpenID as auth-provider
#27
k0nstantinv
opened
4 years ago
0
Concurrently create sharegpu instance will cause creation to fail
#26
guunergooner
opened
4 years ago
0
some problem about auto allocate GPU card
#25
guobingithub
opened
4 years ago
1
How to guarantee the pod does not use more memory than allocation?
#24
hyc3z
opened
4 years ago
1
nvidia-container-cli: device error: unknown device id: no-gpu-has-256MiB-to-run\\\\n\\\"\"": unknown
#23
zhaogaolong
opened
4 years ago
3
Update api version
#22
danieltanyouzu
closed
4 years ago
1
Fix for kubernetes 1.17
#21
danieltanyouzu
closed
4 years ago
0
update base image
#20
happy2048
closed
4 years ago
0
No assume timestamp for pod tf-jupyter-... so it's not GPUSharedAssumed assumed pod.
#19
jear
closed
4 years ago
2
make missed error be handled
#18
mozhata
closed
4 years ago
1
device plugin failed to detect gpu info correctly
#17
pan87232494
closed
4 years ago
7
[Question] Can the pod selected by Device Plugin allocate differ from the one bound by Kubelet?
#16
orainxiong
opened
5 years ago
4
Automatically add a gpuType label to the node
#15
lunar-knights
opened
5 years ago
0
Add MPSUserGuide and integrate GPUShare with MPS
#14
Sakuralbj
opened
5 years ago
0
Add env and volume information in MPS situation.
#13
Sakuralbj
opened
5 years ago
1
question about pods with multiple containers
#12
monstercy
opened
5 years ago
3
Can't pull your docker image
#11
morecoffee101
opened
5 years ago
1