kubeflow
/
pytorch-operator
PyTorch on Kubernetes
Apache License 2.0
306
stars
143
forks
issues
feat(init_container): Add init container image CLI argument
#265
gaocegege
closed
4 years ago
9
Example PytorchJob is not starting
#264
natalytvinova
opened
4 years ago
8
kubeflow common dependency path to be updated
#263
igorvalko
opened
4 years ago
2
pin kubernetes client version to work around a bug
#262
jinchihe
closed
4 years ago
5
[feature] Support CLI argument for init container image
#261
gaocegege
closed
4 years ago
2
cleanPodPolicy Set to Running should clean Running pod
#260
xrmzju
opened
4 years ago
5
Kubernetes 1.6 support
#259
posix4e
closed
3 years ago
5
PyTorchJob worker pods crashloops in non-default namespace
#258
jobvarkey
opened
4 years ago
7
Updated the GPU compatible Docker building process with the Kubeflow…
#257
MATRIX4284
opened
4 years ago
8
Updated the Docker Image with the Latest one that uses GPU as of PR #255
#256
MATRIX4284
opened
4 years ago
3
Added The Pytorch GPU Docker under the appropriate folder
#255
MATRIX4284
closed
4 years ago
6
Link to CRD definition is broken
#254
sakaia
closed
3 years ago
4
fix: Add resource limits for init container
#253
gaocegege
closed
4 years ago
5
SDK supports getting PyTorchJob training process or logs
#252
jinchihe
closed
4 years ago
3
Copy third party vendor source code to Docker image
#251
johnugeorge
closed
4 years ago
4
Add third party license info
#250
johnugeorge
closed
4 years ago
3
Add licenses for dependencies in PyTorch Operator Image
#249
jlewi
closed
4 years ago
1
Added PyTorch CUDA Docker image, as the image pytorch/pytorch:1.0-cuda10.0-cudnn7-runtime does not have CUDA and so cannot use the GPU
#248
MATRIX4284
opened
4 years ago
8
published the pytorch docker as pytorch/pytorch:1.2-cuda10.0-cudnn7-runtime docker unable to use gpu
#247
MATRIX4284
closed
4 years ago
7
Add watch function for PyTorchJob python Client API
#246
jinchihe
closed
4 years ago
9
Pytorch Docker image pytorch/pytorch:1.2-cuda10.0-cudnn7-runtime does not have cuda so unable to use GPU
#245
MATRIX4284
opened
4 years ago
1
Updated the user steps necessary to run the out of the box
#244
MATRIX4284
closed
4 years ago
6
v1beta folder has been renamed to v1, so the path needs updating too
#243
MATRIX4284
opened
4 years ago
7
fix the reconcile flow
#242
ChanYiLin
closed
4 years ago
5
ConvertPyTorchJobToUnstructured uses function ToUnstructured to convert PyTorchJob to Unstructured instead of json
#241
leileiwan
closed
4 years ago
7
Add more APIs for SDK
#240
jinchihe
closed
4 years ago
12
Unstructured converted to Pytorch Job Anonymous field error when json uses inline mode
#239
leileiwan
closed
4 years ago
8
replace gopkg.in/yaml.v2 with github.com/kubernetes-sigs/yaml repo
#238
xrmzju
closed
4 years ago
11
GCP preemptible instances
#237
Nintorac
opened
4 years ago
4
add cpu/mem resource limit to worker init container causes unmarshal error
#236
xrmzju
closed
4 years ago
7
fix(*) rm work service in controller_test.go
#235
leileiwan
closed
4 years ago
7
Unstructured converted to Pytorch Job Anonymous field error when json uses inline mode
#234
leileiwan
closed
4 years ago
0
feat(deletePodsAndServices):only delete master service
#233
leileiwan
closed
4 years ago
7
It would be better to avoid deleting nonexistent worker services.
#232
leileiwan
closed
4 years ago
1
fix(job_test) test case should not include worker service
#231
leileiwan
closed
4 years ago
8
fix the error of Test case "TestDeletePodsAndServices"
#230
leileiwan
closed
4 years ago
10
Failed to set kubeflow in CI test.
#229
jinchihe
opened
4 years ago
0
Test case "TestDeletePodsAndServices" error
#228
leileiwan
closed
4 years ago
6
Generate Kubeflow PyTorchJob SDK
#227
jinchihe
closed
4 years ago
6
feat: Use golanglint
#226
gaocegege
closed
4 years ago
4
feat: Replace common with kubeflow/common
#225
gaocegege
closed
4 years ago
5
allocating master and work on different GPU nodes
#224
mengdong
closed
4 years ago
2
Update tf operator branch dep
#223
johnugeorge
closed
4 years ago
3
Removing v1beta2 support
#222
johnugeorge
closed
4 years ago
4
Removing unnecessary rbac permissions
#221
johnugeorge
closed
5 years ago
2
Avoiding unnecessary status update
#220
johnugeorge
closed
5 years ago
5
Right way to use pytorch-operator for multi-node multi-gpu setup
#219
lainisourgod
opened
5 years ago
13
add mnist example dockerfile for ppc64le
#218
zheddie
closed
5 years ago
12
Add mnist sample dockerfile for ppc64le
#217
zheddie
closed
5 years ago
19
Fix nslookup cannot work well in initContainerTemplate
#216
hougangliu
closed
5 years ago
4