tensorflow / ecosystem

Integration of TensorFlow with other open-source frameworks
Apache License 2.0

Add parameter server train & side-car eval on k8s #182

Open selcukgun opened 3 years ago

selcukgun commented 3 years ago

The variables of a ResNet56 model (trained with a custom training loop) are created on parameter server jobs and updated by workers. Evaluation runs in a dedicated job that reads the checkpoints saved during training (side-car evaluation).

The model is trained on the CIFAR10 dataset.
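
For readers unfamiliar with how parameter server jobs and workers find each other, here is a minimal sketch of the `TF_CONFIG` environment variable that TensorFlow reads to learn the cluster layout. It is not taken from this PR: the helper `build_tf_config`, the `resnet-cifar10` host prefix, and port `2222` are hypothetical, and it assumes one stable DNS name per Kubernetes pod. The side-car evaluator is assumed to stay outside the training cluster, since it only needs access to the checkpoint directory.

```python
import json


def build_tf_config(num_workers, num_ps, task_type, task_index,
                    name_prefix="resnet-cifar10", port=2222):
    """Build a TF_CONFIG JSON string for one task of a parameter server cluster.

    The host names are hypothetical; in a real deployment they would be the
    DNS names of the Kubernetes services or pods backing each task.
    """
    def hosts(job, count):
        # One "host:port" address per task of the given job.
        return [f"{name_prefix}-{job}-{i}:{port}" for i in range(count)]

    cluster = {
        "chief": hosts("chief", 1),         # coordinator that drives training
        "worker": hosts("worker", num_workers),
        "ps": hosts("ps", num_ps),          # parameter servers holding variables
    }
    task = {"type": task_type, "index": task_index}
    return json.dumps({"cluster": cluster, "task": task})


if __name__ == "__main__":
    # Example: the value a second worker pod would export as TF_CONFIG.
    print(build_tf_config(num_workers=2, num_ps=1,
                          task_type="worker", task_index=1))
```

Each pod would export its own value with a different `task` entry while sharing the same `cluster` mapping, so every job sees an identical view of the cluster.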