Closed NohaIhab closed 1 day ago
Thank you for reporting us your feedback!
The internal ticket has been created: https://warthogs.atlassian.net/browse/KF-5773.
This message was autogenerated
With a quick look over the Training Operator code it looks like the operator can be parameterised for images of different init containers: https://github.com/kubeflow/training-operator/blob/v1.7-branch/cmd/training-operator.v1/main.go#L89-L98
Although, they don't set any actual/usable value in the code. And also don't set these in the manifests https://github.com/kubeflow/manifests/tree/v1.9-branch/apps/training-operator/upstream/base
So it's not yet clear when someone might need to set those values. We'll need to confirm
we'll need to define a manual test because in order to do it programmatically in a notebook we'd have to pip install kubeflow-training
package since it is not pre-installed in the notebook server images. This we cannot do in airgapped. The other option is to create our own notebook image that includes the kubeflow-training
python package - I'd like to avoid adding another image for the team to maintain if we can have a manual test with simple instructions for now.
I tracked down the image used in the TFJob example in our UATs, it is using the input_data
function from the tensorflow.examples.tutorials.mnist
package, and the image has base tensorflow/tensorflow:1.11.0
.
Looking at the input_data
function source code, there is fake_data
boolean argument that is exposed in the mnist_with_summaries.py
module that generates some dummy values to test with.
We can:
fake_data
argument to avoid downloading datadata_dir
argument to copy the data into the airgapped env and using it from local instead of downloading it.I believe 1.
is sufficient as we don't need to test with real data, all we need to cover in this task is the training operator functionality, whether it is real or fake data should not affect this.
Context
Currently, there is no defined test for training operator in an airgapped environment. We need to define and document the testing process.
What needs to get done
Definition of Done
The testing process for training operator in airgapped is defined and documented.