Open rohin-dasari opened 3 years ago
NNI does not yet support setting the namespace in the configuration, but you can change the NNI source code to fit your requirement and build NNI manually.
Change the code here: https://github.com/microsoft/nni/blob/02eab99b9bb280e385a7af71b4c6c4a73bae1f04/ts/nni_manager/training_service/kubernetes/kubeflow/kubeflowTrainingService.ts#L373
Build and install NNI locally: https://github.com/microsoft/nni/blob/master/docs/en_US/Tutorial/InstallationLinux.rst#build-wheel-package-from-nni-source-code
Thanks for the response.
I made the change and rebuilt from source, but the error still persists. Is there another specific place where a code change is required? I looked around the nni/ts/nni_manager/training_service/kubernetes/kubeflow/ directory and found some other references to the default namespace in kubeflowApiClient.ts. I swapped those values out with our namespace, but that raised a new issue:
[2021-04-30 14:06:13] ERROR [ 'Error: 404 page not found\n\n at _request (/home/local/TECHNICALABS/rdasari/miniconda3/lib/python3.8/site-packages/nni_node/node_modules/kubernetes-client/lib/backends/request.js:189:25)\n at Request.request [as _callback] (/home/local/TECHNICALABS/rdasari/miniconda3/lib/python3.8/site-packages/nni_node/node_modules/kubernetes-client/lib/backends/request.js:148:14)\n at Request.self.callback (/home/local/TECHNICALABS/rdasari/miniconda3/lib/python3.8/site-packages/nni_node/node_modules/request/request.js:185:22)\n at Request.emit (events.js:198:13)\n at Request.
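The manual search for other hard-coded namespace references described above can be sketched as a small script. This is only an illustration: it assumes the namespace appears as a plain quoted string literal ('default' or "default") in .ts files, and the helper name find_default_literals is hypothetical, not part of NNI. Matches still need to be reviewed by hand, since not every "default" string is a namespace.

```python
import re
from pathlib import Path

# Matches 'default' or "default" as a quoted string literal.
# Assumption: the namespace is hard-coded as a plain string literal.
DEFAULT_LITERAL = re.compile(r"""(['"])default\1""")


def find_default_literals(root, suffix=".ts"):
    """Return (relative path, line number, stripped line) for each match under root."""
    root = Path(root)
    hits = []
    for path in sorted(root.rglob(f"*{suffix}")):
        text = path.read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            if DEFAULT_LITERAL.search(line):
                rel_path = path.relative_to(root).as_posix()
                hits.append((rel_path, number, line.strip()))
    return hits
```

Calling find_default_literals on the nni/ts/nni_manager/training_service/kubernetes/ directory of a local checkout would list every candidate line, so that all of them can be changed consistently before rebuilding.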
@rohin-dasari did you fix the issue?
Error: 404 page not found
This error seems to be caused by the Kubernetes environment. Could you double check whether you can run a Kubeflow job successfully without NNI?
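One way to narrow down a 404 like this is to check the request path itself. Kubernetes serves namespaced custom resources under /apis/{group}/{version}/namespaces/{namespace}/{plural}, so a 404 can come from a group, version, or plural that the cluster does not serve, not only from the namespace. The sketch below just builds that path; the group, version, and plural values are assumptions for illustration and must match whatever CRD the Kubeflow operator in the cluster actually registers.

```python
from urllib.parse import quote


def namespaced_resource_path(group, version, namespace, plural):
    """Build the REST path for a namespaced custom resource collection."""
    parts = ["apis", group, version, "namespaces", namespace, plural]
    # Percent-encode each segment so an unexpected character cannot alter the path.
    return "/" + "/".join(quote(part, safe="") for part in parts)


# Assumed values for illustration only; check the CRDs installed in the cluster.
print(namespaced_resource_path("kubeflow.org", "v1", "my-namespace", "tfjobs"))
```

The resulting path can be requested directly with kubectl get --raw <path>. If that also returns a 404, the problem is probably in the cluster setup (for example, the CRD version) and not in NNI.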
I was able to confirm that the namespace I am pointing NNI to does in fact exist. I will work on getting a Kubeflow job to run successfully on the namespace and get back to you.
@rohin-dasari Any updates? Thank you!
What issue meet, what's expected?: NNI can't start a Kubeflow training service in our Kubernetes cluster unless it runs in the "default" namespace. We would like to avoid giving NNI permissions on "default", since we have other tasks and jobs running in our Kubernetes cluster and want to keep the NNI optimization tasks as independent of them as possible. Is it possible to specify the namespace NNI uses when it runs using the Kubeflow plugin?