Closed praveen049 closed 4 years ago
Hi @praveen049
Could you attach here the tarball created by `microk8s.inspect`? Could you also share the logs you get with `microk8s.juju debug-log -n 2000`? @knkski, can you think of anything else that could help us figure out what might be wrong here?
@ktsakalozos
Attaching it here. inspection-report-20191024_054807.tar.gz
`microk8s.juju debug-log -n 2000` gives `ERROR Gateway Timeout`.
I am behind a proxy and no_proxy is set correctly.
@ktsakalozos
Any suggestions on how I can troubleshoot this issue?
Thanks
@praveen049: Are you able to run any `microk8s.juju` commands at all? Can you post `microk8s.juju status` if it runs successfully?
@knkski
I have tried a couple of commands, `microk8s.juju status` and `microk8s.juju users`, and they both hang.
@praveen049: Can you try `microk8s.juju status --debug` and see if you get any output? Otherwise, can you post the logs from the juju controller pod?
@knkski
Output of `microk8s.juju status --debug`:
(base) sims@kubeflow:~$ microk8s.juju status --debug
04:47:50 INFO juju.cmd supercommand.go:79 running juju [2.7-rc1 gc go1.10.4]
04:47:50 DEBUG juju.cmd supercommand.go:80 args: []string{"/var/snap/microk8s/946/bin/juju", "status", "--debug"}
04:47:50 INFO juju.juju api.go:67 connecting to API addresses: [10.152.183.246:17070]
The output of `kubectl describe pods -n controller-uk8s` is attached: juju_pod2.txt
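Since the client stalls while connecting to 10.152.183.246:17070 and a proxy is involved, one quick check is whether that address is actually listed in `no_proxy`. The following is only a minimal sketch: the helper name `in_no_proxy` is hypothetical, it assumes `no_proxy` is a comma-separated list, and it matches exact entries only (it does not interpret CIDR ranges or wildcards, which many tools do not honor either).

```shell
# Hypothetical helper: succeeds if $1 is an exact entry in the
# comma-separated list given as $2. No CIDR or wildcard handling.
in_no_proxy() {
    case ",$2," in
        *",$1,"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Check the API address from the debug output against the current
# environment (falls back to NO_PROXY, then to an empty list).
if in_no_proxy 10.152.183.246 "${no_proxy:-${NO_PROXY:-}}"; then
    echo "10.152.183.246 is listed in no_proxy"
else
    echo "10.152.183.246 is NOT listed in no_proxy"
fi
```

If the address is reported as not listed, requests to the controller may still be going through the proxy even when `no_proxy` looks correct at first glance.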
@praveen049: Can you also post the logs from that pod? It looks like it's running normally.
@praveen049: If nothing else, can you try `microk8s.disable kubeflow`, or `microk8s.juju unregister -y uk8s` if that doesn't work, and then try `microk8s.enable kubeflow` again?
@knkski
I have reinstalled microk8s from the `1.16/edge/kubeflow` channel. Previously it was installed with `1.14/stable` and then switched to the `1.16/edge/kubeflow` channel.
Attached are the logs from the mongodb and api-server pods: apiserver.txt mongodb.txt
This time `microk8s.enable kubeflow` returns the error below:
(base) sims@trainer:~$ microk8s.enable kubeflow
Enabling dns...
Enabling storage...
Enabling dashboard...
Enabling juju...
Deploying Kubeflow...
Creating Juju controller "uk8s" on microk8s/localhost
Creating k8s resources for controller "controller-uk8s"
Downloading images
Starting controller pod
Bootstrap agent now started
Contacting Juju controller at 10.152.183.89 to verify accessibility...
ERROR unable to contact api server after 1 attempts: Gateway Timeout
Command '('microk8s-juju.wrapper', 'bootstrap', 'microk8s', 'uk8s')' returned non-zero exit status 1
Failed to enable kubeflow
Trying the disable and unregister commands does not help either:
(base) sims@kubeflow:~$ microk8s.disable kubeflow
ERROR controller uk8s not found
Command '('microk8s-juju.wrapper', 'destroy-controller', '-y', 'uk8s', '--destroy-all-models', '--destroy-storage')' returned non-zero exit status 1
Failed to disable kubeflow
(base) sims@kubeflow:~$ microk8s.juju unregister -y uk8s
ERROR controller uk8s not found
@praveen049: Could you try updating the snap (`sudo snap refresh microk8s`), and running `KUBEFLOW_DEBUG=true microk8s.enable kubeflow`? That will add in the `--debug` flag to Juju, which should help diagnose what's going on here.
@knkski Attached is the log with the debug option. juju_status_error_debug.txt
@knkski Any pointers on how to troubleshoot and fix this issue ?
Thanks
@praveen049: Sorry about the wait. It looks like you've got a proxy issue. Can you either try it without the proxy involved, or post the output from this command?
microk8s.juju --debug bootstrap microk8s --config juju-no-proxy=10.0.0.1
@knkski thank you for the feedback.
Attached is the output juju-noproxy.txt
the commands i used:
KUBEFLOW_DEBUG=true microk8s.enable kubeflow
This fails as before and then
microk8s.juju --debug bootstrap microk8s --config juju-no-proxy=10.0.0.1
Thanks
@praveen049: It looks like the manual bootstrap command worked for you, so I've added in the flag that should fix things for you in PR #785.
@knkski thank you for the fix.
So, do I need to deploy again from the channel and enable Kubeflow with the commands below?
sudo snap install microk8s --classic --channel 1.16/edge/kubeflow
microk8s.enable kubeflow
@knkski Based on the discussion on the thread for PR 785, it seems the fix was not merged. Is there any other solution or workaround to get it working ?
Thanks
installing from 1.16/edge/kubeflow channel worked for me 2 days ago
sudo snap install microk8s --classic --channel 1.16/edge/kubeflow
microk8s.enable kubeflow
but now I am getting this error:
KUBEFLOW_DEBUG=true sudo microk8s.enable kubeflow
Enabling dns...
Enabling storage...
Enabling dashboard...
Enabling rbac...
Enabling juju...
Deploying Kubeflow...
Located bundle "cs:bundle/kubeflow-134"
ERROR cannot deploy bundle: the provided bundle has the following errors:
empty charm path
invalid charm URL in application "ambassador-auth": cannot parse URL "": name "" not valid
Command '('microk8s-juju.wrapper', 'deploy', 'kubeflow', '--channel', 'stable', '--overlay', '/tmp/tmpnmhsn4l0')' returned non-zero exit status 1
Failed to enable kubeflow
@charlesa101: Apologies, can you run `sudo snap switch microk8s --channel edge && sudo snap refresh`? I think that particular channel is no longer getting updated and will disappear due to the feature getting merged into master.
@knkski I have now tried with the new channel and the proxy issue seems to be resolved. Thanks for that.
I am now running into a different error:
Resolving charm: cs:~kubeflow-charmers/seldon-cluster-manager-47
Resolving charm: cs:~kubeflow-charmers/tensorboard-46
Resolving charm: cs:~kubeflow-charmers/tf-job-dashboard-48
Resolving charm: cs:~kubeflow-charmers/tf-job-operator-46
ERROR cannot deploy bundle: cannot add charm "cs:~kubeflow-charmers/ambassador-47": cannot retrieve charm "cs:~kubeflow-charmers/ambassador-47": cannot get archive: Get https://api.jujucharms.com/charmstore/v5/~kubeflow-charmers/ambassador-47/archive?channel=stable: dial tcp: lookup api.jujucharms.com on 10.152.183.10:53: server misbehaving
Command '('microk8s-juju.wrapper', 'deploy', 'kubeflow', '--channel', 'stable', '--overlay', '/tmp/tmpu9t_xu6d')' returned non-zero exit status 1
Failed to enable kubeflow
Attached is the full logs microk8s-19Nov.txt
Any pointers on how to resolve this ?
Thanks
@charlesa101 are you able to deploy Kubeflow from the edge channel ?
@knkski
Any suggestions on how to troubleshoot this issue?
@praveen049 Yes, I was able to get this running from the edge channel, but before that I had to clean up my snap directory.
Running `sudo snap switch microk8s --channel edge && sudo snap refresh`, like @knkski said, and then `microk8s enable storage dns rbac juju kubeflow` did it for me.
@charlesa101 thanks for the info
But these commands are not working for me: deploying Kubeflow fails with the error below.
ERROR cannot deploy bundle: cannot add charm "cs:~kubeflow-charmers/ambassador-47": cannot retrieve charm "cs:~kubeflow-charmers/ambassador-47": cannot get archive: Get https://api.jujucharms.com/charmstore/v5/~kubeflow-charmers/ambassador-47/archive?channel=stable: dial tcp: lookup api.jujucharms.com on 10.152.183.10:53: server misbehaving
Command '('microk8s-juju.wrapper', 'deploy', 'kubeflow', '--channel', 'stable', '--overlay', '/tmp/tmpu9t_xu6d')' returned non-zero exit status 1
Failed to enable kubeflow
I am running behind a proxy, and this seems to be an issue related to that.
@praveen049: Yeah, that could definitely be a proxy issue. @ktsakalozos, do you know how we should handle that?
@praveen049: Can you post the output from `KUBEFLOW_DEBUG=true microk8s.enable kubeflow`? That should output some more useful information.
@knkski
Attaching the debug output microk8s-debug-27Nov.txt
@wallyworld, how would we put the pods subnet in `no-proxy`? Looking at https://discourse.jujucharms.com/t/configuring-models/1151, `no-proxy` does not take CIDR notation, so it cannot fit a /16 network. What about `juju-no-proxy`? Could we use that one?
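To illustrate why a setting without CIDR support is awkward here: the only way to cover a subnet is to list every address explicitly. The sketch below does that for a /24; the helper name `expand_slash24` is hypothetical, and the `10.152.183` prefix is simply taken from the addresses seen earlier in this thread, not from any confirmed cluster configuration.

```shell
# Hypothetical helper: print hosts .1 through .254 of a /24 as a
# comma-separated list, given the first three octets as $1.
expand_slash24() {
    prefix=$1
    list=""
    i=1
    while [ "$i" -le 254 ]; do
        # Append with a comma separator, except for the first entry.
        list="${list:+$list,}$prefix.$i"
        i=$((i + 1))
    done
    printf '%s\n' "$list"
}

# Example: build a list that could be appended to a no-proxy value.
service_ips=$(expand_slash24 10.152.183)
echo "first entries: $(printf '%s' "$service_ips" | cut -d, -f1-3)"
```

A /24 is still manageable this way, but a /16 would need more than 65,000 entries, which is why a list-only setting cannot realistically hold the pods subnet.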
@ktsakalozos @wallyworld @knkski
Hi, Any suggestions on how to address this proxy issue ?
Thanks
I think juju-no-proxy may work, but in practice it can be hit and miss depending on the environment in which stuff is running.
Just wanted to add, in case anybody got here from Google, that in my case the problem was that I had a folder named `kubeflow` in my home directory from a previous installation (now on 1.17/stable), and the juju command was therefore ambiguous between cs:kubeflow and my local folder. I found this by setting `KUBEFLOW_DEBUG=true` and seeing this message:
/build/juju/parts/juju/go/src/github.com/juju/juju/cmd/juju/application/deploy.go:1340: The charm or bundle "kubeflow" is ambiguous.
So I just changed to a different directory before running the command and that fixed it. Kubeflow then deployed perfectly.
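A small pre-flight check along these lines can catch that situation before enabling Kubeflow. This is only a sketch: the helper name `check_bundle_name_clash` is hypothetical, and it assumes the clash comes from an entry named like the bundle in the current working directory, as described above.

```shell
# Hypothetical helper: fail if the current directory contains an entry
# with the same name as the bundle, which Juju may treat as a local path.
check_bundle_name_clash() {
    name=$1
    if [ -e "./$name" ]; then
        echo "clash: ./$name exists; run the command from another directory"
        return 1
    fi
    echo "ok: no local '$name' in $(pwd)"
    return 0
}

if check_bundle_name_clash kubeflow; then
    echo "no clash found; microk8s.enable kubeflow can be run from here"
fi
```

If a clash is reported, changing to an empty directory (for example one created with `mktemp -d`) before running the enable command avoids the ambiguity without deleting the old folder.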
microk8s_kubeflow.txt
Please run `microk8s.inspect` and attach the generated tarball to this issue. We appreciate your feedback. Thank you for using microk8s.
I am using the `1.16/edge/kubeflow` channel, and when I try `microk8s.enable kubeflow` the command hangs at this step.
Any suggestion on how I can troubleshoot this?