Closed Kelukin closed 2 years ago
I presume you installed the 3.9 version of the odpi-egeria-lab chart ie the latest one:
➜ ~ helm search repo lab
NAME CHART VERSION APP VERSION DESCRIPTION
egeria/odpi-egeria-lab 3.9.0 3.9 Egeria lab environment
Did this work as-is?
You also mentioned you'd updated Jupyter to 3.10-SNAPSHOT. This image is based on a core Jupyter image upon which we've added our notebooks - nothing else.
As such it would probably work just fine, unless there are changes in egeria that make it incompatible. That's very unlikely, though it's not something we'd test.
If you're having issues in reaching the egeria platforms (which is what the errors you are seeing suggest), the first thing I'd check is that those pods are running ie with:
kubectl get pods
kubectl get services
If they are not you could also try
kubectl describe pod <podid>
It may be the pod is not starting due to an error - could be related to security for example, or resource constraint.
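If there are a lot of pods, a small script can pick out the ones that need a closer look. This is only a rough sketch: it assumes the default column layout of `kubectl get pods` (NAME, READY, STATUS, ...), and the pod names in the sample are invented for illustration, not taken from the chart.

```python
def pods_not_running(kubectl_output: str) -> list[tuple[str, str]]:
    """Return (name, status) for every pod whose STATUS is not 'Running'."""
    problems = []
    for line in kubectl_output.strip().splitlines():
        fields = line.split()
        # Skip the header row and anything too short to be a pod row.
        if len(fields) < 3 or fields[0] == "NAME":
            continue
        name, status = fields[0], fields[2]
        if status != "Running":
            problems.append((name, status))
    return problems


# Hypothetical output: these pod names are made up for the example.
sample = """\
NAME            READY   STATUS             RESTARTS   AGE
lab-jupyter-0   1/1     Running            0          5m
lab-core-0      0/1     CrashLoopBackOff   4          5m
"""

print(pods_not_running(sample))
```

Anything it reports is a candidate for `kubectl describe pod`. Note that it also flags pods in a `Completed` state, which may be perfectly fine for jobs.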
What is your k8s host environment?
In my first reply I missed one of your points - presuming you had modified your values.
As you point out, the values.yaml for that chart does indeed use 'latest' for the tag of the jupyter image.
I think we should change that back to be aligned to the release - I see it as a bug and will fix.
You can override the value locally via
helm install lab egeria/odpi-egeria-lab --set-string image.jupyter.tag=3.9
I'll check this out tomorrow, though I expect the reason your environment isn't working is different
Thank you, @planetf1, for your quick reply. When I override the jupyter tag to the 3.9, it works without the SSL Error issue.
When I leave the Jupyter tag at its default value, latest, the SSL Error issue comes back. I deployed the lab chart in Azure Kubernetes Service.
It is quite strange, since all the services are in the running status. Besides, this failed deployment and the successful deployment above with a specific Jupyter tag both happened in the same AKS cluster.
I've fixed the released charts & published 3.9.1 to correct the errors - thanks for reporting.
By using latest for the jupyter image only, I can reproduce the issue as originally reported.
To install with our current development code:
helm install lab egeria/odpi-egeria-lab --set-string egeria.version=3.10-SNAPSHOT
This also fails - even though we should be using a consistent set. I couldn't see any change in our certificates - I wonder if something is different in the container environment.
Our jupyter image is based on docker.io/jupyter/base-notebook:latest. The main change at https://github.com/jupyter/docker-stacks/commits/master/base-notebook is a python version bump
I'll take a look at our notebooks to fix....
FYI the medium-term plan is to:
Using the 'master' version of the containers with the latest charts now works ok ie:
helm install lab egeria/odpi-egeria-lab --set-string egeria.version=3.10-SNAPSHOT
This follows the PR above, which pinned the Jupyter version to the one we previously used - ie before some ssl and python library changes.
Will revisit this when we refactor the notebooks & get rid of our customized container (hopefully) - within next few months.
Thanks for the report. I think all the changes are made now so will close.
If you get any further issues on Azure let us know.
Thank you, @planetf1! Everything looks fine now.
I'm going to re-open this, as this problem still occurs when trying to run the notebooks locally.
However any fix will be rolled into the proposed changes to migrate notebooks to their own repository, and use a stock image
We should also ensure docs on running locally are updated at the same time
Re-opening and moving to base (since the charts are no longer affected since a workaround has been implemented)
Been doing some research and haven't yet found what changed - ultimately, the call seems to use the requests package https://requests.readthedocs.io/en/latest/user/advanced/#ssl-cert-verification which in turn uses the urllib3 package https://urllib3.readthedocs.io/en/stable/user-guide.html
There are some hints that the package certifi is used and that it changes often. There are additional hints that the .pem file needed for validation must contain the full chain of certs from root to intermediate to local.
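One quick way to test the "full chain" theory is to count how many certificates a .pem file actually contains. The snippet below is a rough sketch using only the standard library; the bundle contents are placeholders rather than real certificates, and the function name is made up for this example. It counts PEM blocks only and does not validate the chain or its order.

```python
import re

# Matches one PEM-encoded certificate block, including its body.
PEM_CERT = re.compile(
    r"-----BEGIN CERTIFICATE-----.+?-----END CERTIFICATE-----", re.DOTALL
)


def count_pem_certificates(pem_text: str) -> int:
    """Count the certificate blocks found in the text of a .pem file."""
    return len(PEM_CERT.findall(pem_text))


# Placeholder bundle: the bodies are dummy text, not real certificates.
bundle = (
    "-----BEGIN CERTIFICATE-----\nAAAA\n-----END CERTIFICATE-----\n"
    "-----BEGIN CERTIFICATE-----\nBBBB\n-----END CERTIFICATE-----\n"
)

print(count_pem_certificates(bundle))
```

A count of 1 on the file used for verification would suggest that only the leaf certificate is present and the intermediate and root certificates are missing.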
As part of the refactoring I plan to no longer build a special container, but rather use the standard Jupyter containers, which will retrieve our notebooks via an init script (supported by Jupyter container) or if that fails, a k8s init container.
once that is done will address any issues with ssl certs - leaving open until then
The refactoring is mostly done. Unfortunately I wasn't able to get the certificates sorted in the time available - in part because we need to create certs at deployment time.
Nor was it easy to control the behaviour within the environment, in part due to changes in the requests module and python. There's a capability to create a session - and manage context that way which was appealing, but also too much refactoring for now.
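For reference, the session approach would look roughly like the sketch below. This is not what the notebooks do today; `make_session` and the file path are hypothetical names for illustration. It relies on the `verify` attribute of `requests.Session`, which accepts either a boolean or the path to a CA bundle.

```python
from typing import Union

import requests


def make_session(verify: Union[str, bool] = True) -> requests.Session:
    """Build a session whose certificate handling is set in one place.

    verify may be True (default CA bundle), False (no certificate checks),
    or the path to a .pem file holding the CA chain to trust.
    """
    session = requests.Session()
    session.verify = verify
    return session


# Hypothetical path: wherever the deployment places the generated CA chain.
session = make_session("/path/to/ca-chain.pem")

# Calls made through the session then share that setting, e.g.:
# response = session.get(platform_url)  # platform_url is a placeholder
```

The appeal is that certificate handling lives in one place instead of being repeated on every call; the cost is that every call in the notebooks has to be switched over to go through the session.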
Therefore the first pass (along with moving the code) has disabled cert checking directly in calls to the requests module
As such I think this issue is now addressed.
When I deployed a lab chart and ran common/environment-check.ipynb in the Jupyter, I triggered the following Exceptions:
I am unsure whether the cause is a mismatch between the Jupyter and Egeria versions, since the chart sets the Jupyter tag to latest and Jupyter has just been upgraded to 3.10-SNAPSHOT.