anaconda / ae5-tools

A command-line tool for scripting AE5 actions
https://www.anaconda.com/enterprise/
BSD 3-Clause "New" or "Revised" License

add namespace support #187

Closed mcg1969 closed 2 months ago

mcg1969 commented 4 months ago

Adds support for AE5 installed in non-default namespaces.

Mechanisms for supplying the namespace:

jlstevens commented 4 months ago

Looks good!

I am happy to do some testing with a cloud-based AE5 customer to try this out. Are there any other tasks you would like completed before review/merge?

mcg1969 commented 4 months ago

@jlstevens you are the best judge of whether the functionality we need is preserved.

jlstevens commented 4 months ago

I was testing this PR on a customer's cluster and tried ae5 node list, which resulted in a 500 Internal Server Error.

Checking the k8s deployment logs, I can see that the code from this PR was running (the line "Namespace supplied in file: /var/run/secrets/kubernetes.io/serviceaccount/namespace" is printed), but it is followed by this error:

2024-05-13T14:08:44.183319099Z The project is ready to run commands.
2024-05-13T14:08:44.183366300Z Use `anaconda-project list-commands` to see what's available.
2024-05-13T14:08:44.223326046Z 35.72user 9.22system 2:43.48elapsed 27%CPU (0avgtext+0avgdata 551760maxresident)k
2024-05-13T14:08:44.223366746Z 383680inputs+473200outputs (17major+203122minor)pagefaults 0swaps
2024-05-13T14:08:44.362170450Z default is not a notebook command, not adding nb_runonly
2024-05-13T14:08:45.984111722Z $ /opt/continuum/anaconda/bin/conda info --json
2024-05-13T14:08:46.615838420Z $ /opt/continuum/anaconda/bin/conda env config vars list -p /opt/continuum/.conda/envs/k8s_server --json
2024-05-13T14:08:47.204928348Z $ /bin/sh -c python -m ae5_tools.k8s.server
2024-05-13T14:08:51.324600406Z API url default value used
2024-05-13T14:08:51.324641207Z API token supplied in file: /var/run/secrets/kubernetes.io/serviceaccount/token
2024-05-13T14:08:51.324872909Z Namespace supplied in file: /var/run/secrets/kubernetes.io/serviceaccount/namespace
2024-05-13T14:08:51.324924809Z API url: https://kubernetes.default/
2024-05-13T14:08:51.324947309Z API token: found
2024-05-13T14:08:51.324956509Z Namespace: anaconda-enterprise
2024-05-13T14:08:51.486757218Z ======== Running on http://0.0.0.0:8086 ========
2024-05-13T14:08:51.486801618Z (Press CTRL+C to quit)
2024-05-13T14:09:25.739792891Z Error handling request
2024-05-13T14:09:25.739821091Z Traceback (most recent call last):
2024-05-13T14:09:25.739826191Z   File "/opt/continuum/.conda/envs/k8s_server/lib/python3.11/site-packages/aiohttp/web_protocol.py", line 452, in _handle_request
2024-05-13T14:09:25.739829691Z     resp = await request_handler(request)
2024-05-13T14:09:25.739832791Z            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-05-13T14:09:25.739836891Z   File "/opt/continuum/.conda/envs/k8s_server/lib/python3.11/site-packages/aiohttp/web_app.py", line 543, in _handle
2024-05-13T14:09:25.739840391Z     resp = await handler(request)
2024-05-13T14:09:25.739843791Z            ^^^^^^^^^^^^^^^^^^^^^^
2024-05-13T14:09:25.739847891Z   File "/opt/continuum/project/ae5_tools/k8s/server.py", line 70, in nodeinfo
2024-05-13T14:09:25.739851292Z     result = await self.xfrm.node_info()
2024-05-13T14:09:25.739854392Z              ^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-05-13T14:09:25.739857892Z   File "/opt/continuum/project/ae5_tools/k8s/transformer.py", line 388, in node_info
2024-05-13T14:09:25.739860992Z     resp1, resp2, resp3 = await asyncio.gather(resp1, resp2, resp3)
2024-05-13T14:09:25.739864192Z                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-05-13T14:09:25.739867392Z   File "/opt/continuum/project/ae5_tools/k8s/transformer.py", line 232, in get
2024-05-13T14:09:25.739871192Z     resp.raise_for_status()
2024-05-13T14:09:25.739874492Z   File "/opt/continuum/.conda/envs/k8s_server/lib/python3.11/site-packages/aiohttp/client_reqrep.py", line 1070, in raise_for_status
2024-05-13T14:09:25.739877592Z     raise ClientResponseError(
2024-05-13T14:09:25.739880792Z aiohttp.client_exceptions.ClientResponseError: 403, message='Forbidden', url=URL('https://kubernetes.default/api/v1/nodes')

mcg1969 commented 4 months ago

This is a permissions issue. It would seem that the service account does not have the necessary permissions to hit that endpoint.

What should we do about this case? I agree a 500 is not the right result. Forwarding the 403 along is likely better.
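One way to forward the upstream status instead of returning a bare 500 is sketched below. It is a standard-library-only illustration: `UpstreamError` is a hypothetical stand-in that carries the same `status` and `message` attributes as aiohttp's `ClientResponseError`, and `response_for` and `PASS_THROUGH` are made-up names, not part of ae5_tools.

```python
# Upstream statuses that are meaningful to the caller and can be passed through.
# The exact set is an assumption for this sketch.
PASS_THROUGH = {401, 403, 404}


class UpstreamError(Exception):
    """Hypothetical stand-in for aiohttp's ClientResponseError."""

    def __init__(self, status, message):
        super().__init__(f"{status}, message={message!r}")
        self.status = status
        self.message = message


def response_for(exc):
    """Map an upstream API error to a (status, payload) pair for the client."""
    if exc.status in PASS_THROUGH:
        # Forward the upstream status unchanged, e.g. 403 Forbidden.
        return exc.status, {"error": exc.message, "upstream_status": exc.status}
    # Any other upstream failure is reported as a bad gateway.
    return 502, {"error": "upstream request failed", "upstream_status": exc.status}
```

In a request handler, the call to the Kubernetes API would be wrapped in a try/except, and the pair returned here would be turned into the HTTP response.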

jlstevens commented 4 months ago

Ideally, I think ae5-tools should return a friendly message to the user explaining that the default credentials accessible to the k8s deployment have insufficient permissions. The message should also suggest the appropriate fix (presumably asking for a k8s_token secret to be supplied with sufficiently elevated permissions?)
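A rough sketch of such a message is shown below. `permission_hint` is a hypothetical helper, and the suggested fix in its text is the presumed one from this comment, not a confirmed procedure.

```python
def permission_hint(status, url):
    """Return a user-facing hint for a 403 from the Kubernetes API, else None."""
    if status != 403:
        return None
    return (
        f"The Kubernetes API refused the request to {url} (403 Forbidden). "
        "The credentials available to the k8s deployment appear to lack the "
        "required permissions. A possible fix (an assumption in this sketch) is "
        "to supply a k8s_token secret with sufficiently elevated permissions."
    )


# Example using the URL from the traceback above.
print(permission_hint(403, "https://kubernetes.default/api/v1/nodes"))
```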

mcg1969 commented 2 months ago

@jlstevens @joshburt I think we should get this in. Further improvements can be handled in separate tickets and PRs.

jlstevens commented 2 months ago

Thanks for getting this in!

Would it be possible to cut a dev release soon that I can test with one of our customers?

mcg1969 commented 2 months ago

@jlstevens Josh is tagging 0.6.9