lensapp / lens

Lens - The way the world runs Kubernetes
https://k8slens.dev/
MIT License

Auto-refresh is not working #3821

Closed pblim closed 2 years ago

pblim commented 3 years ago

Describe the bug When I remove a pod, the pod list is not refreshed. To see the recreated pod I have to click/change tabs in the UI.

To Reproduce

  1. click Pods tab
  2. Remove Pod

Expected behavior The list automatically refreshes to show the newly created pod.

Screenshots Screenshot 2021-09-16 092146


Nokel81 commented 3 years ago

We have tried to fix this over the last few patch releases, does 5.2.4 resolve this for you?

gonzalezjp commented 3 years ago

Lens 5.2.4-latest.20210923.1 does not resolve it; it is still not auto-refreshing.

pblim commented 3 years ago

We have tried to fix this over the last few patch releases, does 5.2.4 resolve this for you?

It does not. It's still not auto-refreshing.

pblim commented 3 years ago

The latest version where it works for me is 4.1.0

Nokel81 commented 3 years ago

Do you have "watch" permissions?

pblim commented 3 years ago

Yes, I do. I have a cluster-owner role.
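
For anyone who wants to verify that independently of Lens, a SelfSubjectAccessReview asks the API server whether the current kubeconfig user is allowed to "watch" pods. A minimal sketch, assuming the @kubernetes/client-node package and its 0.x-style response shape (neither of which is used anywhere in this thread):

// Sketch: ask the API server whether the current kubeconfig user can
// "watch" pods in a namespace. The package choice and the { response, body }
// return shape are assumptions for this example, not part of this issue.
import * as k8s from "@kubernetes/client-node";

async function canWatchPods(namespace: string): Promise<boolean> {
  const kc = new k8s.KubeConfig();
  kc.loadFromDefault(); // the same KUBECONFIG / ~/.kube/config Lens would use

  const authApi = kc.makeApiClient(k8s.AuthorizationV1Api);
  const review: k8s.V1SelfSubjectAccessReview = {
    spec: {
      resourceAttributes: { verb: "watch", resource: "pods", namespace },
    },
  };

  const { body } = await authApi.createSelfSubjectAccessReview(review);

  return body.status?.allowed ?? false;
}

canWatchPods("default").then((allowed) =>
  console.log(`watch pods allowed: ${allowed}`),
);

From the command line, kubectl auth can-i watch pods -n <namespace> answers the same question.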

Nokel81 commented 2 years ago

Would we be able to get some more logs from the devtools?

zswanson commented 2 years ago

Same in 5.3.4, refresh has been broken ever since I upgraded past 4.1.0

I can see in the devtools that it sets up watches on pods etc., but we still see 'stale' info frequently.

Nokel81 commented 2 years ago

@zswanson What kubernetes setup are you using?

zswanson commented 2 years ago

AWS EKS 1.21, private subnets for the cluster API and SSO'd authentication

pblim commented 2 years ago

I moved to macOS and installed the latest version, but it still does not work. I installed 4.1.1 and it works. All my colleagues are facing this issue; my entire team is using version 4.1.1.

AWS EKS 1.19

Nokel81 commented 2 years ago

@pblim Do you know what the latest version that works for you and your team is, so that we can try to bisect the issue? For instance, do any of the 4.2 series releases work for you?

alanbchristie commented 2 years ago

I thought I'd contribute my experiences here.

The ability of Lens to automatically refresh its view of the world broke a long time ago but, sadly, I've just "lived with it" - partly because I'm busy with lots of other things and couldn't spend enough time describing the issue, and partly because I thought "someone else must be seeing this". Recent efforts debugging multiple dynamic objects brought the problem into sharp focus again - I find it just unusable and have to constantly refresh everything by navigating back and forth (see below). Now I'm starting to use k9s, which doesn't appear to suffer from any update issues.

Essentially Lens does not appear to track any object changes. That's for me, and for my colleagues who also use it on the same clusters. It won't spot new Namespaces, Pods, pretty much anything. Instead you have to keep switching between object types or, on rare occasions, switch between clusters or just restart Lens.

For example ... if you want an up-to-date (accurate) view of Pods, navigate away to, say, Deployments, and then click back to view the Pods. Then your view is a little better. But you have to keep doing this: once the Pod is gone you have to navigate back and forth again to see that it's gone.


I don't know when it broke, so I hesitate to identify a specific revision. For me it broke a long time ago - maybe post-v4.1.1, as others have reported?

For many recent revisions it just won't track objects.

I'm using: -

Lens: 5.4.5-latest.20220405.1
Electron: 14.2.4
Chrome: 93.0.4577.82
Node: 14.17.0

I'm on macOS 11.6.5, although I doubt that the fact I'm using a Mac has anything to do with the problem, because my colleagues use other variants of Unix.

One hint that it's likely to be a Lens-specific issue is that when the same kubeconfig is used for k9s, k9s has no trouble updating its view of the world. When new Pods appear in the cluster, they appear in k9s. When they disappear from the cluster they disappear from k9s.

Sadly I've started to use k9s more frequently now, as the inability of Lens to update sort of makes it unfit for purpose for me.

If you need any logs or diagnostics just ask - I just assumed more people must see this and someone would be looking at it.


If it's any consolation - I still use Lens - but only on clusters/Namespaces where I know objects are not being created or destroyed.

leenamba commented 2 years ago

Thank you all for raising this issue and providing debugging information. This is core functionality and we have put the highest priority on fixing this issue!

Nokel81 commented 2 years ago

@alanbchristie Hello, would we be able to get some more information about your setup to help us diagnose the problem?

alanbchristie commented 2 years ago

Versions We are using EKS and several on-prem OpenStack-based clusters that were deployed using Rancher.

Kubeconfig We have KUBECONFIG files for each cluster that are pasted into Lens to view our clusters, i.e. using Add from kubeconfig.

Console Errors When I hit Views -> Toggle Developer Tools and filter for errors in the console, nothing is listed at the moment.

Sizing

alanbchristie commented 2 years ago

I will deploy some changes later and see what's reported in the lens console as Pods arrive.

Nokel81 commented 2 years ago

Thanks for the information.

alanbchristie commented 2 years ago

So here's one namespace with two Pods in it from two Deployments. This is what I expect to see at the point I navigate to the Pods screen, and Lens is correct: -

Screenshot 2022-04-12 at 20 30 52

Now ... from my client I run an Ansible playbook that essentially deploys a bunch of new StatefulSets and Deployments. What we should see are nine new Pods. But, no matter how long I wait, the screen does not change and there are no errors in the console display. I know 9 new Pods are running, but Lens doesn't!

To reveal the missing Pods I simply > click on Deployments < in the side-bar to navigate away from the Pods and then navigate back to the Pods screen by > clicking on Pods < in the side-bar.

And, hey presto, there they are: -

Screenshot 2022-04-12 at 20 31 09

The playbook finishes after a few minutes and part of its role is to remove these nine Pods. But, again, no matter how long I wait Lens still thinks there are eleven Pods running. This is not true. If I click away from Pods and return to Pods I see the correct view ... i.e. only 2 Pods again.

Lens just cannot show me objects that are changing - it only shows me what's there at the point I choose to view things.

It's something that I can repeat again and again, as can the rest of the team, so what can we do to understand why?

alanbchristie commented 2 years ago

...and remember that k9s, on the same client, armed with the same KUBECONFIG file, has no trouble refreshing its view of the world.

Nokel81 commented 2 years ago

and remember that k9s, on the same client, armed with the same KUBECONFIG file, has no trouble refreshing its view of the world.

Yes, we know. We are trying to figure out why you are seeing this, as we cannot reproduce it yet.

aleksfront commented 2 years ago

@alanbchristie Can you please remove the Error filtering from DevTools and show what is hidden in the screenshot? The most interesting parts start with [KUBE-API] and [KUBE-WATCH-API]. For some reason watch requests are not working as expected in your case, but "regular" ones are doing okay.

You should see something like this: kube watch log

Also, do you have a proxy enabled in Cluster Settings? http proxy

alanbchristie commented 2 years ago

Last thing first - there's no Proxy

Now for a re-run of the refresh problem, with screenshots of each stage.

I start with the Namespace in its "steady state", i.e. with just two Pods running. The console filters were cleared so there's nothing new to see, but here's Lens at the start of the test, with a correct view of the world...

Screenshot 2022-04-13 at 14 10 33

Now I run the Ansible playbook on the client, which deploys lots of things but, for our purposes, results in 9 new running Pods. I watch this via "the other tool" and wait until things settle down. In the meantime Lens is doing nothing; its screen does not change at all. A screenshot follows, just to illustrate the fact...

Screenshot 2022-04-13 at 14 22 45

Now, to force a refresh, I navigate to the Deployments screen...

Screenshot 2022-04-13 at 14 23 02

...and navigate back to Pods...

Screenshot 2022-04-13 at 14 23 11

After 30 minutes (when testing is complete) everything is cleaned up and returns to the "steady state". I know this has happened but Lens has not updated. Here's Lens after I know that the new Pods have gone: -

Screenshot 2022-04-13 at 14 50 57

To refresh the Pod view I navigate to Deployments: -

Screenshot 2022-04-13 at 14 51 14

And then back to Pods: -

Screenshot 2022-04-13 at 14 51 22

jim-docker commented 2 years ago

@alanbchristie thanks for all the details. Could you try launching Lens from the terminal and see if there are any error logs there?

DEBUG=TRUE /Applications/Lens.app/Contents/MacOS/Lens

There may be [KUBE-API] watch logs

alanbchristie commented 2 years ago

Should I expect to see things written to stdout or do I look for logs somewhere? All I get is some brief stdout when the app starts...

alan$ DEBUG=TRUE /Applications/Lens.app/Contents/MacOS/Lens
info:    ▪ 📟 Setting Lens as protocol client for lens:// +0ms
info:    ▪ 📟 Protocol client register succeeded ✅ +3ms
debug:   ▪ [APP-MAIN] configuring packages +0ms
info:    ▪ [EXTENSIONS-STORE]: LOADING from  ... +5ms
info:    ▪ [EXTENSIONS-STORE]: LOADED from /Users/alan/Library/Application Support/Lens/lens-extensions.json +7ms
info:    ▪ [FILESYSTEM-PROVISIONER-STORE]: LOADING from  ... +2ms
info:    ▪ [FILESYSTEM-PROVISIONER-STORE]: LOADED from /Users/alan/Library/Application Support/Lens/lens-filesystem-provisioner-store.json +2ms
debug:   ▪ [APP-MAIN] initializing ipc main handlers +2ms
debug:   ▪ [APP-MAIN] Lens protocol routing main +0ms

jim-docker commented 2 years ago

The logs do get written elsewhere as well but stdout is good for "real-time" observation. Also, I see you're logged into Lens Spaces. Do you need to be to access the cluster? (Can you see if the problem persists when you're logged out of Lens Spaces--I expect it would, but nice to know for sure)

alanbchristie commented 2 years ago

I'll try while out of "spaces" but the issue's been there for a long time - long before I created a spaces account so I think that will be fruitless. For now I'll just re-run the same tests using your DEBUG suggestion.

jim-docker commented 2 years ago

Should I expect to see things written to stdout or do I look for logs somewhere? All I get is some brief stdout when the app starts...

alan$ DEBUG=TRUE /Applications/Lens.app/Contents/MacOS/Lens
info:    ▪ 📟 Setting Lens as protocol client for lens:// +0ms
info:    ▪ 📟 Protocol client register succeeded ✅ +3ms
debug:   ▪ [APP-MAIN] configuring packages +0ms
info:    ▪ [EXTENSIONS-STORE]: LOADING from  ... +5ms
info:    ▪ [EXTENSIONS-STORE]: LOADED from /Users/alan/Library/Application Support/Lens/lens-extensions.json +7ms
info:    ▪ [FILESYSTEM-PROVISIONER-STORE]: LOADING from  ... +2ms
info:    ▪ [FILESYSTEM-PROVISIONER-STORE]: LOADED from /Users/alan/Library/Application Support/Lens/lens-filesystem-provisioner-store.json +2ms
debug:   ▪ [APP-MAIN] initializing ipc main handlers +2ms
debug:   ▪ [APP-MAIN] Lens protocol routing main +0ms

Is that log snippet all you see in the terminal? It should keep logging as you do things in Lens

alanbchristie commented 2 years ago

Nope - that's it - no matter what I do in Lens, there's no more stdout.

Lens: 5.4.5-latest.20220405.1
Electron: 14.2.4
Chrome: 93.0.4577.82
Node: 14.17.0
© 2021 Mirantis, Inc.
macOS Big Sur Version 11.6.5 (20G527)

So is there any point in running the test again?

jim-docker commented 2 years ago

If you've tried connecting/disconnecting a cluster and there's no additional logging in the terminal then probably no point. But this might be a clue...

alanbchristie commented 2 years ago

Incidentally, it's brief but I caught this WARNING message when connecting to the cluster. I'm not sure how important this is because exactly the same KUBECONFIG is used in other tools that have no problems. Nevertheless here's the brief message...

W0414 16:49:26.592699 17680 proxy.go:170] Your kube context contains a server path /k8s/clusters/c-7xtkg, use --append-server-path to automatically append the path to each request

jim-docker commented 2 years ago

Thanks. I think your problem with launching Lens from the terminal is because Lens was already running (It doesn't start another instance, it just switches to the running instance). From the Lens menu pick Quit (or Quit App from the tray icon) and then try starting Lens again from the terminal.

alanbchristie commented 2 years ago

That's better - a heck of a lot more happening now. By coincidence my CI job has also started so the namespace is busy - I'll watch and see what appears in the log as Pods come and go.

Lens is already clearly 'out of date' in the UI but all I'm seeing is repetitive blocks like this on stdout...

info:    ┏ [CLUSTER]: refresh +30s
info:    ┃ [1] {
info:    ┃ [2]   id: '5c2fc572bc02c17863b75df66b97e5fe',
info:    ┃ [3]   name: 'xch',
info:    ┃ [4]   ready: true,
info:    ┃ [5]   online: true,
info:    ┃ [6]   accessible: true,
info:    ┃ [7]   disconnected: false
info:    ┗ [8] }
debug:   ▪ [CLUSTER-MANAGER]: updating catalog from cluster store +282ms
debug:   ▪ [CLUSTER-MANAGER]: updating catalog from cluster store +12ms

alanbchristie commented 2 years ago

Yep - pretty much all of the Pods have gone now but Lens is stuck thinking nothing's changed. There's nothing of particular interest in the UI console screen and stdout is just issuing the messages you see above every 30 seconds. Here's the latest...

info:    ┏ [CLUSTER]: refresh +30s
info:    ┃ [1] {
info:    ┃ [2]   id: '5c2fc572bc02c17863b75df66b97e5fe',
info:    ┃ [3]   name: 'xch',
info:    ┃ [4]   ready: true,
info:    ┃ [5]   online: true,
info:    ┃ [6]   accessible: true,
info:    ┃ [7]   disconnected: false
info:    ┗ [8] }
debug:   ▪ [CLUSTER-MANAGER]: updating catalog from cluster store +300ms
debug:   ▪ [CLUSTER-MANAGER]: updating catalog from cluster store +11ms

alanbchristie commented 2 years ago

I do the navigate to Deployments and back to Pods "trick" and the view is up-to-date again. Nothing in the stdout log other than what I've already posted and nothing of any significance in the UI console.

It's almost as if Lens is asking for watch events but not being given any.

More than happy to screen-share on Zoom/Discord if someone wants to see this thing for themselves but otherwise I see no errors or warnings. There's no update beyond manually refreshing the views.

alanbchristie commented 2 years ago

As a sanity check I construct a simple Python app to watch events in the namespace we've been looking at, using the same kubeconfig file: -

from datetime import datetime

import kubernetes

kubernetes.config.load_kube_config()

watch = kubernetes.watch.Watch()
core_v1 = kubernetes.client.CoreV1Api()
for event in watch.stream(func=core_v1.list_namespaced_pod,
                          namespace='data-manager-api-integration'):
    e_type = event['type']
    e_kind = event['object'].kind
    e_name = event['object'].metadata.name
    e_rv = event['object'].metadata.resource_version
    now = datetime.now().isoformat()
    print(f'+ {now} event type={e_type} kind={e_kind} name={e_name} rv={e_rv}')

I run it on the Mac...

$ pip install kubernetes==23.3.0
$ export KUBECONFIG=<blah>
$ python kwatcher.py

...and then, after a minute I start the orchestration of the Pods. All the watch events I expect get delivered and printed. The first two events represent the two "steady state" Pods in the namespace. You can then see the database, dm-cmb-0, dm-api-0, and pbc, kew and mon Pods and their resource versions along with everything else...

+ 2022-04-14T20:50:33.518984 event type=ADDED kind=Pod name=jobs-operator-6c6ff786f7-lpjmd rv=375360263
+ 2022-04-14T20:50:33.541761 event type=ADDED kind=Pod name=jupyternotebooks-operator-548b68c44-5j6fd rv=375185056
+ 2022-04-14T20:51:03.061320 event type=ADDED kind=Pod name=database-0 rv=378173538
+ 2022-04-14T20:51:03.108329 event type=MODIFIED kind=Pod name=database-0 rv=378173540
+ 2022-04-14T20:51:03.112361 event type=MODIFIED kind=Pod name=database-0 rv=378173545
+ 2022-04-14T20:51:20.612849 event type=MODIFIED kind=Pod name=database-0 rv=378173710
+ 2022-04-14T20:51:21.144002 event type=MODIFIED kind=Pod name=database-0 rv=378173719
+ 2022-04-14T20:51:53.690954 event type=MODIFIED kind=Pod name=database-0 rv=378173950
+ 2022-04-14T20:52:03.726490 event type=ADDED kind=Pod name=dm-cmb-0 rv=378174094
+ 2022-04-14T20:52:03.729253 event type=MODIFIED kind=Pod name=dm-cmb-0 rv=378174096
+ 2022-04-14T20:52:03.904850 event type=MODIFIED kind=Pod name=dm-cmb-0 rv=378174104
+ 2022-04-14T20:52:20.700881 event type=MODIFIED kind=Pod name=dm-cmb-0 rv=378174244
+ 2022-04-14T20:52:21.003605 event type=MODIFIED kind=Pod name=dm-cmb-0 rv=378174248
+ 2022-04-14T20:53:03.014771 event type=MODIFIED kind=Pod name=dm-cmb-0 rv=378174547
+ 2022-04-14T20:53:15.712492 event type=ADDED kind=Pod name=dm-api-0 rv=378174636
+ 2022-04-14T20:53:15.751596 event type=MODIFIED kind=Pod name=dm-api-0 rv=378174638
+ 2022-04-14T20:53:15.784912 event type=MODIFIED kind=Pod name=dm-api-0 rv=378174643
+ 2022-04-14T20:53:18.257784 event type=ADDED kind=Pod name=dm-pbc-5b74fd45f9-nbqxk rv=378174742
+ 2022-04-14T20:53:18.302047 event type=MODIFIED kind=Pod name=dm-pbc-5b74fd45f9-nbqxk rv=378174743
+ 2022-04-14T20:53:18.304780 event type=MODIFIED kind=Pod name=dm-pbc-5b74fd45f9-nbqxk rv=378174748
+ 2022-04-14T20:53:19.193683 event type=MODIFIED kind=Pod name=dm-pbc-5b74fd45f9-nbqxk rv=378174760
+ 2022-04-14T20:53:19.195622 event type=ADDED kind=Pod name=dm-kew-5c77bd595c-5zsqs rv=378174763
+ 2022-04-14T20:53:19.269396 event type=MODIFIED kind=Pod name=dm-kew-5c77bd595c-5zsqs rv=378174765
+ 2022-04-14T20:53:19.272257 event type=MODIFIED kind=Pod name=dm-kew-5c77bd595c-5zsqs rv=378174771
+ 2022-04-14T20:53:20.090714 event type=ADDED kind=Pod name=dm-mon-744b57c54c-ntt7b rv=378174784
+ 2022-04-14T20:53:20.141049 event type=MODIFIED kind=Pod name=dm-mon-744b57c54c-ntt7b rv=378174788
+ 2022-04-14T20:53:20.143935 event type=MODIFIED kind=Pod name=dm-mon-744b57c54c-ntt7b rv=378174792
+ 2022-04-14T20:53:20.997196 event type=ADDED kind=Pod name=dm-ctw-0 rv=378174810
+ 2022-04-14T20:53:20.999373 event type=MODIFIED kind=Pod name=dm-ctw-0 rv=378174815
+ 2022-04-14T20:53:21.155873 event type=MODIFIED kind=Pod name=dm-ctw-0 rv=378174817
+ 2022-04-14T20:53:21.937647 event type=MODIFIED kind=Pod name=dm-pbc-5b74fd45f9-nbqxk rv=378174827
+ 2022-04-14T20:53:32.902874 event type=MODIFIED kind=Pod name=dm-kew-5c77bd595c-5zsqs rv=378174977
+ 2022-04-14T20:53:34.863343 event type=MODIFIED kind=Pod name=dm-kew-5c77bd595c-5zsqs rv=378174994
+ 2022-04-14T20:53:49.634750 event type=MODIFIED kind=Pod name=dm-ctw-0 rv=378175150
+ 2022-04-14T20:53:49.790788 event type=MODIFIED kind=Pod name=dm-api-0 rv=378175153
+ 2022-04-14T20:53:50.131768 event type=MODIFIED kind=Pod name=dm-mon-744b57c54c-ntt7b rv=378175155
+ 2022-04-14T20:53:51.038067 event type=MODIFIED kind=Pod name=dm-ctw-0 rv=378175168
+ 2022-04-14T20:53:52.067808 event type=MODIFIED kind=Pod name=dm-api-0 rv=378175180
+ 2022-04-14T20:53:53.297861 event type=MODIFIED kind=Pod name=dm-mon-744b57c54c-ntt7b rv=378175197
+ 2022-04-14T20:53:53.627720 event type=MODIFIED kind=Pod name=dm-api-0 rv=378175203
+ 2022-04-14T20:53:55.458893 event type=MODIFIED kind=Pod name=dm-api-0 rv=378175217
+ 2022-04-14T20:54:08.547308 event type=MODIFIED kind=Pod name=dm-api-0 rv=378175321
+ 2022-04-14T20:54:09.609108 event type=MODIFIED kind=Pod name=dm-mon-744b57c54c-ntt7b rv=378175337
+ 2022-04-14T20:54:09.613423 event type=MODIFIED kind=Pod name=dm-pbc-5b74fd45f9-nbqxk rv=378175341
+ 2022-04-14T20:54:09.959150 event type=MODIFIED kind=Pod name=dm-kew-5c77bd595c-5zsqs rv=378175343
+ 2022-04-14T20:54:10.756645 event type=MODIFIED kind=Pod name=dm-ctw-0 rv=378175348
+ 2022-04-14T20:54:12.560585 event type=MODIFIED kind=Pod name=dm-mon-744b57c54c-ntt7b rv=378175370
+ 2022-04-14T20:54:13.755376 event type=MODIFIED kind=Pod name=dm-pbc-5b74fd45f9-nbqxk rv=378175390
+ 2022-04-14T20:54:14.356926 event type=MODIFIED kind=Pod name=dm-kew-5c77bd595c-5zsqs rv=378175399
+ 2022-04-14T20:54:15.161868 event type=MODIFIED kind=Pod name=dm-ctw-0 rv=378175408
+ 2022-04-14T20:54:15.236779 event type=ADDED kind=Pod name=dm-ctw-1 rv=378175411
+ 2022-04-14T20:54:15.238928 event type=MODIFIED kind=Pod name=dm-ctw-1 rv=378175413
+ 2022-04-14T20:54:16.155258 event type=MODIFIED kind=Pod name=dm-ctw-1 rv=378175425
+ 2022-04-14T20:54:19.112986 event type=MODIFIED kind=Pod name=dm-ctw-1 rv=378175445
+ 2022-04-14T20:54:21.053346 event type=MODIFIED kind=Pod name=dm-ctw-1 rv=378175465
+ 2022-04-14T20:54:23.355246 event type=MODIFIED kind=Pod name=dm-ctw-1 rv=378175492

alanbchristie commented 2 years ago

It all stops in 4.1.4

On my laptop (macOS 12.3.1), which was also running the broken 5.4.5-latest.20220405.1, I decided to return to an earlier version and see where auto-refresh breaks. Initial evidence from earlier posts in this issue pointed at somewhere around 4.1.1: -

Here's the splash screen for 4.1.4...

Screenshot 2022-04-16 at 08 50 55

aleksfront commented 2 years ago

While browsing the 4.1.4 release changes I found two PRs that could potentially cause the issue (however, I still don't see a clear relation):

Since #2229 touches watch requests, I've created a new branch (based on 5.4.5) without the following line:

if (url.searchParams.has("watch")) res.flushHeaders();

Can you please try these steps to see if the problem persists?

  1. Clone the OpenLens repository with the branch fix-remove-watch-flush-headers: https://github.com/lensapp/lens/tree/fix-remove-watch-flush-headers.
  2. Run the commands yarn && yarn build:mac.
  3. Go to the ./dist/mac folder and start OpenLens.app.

alanbchristie commented 2 years ago

Even though I have Xcode command-line tools I'm seeing...

gyp: No Xcode or CLT version detected

I'm following guidance to re-install the CLT but it's a big package and will take some time. Annoying. I'll get this version tested when I can build it.

alanbchristie commented 2 years ago

It built, but Lens doesn't connect to the cluster. When I configure my cluster using the KUBECONFIG file I've been using up to now, Lens fails with an ENOENT error, expecting [...]/OpenLens.app/Contents/Resources/x64/lens-k8s-proxy. There is no x64 in the Resources directory: -

Screenshot 2022-04-19 at 10 47 33

When I hit Reconnect I get: -

Screenshot 2022-04-19 at 10 48 20

alanbchristie commented 2 years ago

So I add an empty copy of the file it's expecting...

$ touch [...]/x64/lens-k8s-proxy
$ chmod a+wx [...]/x64/lens-k8s-proxy

This gets me past the error, but Lens now locks up connecting to the cluster with a pretty, colour-changing spinning wheel...

Screenshot 2022-04-20 at 07 55 48

Q. Would there be any value in patching the original broken 4.1.4 release, rather than the latest codebase, just to test the hypothesis that it's this modification?

alanbchristie commented 2 years ago

Sadly, having forked the repo and checked out the 4.1.4 commit, the build fails trying to get the SHASUMS256.txt file, with a 502 error from the atom.io server. Can 4.1.4 still be built, i.e. are the Electron 9.1.0 release files intact?

Here's the yarn/build error...

gyp info using node-gyp@5.1.0
gyp info using node@12.22.12 | darwin | x64
gyp info find Python using Python version 3.10.2 found at "/usr/local/opt/python@3.10/bin/python3.10"
gyp http GET https://atom.io/download/electron/v9.1.0/node-v9.1.0-headers.tar.gz
gyp http 200 https://atom.io/download/electron/v9.1.0/node-v9.1.0-headers.tar.gz
gyp http GET https://atom.io/download/electron/v9.1.0/SHASUMS256.txt
gyp http 502 https://atom.io/download/electron/v9.1.0/SHASUMS256.txt
gyp WARN install got an error, rolling back install
gyp ERR! configure error 
gyp ERR! stack Error: 502 status code downloading checksum
gyp ERR! stack     at Request.<anonymous> (/usr/local/Cellar/node@12/12.22.12/lib/node_modules/npm/node_modules/node-gyp/lib/install.js:273:18)
gyp ERR! stack     at Request.emit (events.js:326:22)
gyp ERR! stack     at Request.onRequestResponse (/usr/local/Cellar/node@12/12.22.12/lib/node_modules/npm/node_modules/request/request.js:1066:10)
gyp ERR! stack     at ClientRequest.emit (events.js:314:20)
gyp ERR! stack     at HTTPParser.parserOnIncomingClient (_http_client.js:601:27)
gyp ERR! stack     at HTTPParser.parserOnHeadersComplete (_http_common.js:122:17)
gyp ERR! stack     at TLSSocket.socketOnData (_http_client.js:474:22)
gyp ERR! stack     at TLSSocket.emit (events.js:314:20)
gyp ERR! stack     at addChunk (_stream_readable.js:297:12)
gyp ERR! stack     at readableAddChunk (_stream_readable.js:272:9)
gyp ERR! System Darwin 21.4.0
gyp ERR! command "/usr/local/Cellar/node@12/12.22.12/bin/node" "/usr/local/Cellar/node@12/12.22.12/lib/node_modules/npm/node_modules/node-gyp/bin/node-gyp.js" "rebuild"
gyp ERR! cwd /Users/abc/Code/Personal-GitHub/lens/node_modules/node-pty
gyp ERR! node -v v12.22.12

When you think you have something that should work, let me know.

Nokel81 commented 2 years ago

@alanbchristie I see that error in our CI sometimes. If you try again today it should be resolved.

jim-docker commented 2 years ago

@alanbchristie I don't know if this would be an option for you, but if you have a toy/test cluster that exhibits the issue, which you could give us temporary access to, it might be easier for us to debug.

alanbchristie commented 2 years ago

Thanks - re-running the build command enabled me to get to the binary.

So ...

I can confirm...

  1. Refresh is clearly broken in 4.1.4
  2. Refresh appears to be FIXED if you undo PR-2229

When I say fixed, I mean the majority of Pod phase changes now get rendered automatically. It still got stuck on one Pod that has 3 containers - i.e. one container was left as amber (initialising) when it had in fact reached the running phase (as confirmed by navigating away and back). When I killed the Pod a second time (to see Lens refresh again) it did update all the container statuses correctly. But we've probably found the cause of the break.

Note: I also cannot confirm that all watches are working - I'm just looking at Pods here. But this is a significant improvement.

Unless there's any doubt about what I've done to 4.1.4, this is my createProxy(): -

  protected createProxy(): httpProxy {
    const proxy = httpProxy.createProxyServer();

    proxy.on("proxyRes", (proxyRes, req) => {
      const retryCounterId = this.getRequestId(req);

      if (this.retryCounters.has(retryCounterId)) {
        this.retryCounters.delete(retryCounterId);
      }

      // if (!res.headersSent && req.url) {
      //   const url = new URL(req.url, "http://localhost");
      //
      //   if (url.searchParams.has("watch")) res.flushHeaders();
      // }
    });

jim-docker commented 2 years ago

This code was added to fix a terminal hang (actually just a long timeout) when running kubectl rollout status... (#1988). Checking for "watch" in the URL search params is too broad and can clearly clobber other watches. Doing this in the Lens terminal:

kubectl rollout status -n default deployment nginx-deployment

results in:

[0] info:    ▪ [LENS-PROXY]: flushing headers for http://localhost/apis/apps/v1/namespaces/default/deployments?allowWatchBookmarks=true&fieldSelector=metadata.name%3Dnginx-deployment&resourceVersion=915066&timeoutSeconds=572&watch=true +16s

We could require other matches from this URL to limit the scope of the flushHeaders() call.
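
For illustration only, here is a sketch of what such a narrower check might look like: it keeps the flushHeaders() call, but only for watch requests that pin a single object by name via a fieldSelector, which is the shape of the kubectl rollout status request above. The exact predicate is an assumption made for this example, not a fix agreed in this thread.

import httpProxy from "http-proxy";
import type { IncomingMessage } from "http";

// Only flush headers early for single-object watches (e.g. the request made
// by "kubectl rollout status"), instead of for every request that merely has
// "watch" in its query string. This predicate is illustrative only.
function isSingleObjectWatch(req: IncomingMessage): boolean {
  if (!req.url) return false;

  const url = new URL(req.url, "http://localhost");

  return (
    url.searchParams.has("watch") &&
    (url.searchParams.get("fieldSelector")?.includes("metadata.name=") ?? false)
  );
}

const proxy = httpProxy.createProxyServer();

proxy.on("proxyRes", (proxyRes, req, res) => {
  if (!res.headersSent && isSingleObjectWatch(req)) {
    res.flushHeaders();
  }
});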

Though we should probably instead figure out why kubectl rollout status waits for the timeout in the Lens terminal (as opposed to exiting immediately once the rollout finishes in a system terminal).

alanbchristie commented 2 years ago

Heads up ... moving to 4.1.3 doesn't fix everything with auto-refresh.

I should add that, having switched from the v5 release to 4.1.3, I've found that this isn't perfect either. For example, I had a Pod this morning whose Deployment was being removed. The Pod clearly went through to its Terminating state. It entered an Error state (not unexpected for a Pod being removed), but it clearly got stuck in that Error state.

I knew the Pod had gone so I navigated back to Deployments and back to Pods and the Pod was no longer listed.

So, if there is a release where refresh was fully working, we'd have to go back even earlier.

MartinGolding515 commented 2 years ago

I've found that the windows refresh if Lens is closed and reopened; slightly quicker is to change the namespaces drop-down. I would still like another solution, even if only a refresh button.

Nokel81 commented 2 years ago

Heads up ... 4.1.4 doesn't fix everything with auto-refresh.

I thought you implied that 4.1.3 would fix auto-refresh

alanbchristie commented 2 years ago

Sorry - typo - I'll edit the posts. In my limited testing of 4.1.3 I think it does fix things, but over the last few days I've noticed that it occasionally misses some events. For the majority of the time it's working in 4.1.3.

To clarify, I've now switched to using...

Lens: 4.1.3
Electron: 9.4.0
Chrome: 83.0.4103.122
Node: 12.14.1

...which is significantly more reliable with regard to real-time Pod states.