Shared-Reality-Lab / IMAGE-browser

IMAGE project browser extensions & client-side code
Other
2 stars 0 forks source link

Frequent very long delays in progress popup (over 30s in some cases) #247

Closed jeffbl closed 2 years ago

jeffbl commented 2 years ago

Install either publicly released extension or current build from 20220824 Brave or Chrome on Linux (cyan reports also getting this on Edge, Juliette getting delays as well, I assume in Chromium) Trigger IMAGE on boat photo. Also reproduced on street scene, so I don't think it is photo specific.

First time: progress bar pops up quickly, results come back after normal delay. Second time: tab consumes 100% cpu on one core, 10's of second of delay before progress popup appears. I've had delays greater than 30s. Note: Nothing shows up on server logs until progress popup appears, indicating this is an extension issue, not a server delay.

Expect: Consistently, the progress popup appears immediately after triggering IMAGE, and request is sent to server immediately afterwards.

jeffbl commented 2 years ago

May be related to https://github.com/Shared-Reality-Lab/IMAGE-browser/issues/246

jeffbl commented 2 years ago

More details. Pretty sure this is related to #246. In Chrome, I'm watching the CPU usage. When starting fresh, there is no CPU used by IMAGE tab. When making first request, it uses CPU (expected). However, after results come back, the tab is still at 100% CPU use for a while (unexpected). While it is at 100% CPU use, if I make another request, it doesn't pop up the progress dialog until the CPU drops. If I just wait before making another request, until the CPU drops to zero for the IMAGE tab, then the next request is instantaneous.

So, it appears that there is something blocking the extension after results are received, that eventually stops/times out. But until then, a new request is delayed, causing this issue. I suspect that this and #246 are the same root cause in the end?

mcgill-srl commented 2 years ago

Ok, just saw the 100% CPU on my ancient iMac. First try to get a rendering took ~3 s before "Image request sent" but subsequent requests are taking closer to 15 s. Killing and restarting the extension is now seemingly not sending the image request—I've waited several minutes with nothing happening.

Restarting Chrome, and the behaviour is as Jeff described: first request goes quickly, but subsequent requests take ~25-30 s. This does not seem to be correlated with CPU load.

jeffbl commented 2 years ago

I reproduced this against unicorn test server (unicorn) this morning, so it is not unique to production (pegasus)

jeffbl commented 2 years ago

I also reproduce it with the charts button (e.g., on etherscan.io homepage). First trigger is normal and very fast, second there is a long delay. This would point away from compression, since we wouldn't be compressing the chart like we do photos.

JRegimbal commented 2 years ago

I think I found the underlying cause @jaydeepsingh25. The issue is coming from the function called on line 307 of background.ts to clear from the port list when the port is killed. The call with indexOf takes several seconds because, for reasons I can only speculate on right now, the ports array contains hundreds of millions of entries. I suspect this is an initialization error. Anyway, it has to iterate through these entries of likely junk to find the one that matches the port to destroy. This is likely causing #246. And since this is synchronous with the main loop of the background script, it blocks until completion.

jaydeepsingh25 commented 2 years ago

Thanks for highlighting the issue @JRegimbal . Indeed we were storing port connections in an array, and for every tab we open the port entry was being stored in an array at the index of tabId. So for instance if the tab has id = 5465765, the port entry would be stored at the index number 5465765 whereas rest of the entries in the array are empty/undefined. This was leading to an array with millions of entries. I have changed the data structure, now these ports are being stored in a hashmap with tabId as the key. This is will eliminate empty entries and even the deletion will be in constant time.

JRegimbal commented 2 years ago

Great, thanks for fixing this. The troubles with test code staying in production! Were you able to find out why the array had so many empty/undefined entries?

jeffbl commented 2 years ago

@jaydeepsingh25 I believe this can be closed?

jeffbl commented 2 years ago

Going ahead and closing based on tests of 0.3.1 release.