I've started noticing this in my environments too with Firefox.
If you enable INFO logging (https://github.com/openshift/origin/tree/master/assets#enable--disable-console-log-output), does anything jump out at you when the browser should be in a steady state after the initial page load?
@liggitt Sorry for my late reply. I changed the log level and watched the logs. The output is here [1].
These logs are output even when the browser process is not under high load, but they show up far more frequently and continuously when the load is high.
[1] https://kenjiro.fedorapeople.org/tmp/ose3-highload-console.log
Wondering if this is related to token expiration with websockets. @nak3 / @brenton, do you know what your access token timeout is set to? If you hit $server/oauth/token/request, it should give you a token and tell you the expiration time.
@spadgett, if you want to recreate sitting on a project page with websockets reopening with an expired token, you can do the following:
1. Set ReadTimeout in pkg/cmd/server/origin/master.go to something short like 30 * time.Second, then rebuild and rerun openshift start master. This will make websocket connections time out in 30 seconds instead of 5 minutes.
2. Find the access token (localStorage["LocalStorageUserStore.token"]) and delete it using osc:
$ osc delete oauthaccesstoken [token]
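If it's useful, the stored value can be dumped straight from the browser's developer console before deleting the token server-side (just a convenience sketch; it only prints whatever the web console has stored under that key):

```js
// Run in the browser's developer console on the web console page.
// Prints whatever is stored under the LocalStorageUserStore.token key so the
// token can be copied into the `osc delete oauthaccesstoken` command.
console.log(window.localStorage["LocalStorageUserStore.token"]);
```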
@liggitt The token timeout is 86400. I think that's the default value: https://github.com/openshift/origin/blob/6773f17b61257b2ad9176b0fd3b08a4521793921/pkg/cmd/server/start/master_args.go#L270
{
"access_token": "YTg0NzgwOGYtYjJmMy00OWY5LWI5NzktYTg5ZjU3NTgyYzlh",
"expires_in": 86400,
"token_type": "bearer"
}
After 12 hours of idling, Firefox was OOM-killed after chewing through 8GB, so I think we're just leaking something. Leave Chrome sitting for five minutes on a project overview and the heap utilization goes from 40MB to 70MB after collections.
I tried to follow https://developer.chrome.com/devtools/docs/javascript-memory-profiling#object-allocation-tracker. I got a bit lost trying to figure out what's creating them, but I think we're leaking HTMLAnchorElement objects?
I've identified at least one significant memory leak. The hawtio-core-navigation component we use for the left sidebar leaves JavaScript references to detached DOM nodes (as @sdodson notes, HTML anchor elements). These never get garbage collected and account for a significant amount of the web console's memory usage. Thousands of these nodes are created.
Here is the CPU profile after leaving the console idle for an hour.
This function in hawtio-core-navigation appears to be the source of the leaked DOM nodes:
https://github.com/hawtio/hawtio-core-navigation/blob/master/dist/hawtio-core-navigation.js#L644
The temporary anchor element used to resolve relative URIs is never garbage collected.
I've modified hawtio-core-navigation locally. Reusing one shared anchor element instead of creating a new one at each function invocation makes the leak go away. I'm still concerned about how often the isSelected function is called, however.
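Roughly, the workaround looks like the sketch below. This is not the actual hawtio-core-navigation patch, and resolvePath is a made-up name for illustration; the point is only the shared-anchor pattern.

```js
// Sketch of the workaround: resolve relative URIs with a single shared <a>
// element instead of creating a new anchor on every call, so the function no
// longer produces detached DOM nodes that stick around.
var sharedLink = document.createElement('a');

function resolvePath(uri) {
  // Assigning href makes the browser normalize/resolve the URI; reading
  // pathname back gives the resolved path for comparison against the route.
  sharedLink.href = uri;
  return sharedLink.pathname;
}
```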
We might have other memory leaks, but this looks like the most significant.
@spadgett can you pull in https://github.com/hawtio/hawtio-core-navigation/releases/tag/v2.0.48?
I've discovered another issue where we're not properly closing web sockets when navigating between pages. This results in many web sockets being open for watches on each resource type and might explain some of the CPU load.
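For illustration only (none of these names are the console's actual DataService API), the missing cleanup is roughly this: track the sockets opened for the current page and close them when the user navigates away.

```js
// Placeholder sketch: keep the websockets opened for the current page in a
// list and close them all on navigation, so watches don't accumulate as the
// user moves between pages.
var openWatchSockets = [];

function watchResource(url, onMessage) {
  var socket = new WebSocket(url);
  socket.onmessage = function (event) {
    onMessage(JSON.parse(event.data));
  };
  openWatchSockets.push(socket);
  return socket;
}

function closeWatchesForPage() {
  openWatchSockets.forEach(function (socket) {
    socket.close();
  });
  openWatchSockets = [];
}
```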
The latest hawtio-core-navigation fixes the memory leak, but causes item.isSelected() to be called constantly. This causes high CPU load anytime the Web Console is open.
The following three issues have been addressed:
We think these were the main causes.
Reopening... @spadgett found one more issue related to websockets getting reopened on failure.
@liggitt Here is the error from the server that causes the client to reopen web sockets again and again.
{"type":"ERROR","object":{"kind":"Status","apiVersion":"v1beta3","metadata":{},"status":"Failure","message":"401: The event in requested index is outdated and cleared (the requested history has been cleared [1622/1013]) [2621]"}}
I saw this when I was changing the number of desired replicas.
Interestingly, I noticed the Cannot read property 'triggers' of undefined error from #2725 in the console as well:
TypeError: Cannot read property 'triggers' of undefined
at associateDeploymentConfigTriggersToBuild (overview.js:286)
at overview.js:320
at Object.forEach (angular.js:331)
at Object.<anonymous> (overview.js:319)
at fire (jquery.js:3094)
at Object.self.fireWith (jquery.js:3206)
at Object.self.fire (jquery.js:3213)
at data.js:630
at Scope.$eval (angular.js:14383)
at Scope.$apply (angular.js:14482)
I wonder if it's related.
Ah yes. Our watch handler isn't dealing with ERROR responses. In that case we have to relist.
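Roughly, the missing handling looks like this sketch (the cache and the relist callback are illustrative, not the console's actual DataService code):

```js
// Placeholder sketch of a watch event handler. When the server's watch window
// has expired ("requested history has been cleared"), it sends an ERROR event;
// retrying the watch from the old resourceVersion just fails again, which is
// what causes the reopen loop. The client has to re-list instead.
var cache = {};

function handleWatchEvent(event, relist) {
  switch (event.type) {
    case 'ADDED':
    case 'MODIFIED':
      cache[event.object.metadata.name] = event.object;
      break;
    case 'DELETED':
      delete cache[event.object.metadata.name];
      break;
    case 'ERROR':
      // Re-list to get fresh objects and a current resourceVersion, then
      // start a new watch from that point.
      relist();
      break;
  }
}
```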
I'm working on a fix for the relist.
Web Console memory and CPU usage are now dramatically lower.
this makes me so happy
Environment
Chrome
OpenShift
Steps to reproduce
Access the web console.
e.g. https://ose3-master.example.com:8443/console/project/demo/browse/pods
Additional information