pendulum-chain / spacewalk

Apache License 2.0

Investigate occasional high CPU usage of vault clients #512

Closed ebma closed 3 months ago

ebma commented 4 months ago

It seems that the vault clients' CPU usage sometimes rises to 100% for no apparent reason. When this happens, the vault client becomes unresponsive and stops handling bridge requests.

What we know:

[Two images attached]

ebma commented 4 months ago

@zoveress to facilitate the investigation of this ticket, can we change the log level of one of the Spacewalk clients to 'DEBUG'? It doesn't matter which one, on Amplitude or Pendulum, as they all seem to hit the issue at the same time (whenever it occurs; we don't know the cause yet).

ebma commented 4 months ago

@pendulum-chain/product this is a bug that can leave the vault client unresponsive and unable to bridge until the next periodic restart. It does not occur frequently, so we first have to monitor the instances to catch it when it happens.
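
For the monitoring mentioned above, a minimal sketch of flagging sustained high CPU usage from periodic samples might look like the following. Everything in it is hypothetical: the function names, the `(busy, total)` tick sample format, the 95% threshold, and the three-interval window are assumptions for illustration, not part of Spacewalk or its tooling.

```python
from typing import Sequence, Tuple

# A sample is a cumulative (busy_ticks, total_ticks) pair.
# This format is an assumption made for this sketch.
Sample = Tuple[int, int]


def cpu_percent(prev: Sample, curr: Sample) -> float:
    """Return CPU utilisation in percent between two cumulative samples."""
    busy_delta = curr[0] - prev[0]
    total_delta = curr[1] - prev[1]
    if total_delta <= 0:
        # No time elapsed (or counter reset): report no usage.
        return 0.0
    return 100.0 * busy_delta / total_delta


def has_sustained_high_usage(
    samples: Sequence[Sample],
    threshold: float = 95.0,   # hypothetical alert threshold
    min_consecutive: int = 3,  # hypothetical number of intervals in a row
) -> bool:
    """Return True if at least `min_consecutive` consecutive intervals
    are at or above `threshold` percent."""
    run = 0
    for prev, curr in zip(samples, samples[1:]):
        if cpu_percent(prev, curr) >= threshold:
            run += 1
            if run >= min_consecutive:
                return True
        else:
            run = 0
    return False
```

Requiring several consecutive high intervals, rather than a single one, is meant to separate the stuck-at-100% state described in this issue from short, harmless spikes.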

b-yap commented 4 months ago

Judging from #DevOps Incidents, this started as far back as Sept 17, 2023.

Extra: "Memory Usage High" alerts occurred on Sept 21, 2023.

vadaynujra commented 4 months ago

@annatekl anything blocking this from being moved to Ready?

annatekl commented 4 months ago

moved @vadaynujra

ebma commented 4 months ago

@b-yap please don't forget to assign yourself to tickets once you start working on them so that others don't. 😬

b-yap commented 4 months ago

@ebma noted.

Question: as this has something to do with the vaults, should we move this ticket over there?

ebma commented 4 months ago

Over where? It's already in the spacewalk repository, or would you prefer to have it elsewhere? 😅

b-yap commented 4 months ago

Ah, never mind, I got confused by the Zenhub "Pendulum" icon 😩