SlideRuleEarth / sliderule

Server and client framework for on-demand science data processing in the cloud
https://slideruleearth.io

Monitor HttpClient timeouts and force restart of container on reaching threshold #320

Open jpswinski opened 10 months ago

jpswinski commented 10 months ago

We see instances where a container gets stuck at 100% utilization on one of its cores. After investigating, it appears that the HttpServer object is spinning and is therefore unable to handle requests adequately. The effect is that requests to the node time out.
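
As a rough illustration of the idea in the issue title (count client timeouts and act once a threshold is reached), here is a minimal Python sketch. It is not SlideRule's implementation: the `TimeoutMonitor` class, its `threshold` and `window_seconds` parameters, and the sliding-window policy are all hypothetical choices made for the example.

```python
import time
from collections import deque


class TimeoutMonitor:
    """Hypothetical helper: tracks recent timeouts in a sliding time window."""

    def __init__(self, threshold, window_seconds, clock=time.monotonic):
        self.threshold = threshold            # timeouts needed to trigger a restart
        self.window_seconds = window_seconds  # how long a timeout stays relevant
        self.clock = clock                    # injectable clock, useful for testing
        self.events = deque()                 # timestamps of recent timeouts

    def record_timeout(self):
        """Record one timeout; return True if the threshold has been reached."""
        now = self.clock()
        self.events.append(now)
        # Drop timeouts that have aged out of the window.
        while self.events and now - self.events[0] > self.window_seconds:
            self.events.popleft()
        return len(self.events) >= self.threshold
```

A caller would invoke `record_timeout()` each time a request to the node times out and, when it returns `True`, trigger whatever restart mechanism the deployment provides (for example, exiting the process so the container runtime restarts it). The sliding window keeps isolated timeouts spread over a long period from forcing a restart.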

To mitigate this issue, we could: