input-output-hk / mithril

Stake-based threshold multi-signatures protocol
https://mithril.network
Apache License 2.0

Resource exhausted on Cardano node socket #1803

Closed jpraynaud closed 2 months ago

jpraynaud commented 3 months ago

Why

The aggregator of the release-mainnet network stopped producing certificates and, in particular, missed the production of the Cardano db snapshot for immutable file number 5917. After investigation, it appears that the aggregator stopped operating properly because it was unable to retrieve critical information (the tip of the chain) with the local state query mini-protocol: connecting to the mini-protocol failed with the error message Resource temporarily unavailable (os error 11). We were able to reproduce the problem with the Cardano CLI, which returned the same error: <socket: 11>: resource exhausted (Resource temporarily unavailable)

A restart of the Cardano node was sufficient to reactivate the local socket, and the aggregator resumed its operations swiftly after the restart.

The Cardano node (8.9.0) had been up and running for 3 months, and we had never noticed this problem before.
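
For reference, on Linux os error 11 is EAGAIN, which is what a non-blocking connect on a Unix domain socket returns when the connection cannot be completed immediately. The snippet below is a minimal sketch of a probe that could tell this condition apart from a missing or refused socket. The function names and the returned labels are hypothetical and are not part of Mithril, Pallas, or the Cardano node.

```python
import errno
import socket


def classify_os_error(exc: OSError) -> str:
    """Map a connection error to a coarse label (labels are illustrative)."""
    if isinstance(exc, socket.timeout):
        return "timeout"
    if exc.errno in (errno.EAGAIN, errno.EWOULDBLOCK):
        # EAGAIN is errno 11 on Linux: "Resource temporarily unavailable"
        return "resource-exhausted"
    if exc.errno == errno.ENOENT:
        return "missing"
    if exc.errno == errno.ECONNREFUSED:
        return "refused"
    return "other"


def probe_unix_socket(path: str, timeout: float = 2.0) -> str:
    """Try to connect to a Unix domain socket and classify the outcome."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect(path)
        return "ok"
    except OSError as exc:
        return classify_os_error(exc)
    finally:
        sock.close()
```

A monitoring job could call `probe_unix_socket` on the node socket path at a regular interval and raise an alert as soon as the result is anything other than "ok", instead of waiting for a missed certificate.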

What

Investigate possible cause for the resource exhaustion on the socket.
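
One possible starting point, offered as an assumption rather than a confirmed diagnosis, is to check how many connections are attached to the node socket when the error occurs. On Linux, /proc/net/unix lists every Unix domain socket with its state code and bound path. The sketch below tallies the entries for one path; the helper name is hypothetical, and it assumes a socket path without spaces.

```python
from collections import Counter

# Names of the kernel's socket_state codes as printed in the "St" column.
STATE_NAMES = {
    "01": "unconnected",
    "02": "connecting",
    "03": "connected",
    "04": "disconnecting",
}


def count_socket_states(proc_net_unix: str, socket_path: str) -> Counter:
    """Count /proc/net/unix entries bound to socket_path, grouped by state."""
    counts = Counter()
    for line in proc_net_unix.splitlines()[1:]:  # skip the header line
        # Columns: Num RefCount Protocol Flags Type St Inode Path
        fields = line.split()
        if len(fields) < 8 or fields[7] != socket_path:
            continue  # unnamed socket or a different path
        counts[STATE_NAMES.get(fields[5], fields[5])] += 1
    return counts


# Usage on a Linux host (the path is a placeholder):
# with open("/proc/net/unix") as f:
#     print(count_socket_states(f.read(), "/path/to/node.socket"))
```

Comparing these counts between a healthy node and a node showing the error could indicate whether connections are piling up on the socket over time.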

How

jpraynaud commented 3 months ago

Hi @falcucci @scarmuega is this a known issue with Pallas?

scarmuega commented 3 months ago

@jpraynaud no, it isn't.

From what I read in the issue, if the error was reproducible through the cardano-cli, this is likely to be an issue with the node instance.

jpraynaud commented 3 months ago

> @jpraynaud no, it isn't.
>
> From what I read in the issue, if the error was reproducible through the cardano-cli, this is likely to be an issue with the node instance.

Thanks for your answer @scarmuega!

I guess the problem is occurring in one of the routes of the REST API of the aggregator instead.

jpraynaud commented 2 months ago

Closed with #1804