sigp / lighthouse

Ethereum consensus client in Rust
https://lighthouse.sigmaprime.io/
Apache License 2.0

Docker container exited with error code 135 out of the blue #4775

Open karalabe opened 12 months ago

karalabe commented 12 months ago

I'm half sure you have no way to meaningfully debug this at all, but I'm still gonna report it in case you see this elsewhere at some point. I was running a Lighthouse beacon node for the last 2-3 months without issue. Very occasionally I restarted it, but otherwise it just worked.

Today... it just stopped. No error logs, no system failure logs, no nothing. Docker just reported that the process exited with error code 135. Boom.

Any clue what might have happened? I see no errors, nor any logs on the Docker host suggesting that it might have done something to kill the container. And the machine is pretty much just running an Ethereum setup and idling otherwise, with insane resources, so it cannot be memory or any other kind of exhaustion.

michaelsproul commented 12 months ago

My only guess would be a stack overflow resulting in a SIGABRT. Code 135 from Docker seems to mean it was killed by some kind of signal. If it was a stack overflow the stderr should show it though 🤔

I also saw something about this happening with LMDB. I don't suppose you were running with the --slasher flag?
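For anyone decoding a similar exit status, here is a minimal sketch of the arithmetic. It assumes the common convention that a process killed by signal N is reported with status 128 + N, and it uses the Linux signal numbers for x86-64 and ARM (other platforms number signals differently). The helper name `describe_exit_code` and the small lookup table are illustrative only and are not part of Lighthouse or Docker. Under those assumptions, 135 maps to signal 7 (SIGBUS), while SIGABRT would show up as 134.

```python
# Signal numbers as defined on Linux for x86-64 and ARM.
# Assumption: the container host is one of these platforms.
LINUX_SIGNALS = {
    6: "SIGABRT",
    7: "SIGBUS",
    9: "SIGKILL",
    11: "SIGSEGV",
    15: "SIGTERM",
}


def describe_exit_code(code: int) -> str:
    """Interpret an exit status using the 128 + N signal convention."""
    if code > 128:
        signum = code - 128
        name = LINUX_SIGNALS.get(signum, "unknown signal")
        return f"killed by signal {signum} ({name})"
    return f"exited normally with status {code}"


# Print a few statuses that commonly appear for crashed containers.
for code in (134, 135, 137, 139):
    print(code, "->", describe_exit_code(code))
```

A SIGBUS typically points at a failed access to memory-mapped data, which would be consistent with the memory-mapped LMDB reports mentioned above, though nothing in this thread confirms the cause.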

karalabe commented 12 months ago

Haven't even heard about that flag. So no :D