storj / storagenode-docker

Auto-updated Storagenode container for Storj network
GNU Affero General Public License v3.0

docker container stopped on the update #15

Open mmihalko opened 1 year ago

mmihalko commented 1 year ago

I have a recurring issue: the container stops during an update. It does not happen on every update, and it hits different storagenode containers at random. For example, during the last update to 1.78 two of my nodes stopped; on the update to 1.79 a different one stopped (only one node has updated so far). I run 9 nodes in total, so the issue reproduces fairly reliably.

Note that the "--restart unless-stopped" option is set for the container, as confirmed by "docker inspect". I will run "docker inspect" on a stopped container when it happens again.
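For reference, the policy check can be scripted across several containers. This is a minimal sketch, not the exact commands used here: the helper names `restart_policy` and `restart_policies` are made up for illustration, and the container names passed in are whatever `--name` was given at `docker run` time.

```shell
# Hypothetical helper: print the restart policy name recorded for one container.
restart_policy() {
    docker inspect --format '{{.HostConfig.RestartPolicy.Name}}' "$1"
}

# Hypothetical helper: print "<name> <policy>" for each container name given.
restart_policies() {
    for name in "$@"; do
        printf '%s %s\n' "$name" "$(restart_policy "$name")"
    done
}
```

Calling `restart_policies` with the names of all nine node containers would list each one next to its policy, so a container started without the option would stand out.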

I have attached the tail of the Docker and node logs leading up to the stopped container.

I posted some logs and discussed this on the forum. The first post contains logs from a node that stopped on the update to 1.78.

I should also note that I had these problems on my previous server as well. The only similarity I can think of is that both were running Debian, though not the same version.

Docker run command:

 docker run -d --restart unless-stopped --stop-timeout 300 --user $(id -u):$(id -g) \
     -p 3000$1:28967/tcp \
     -p 3000$1:28967/udp \
     -p 1400$1:14002 \
     -p 1410$1:14502 \
     -e WALLET=$(load_key "general" "wallet") \
     -e EMAIL=$(load_key "general" "email") \
     -e ADDRESS=$(load_snpar "van-ip"):3000$1 \
     -e STORAGE=$(load_snpar "size") \
     --sysctl net.ipv4.tcp_fastopen=3 \
     --mount type=bind,source=$(load_snpar "node-path")"/identity",destination=/app/identity \
     --mount type=bind,source=$(load_snpar "node-path")"/node",destination=/app/config \
     --name $SNNAME storjlabs/storagenode:latest \
     --operator.wallet-features=zksync \
     --storage2.piece-scan-on-startup=false \
     --log.output="/app/config/node.log" \
     --log.level=$(load_snpar "log-level")

Attachments: docker.log, node.log

storjrobot commented 1 year ago

This issue has been mentioned on Storj Community Forum (official). There might be relevant details there:

https://forum.storj.io/t/yet-another-node-offline-after-update-thread/22731/15

mmihalko commented 1 year ago

Today the issue happened again. "docker inspect" on the stopped container shows that "unless-stopped" is still enabled:

"RestartPolicy": {
    "Name": "unless-stopped",
    "MaximumRetryCount": 0
},
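The same inspect output also records how the container exited, which may help narrow this down. A minimal sketch, assuming the container name is passed as the first argument; the helper name `container_stop_state` is hypothetical, while the template fields are standard `docker inspect` fields:

```shell
# Hypothetical helper: print the state fields that describe how a container stopped.
# Usage: container_stop_state <container-name>
container_stop_state() {
    docker inspect --format \
        'status={{.State.Status}} exit={{.State.ExitCode}} oom={{.State.OOMKilled}} restarts={{.RestartCount}} finished={{.State.FinishedAt}} policy={{.HostConfig.RestartPolicy.Name}}' \
        "$1"
}
```

Running this against a stopped node prints its exit code, whether it was OOM-killed, the restart count and the stop time on a single line, which can be matched against the timestamps in docker.log and node.log.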