vdsm / virtual-dsm

Virtual DSM in a Docker container.
MIT License

Weird shutdowns coming out of nowhere #797

Open carloscz opened 1 month ago

carloscz commented 1 month ago

Operating system

Debian 12 inside Proxmox LXC

Description

After running completely fine for some weeks, the container now just shuts down under load (a 2 TB backup running from another Synology box). This started after an unsuccessful snapshot backup. I tried reinstalling the system and porting over just the data partitions, but that didn't help. I would really appreciate help figuring out where the shutdown could be coming from :)

Docker compose

docker run -d \
  --name=new_gracious_murdock \
  --hostname=1b7a2dc5b318 \
  --mac-address=02:42:ac:11:00:02 \
  --env=DISK_SIZE=32G \
  --env=DISK2_SIZE=8T \
  --env=DISK_FMT=qcow2 \
  --env=ALLOCATE=N \
  --env=CPU_CORES=2 \
  --env=RAM_SIZE=4096M \
  --volume=/vdsm/storage2:/storage2 \
  --volume=/vdsm/storage1:/storage \
  --cap-add=NET_ADMIN \
  --network=bridge \
  --expose=139 --expose=22 --expose=445 \
  -p 5000:5000 -p 5001:5001 \
  --restart=unless-stopped \
  --device /dev/kvm:/dev/kvm \
  --device /dev/net/tun:/dev/net/tun \
  --device /dev/vhost-net:/dev/vhost-net \
  --runtime=runc \
  vdsm/virtual-dsm:latest

Docker log

❯ -----------------------------------------------------------
❯ You can now login to DSM at port 5000
❯ -----------------------------------------------------------

vdsm login: [ 24.794750] fuse init (API version 7.23)
[ 25.087103] findhostd uses obsolete (PF_INET,SOCK_PACKET)
[ 25.407548] ip_tables: (C) 2000-2006 Netfilter Core Team
[ 25.413299] audit: type=1325 audit(1723643481.405:2): table=filter family=2 entries=0
[ 25.418900] nf_conntrack version 0.5.0 (16384 buckets, 65536 max)
[ 25.543634] audit: type=1325 audit(1723643481.535:3): table=nat family=2 entries=0
[ 25.613793] ip6_tables: (C) 2000-2006 Netfilter Core Team
[ 25.626458] audit: type=1325 audit(1723643481.618:4): table=filter family=10 entries=0
[ 25.646939] aufs 4.4-20160328
[ 25.654404] bridge: automatic filtering via arp/ip/ip6tables has been deprecated. Update your scripts to load br_netfilter if you need this.
[ 25.659781] Bridge firewalling registered
[ 25.662800] audit: type=1325 audit(1723643481.654:5): table=mangle family=2 entries=0
[ 25.669616] IPVS: Registered protocols (TCP, UDP, SCTP, AH, ESP)
[ 25.670296] IPVS: Connection hash table configured (size=4096, memory=64Kbytes)
[ 25.671059] IPVS: Creating netns size=2104 id=0
[ 25.671513] IPVS: ipvs loaded.
[ 25.674409] IPVS: [rr] scheduler registered.
[ 26.256819] audit: type=1325 audit(1723643482.248:6): table=nat family=2 entries=5
[ 26.261736] audit: type=1325 audit(1723643482.253:7): table=filter family=2 entries=4
[ 26.268054] audit: type=1325 audit(1723643482.259:8): table=filter family=2 entries=6
[ 26.272885] audit: type=1325 audit(1723643482.264:9): table=filter family=2 entries=8
[ 26.278644] audit: type=1325 audit(1723643482.270:10): table=filter family=2 entries=10
[ 26.284560] audit: type=1325 audit(1723643482.276:11): table=filter family=2 entries=11
[ 26.292488] Initializing XFRM netlink socket
[ 26.296602] Netfilter messages via NETLINK v0.30.
[ 26.352490] IPv6: ADDRCONF(NETDEV_UP): docker0: link is not ready
[ 26.354743] IPVS: Creating netns size=2104 id=1
[ 31.213855] tcmu daemon: command reply support 1.
❯ ERROR: Forcefully terminating QEMU process, reason: 0...

❯ Shutdown completed!

Screenshots (optional)

No response

carloscz commented 1 month ago

Is there some way to display the internal logs from the Synology machine? (Log Center displays nothing interesting; screenshot attached.) I suspect it chooses to shut down on its own (or because of some problem it encounters) for some reason :(

image

EDIT: It ran OK through the night when doing nothing. Any idea how to debug the shutdown during the backup? I don't have a problem with completely nuking the system partition and setting everything up again, but I would like to know what the problem is/was so I don't run into it again (the 2 TB backup takes over 20 days for me, and doing that every few months would suck :/ )