ubuntu / zsys

ZSys daemon and client for zfs systems
GNU General Public License v3.0

zsysd crash after inordinate CPU use #216

Open runejuhl opened 2 years ago

runejuhl commented 2 years ago

Describe the bug

I installed zfs-auto-snapshot today to get automatic snapshots of some ZFS volumes that are not managed by zsys. Afterwards I noticed zsysd using a full CPU core for quite a while (more than 5 minutes). The CPU use might be caused by the same issue as reported in https://github.com/ubuntu/zsys/issues/207.

I changed the log level for zsysd using zsysctl service loglevel 2 and got some verbose log entries.

Shortly afterwards, zsysd crashed completely with the following stack trace:

panic: The ZFS transaction object has already been used and Done() was called. It can't be reused
goroutine 29 [running]:
github.com/ubuntu/zsys/internal/zfs.(*Transaction).checkValid(0xc0015a2140)
        github.com/ubuntu/zsys/internal/zfs/zfs.go:286 +0x9f
github.com/ubuntu/zsys/internal/zfs.(*Transaction).Snapshot(0xc0015a2140, 0xc00247c3a0, 0xf, 0xc001f50000, 0x18, 0x0, 0x0, 0x0)
        github.com/ubuntu/zsys/internal/zfs/zfs.go:363 +0x65
github.com/ubuntu/zsys/internal/machines.(*Machines).createSnapshot(0xc001a31760, 0xb4e1e0, 0xc000cde570, 0xc00247c3a0, 0xf, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
        github.com/ubuntu/zsys/internal/machines/snapshot.go:91 +0x8de
github.com/ubuntu/zsys/internal/machines.(*Machines).CreateSystemSnapshot(...)
        github.com/ubuntu/zsys/internal/machines/snapshot.go:19
github.com/ubuntu/zsys/internal/daemon.(*Server).SaveSystemState(0xc001a31760, 0xc0022fa0c0, 0xb52648, 0xc0000da440, 0x0, 0x0)
        github.com/ubuntu/zsys/internal/daemon/state.go:49 +0x2f4
github.com/ubuntu/zsys.(*ZsysLogServer).SaveSystemState(0xc0001bc060, 0xc0022fa0c0, 0xb52698, 0xc0008dba10, 0xc0001bc060, 0xc0008c2c08)
        github.com/ubuntu/zsys/zsys.streamlogger.go:326 +0x1d9
github.com/ubuntu/zsys._Zsys_SaveSystemState_Handler(0xa6f060, 0xc0001bc060, 0xb50210, 0xc000162300, 0xc0001bc060, 0xa1c501)
        github.com/ubuntu/zsys/zsys.pb.go:3051 +0x116
github.com/ubuntu/zsys/internal/streamlogger.ServerIdleTimeoutInterceptor(0xa6f060, 0xc0001bc060, 0xb50210, 0xc000162300, 0xc0008ba1c8, 0xabde48, 0x0, 0x0)
        github.com/ubuntu/zsys/internal/streamlogger/server.go:73 +0xad
github.com/ubuntu/zsys/vendor/google.golang.org/grpc.(*Server).processStreamingRPC(0xc000001980, 0xb53b98, 0xc000483500, 0xc002018100, 0xc00090a180, 0xe7ed20, 0x0, 0x0, 0x0)
        github.com/ubuntu/zsys/vendor/google.golang.org/grpc/server.go:1244 +0x524
github.com/ubuntu/zsys/vendor/google.golang.org/grpc.(*Server).handleStream(0xc000001980, 0xb53b98, 0xc000483500, 0xc002018100, 0x0)
        github.com/ubuntu/zsys/vendor/google.golang.org/grpc/server.go:1317 +0xcc5
github.com/ubuntu/zsys/vendor/google.golang.org/grpc.(*Server).serveStreams.func1.1(0xc0000bbfe0, 0xc000001980, 0xb53b98, 0xc000483500, 0xc002018100)
        github.com/ubuntu/zsys/vendor/google.golang.org/grpc/server.go:722 +0xab
created by github.com/ubuntu/zsys/vendor/google.golang.org/grpc.(*Server).serveStreams.func1
        github.com/ubuntu/zsys/vendor/google.golang.org/grpc/server.go:720 +0xa5

I have a large log dump (16 MB uncompressed) from journald in JSON format and a 7.2 MB report created with ubuntu-bug zsys, both of which I can share if you'd like.

To Reproduce

No idea; I haven't seen it before.

Let me know if you want the report and the logs; I'll have to make sure they're sanitized first.

Shellcat-Zero commented 2 years ago

I have this problem as well, but on 20.04. The system grinds to a halt when installing new software or doing anything else that triggers zsysd.