filecoin-project / lassie

A minimal universal retrieval client library for IPFS and Filecoin

Per-retrieval linksystem data-transfer startup error #76

Closed rvagg closed 1 year ago

rvagg commented 1 year ago

Seen at the beginning of an autoretrieve restart:

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x184b9c1]

goroutine 223 [running]:
github.com/ipfs/go-graphsync/requestmanager/reconciledloader.(*ReconciledLoader).loadLocal(0x0?, {{0x3b05150, 0xc003d61540}, {{0x0, 0x0, 0x0}}, {0x0, 0x0}, {0x0, 0x0}, ...}, ...)
    /go/pkg/mod/github.com/ipfs/go-graphsync@v0.14.0/requestmanager/reconciledloader/load.go:71 +0x61
github.com/ipfs/go-graphsync/requestmanager/reconciledloader.(*ReconciledLoader).blockReadOpener(0xc0144aee10, {{0x3b05150, 0xc003d61540}, {{0x0, 0x0, 0x0}}, {0x0, 0x0}, {0x0, 0x0}, ...}, ...)
    /go/pkg/mod/github.com/ipfs/go-graphsync@v0.14.0/requestmanager/reconciledloader/load.go:53 +0x350
github.com/ipfs/go-graphsync/requestmanager/reconciledloader.(*ReconciledLoader).BlockReadOpener(0xc0144aee10, {{0x3b05150, 0xc003d61540}, {{0x0, 0x0, 0x0}}, {0x0, 0x0}, {0x0, 0x0}, ...}, ...)
    /go/pkg/mod/github.com/ipfs/go-graphsync@v0.14.0/requestmanager/reconciledloader/load.go:33 +0x178
github.com/ipfs/go-graphsync/requestmanager/executor.(*Executor).traverse(0xc018c299c0?, {{0x3b05150, 0xc01e518640}, {0x3b12420, 0xc015e8e690}, {{{0xc03b0de8d0, 0x22}}, {0x3b1bd00, 0xc047e61ae0}, 0x0, ...}, ...})
    /go/pkg/mod/github.com/ipfs/go-graphsync@v0.14.0/requestmanager/executor/executor.go:131 +0x1e5
github.com/ipfs/go-graphsync/requestmanager/executor.(*Executor).ExecuteTask(0xc0018369c0, {0x3b05150, 0xc001e08e00}, {0xc03b0de900, 0x26}, 0x1?)
    /go/pkg/mod/github.com/ipfs/go-graphsync@v0.14.0/requestmanager/executor/executor.go:83 +0x566
github.com/ipfs/go-graphsync/taskqueue.(*WorkerTaskQueue).worker(0xc003b74320, {0x3af2880, 0xc0018369c0})
    /go/pkg/mod/github.com/ipfs/go-graphsync@v0.14.0/taskqueue/taskqueue.go:139 +0x3b5
created by github.com/ipfs/go-graphsync/taskqueue.(*WorkerTaskQueue).Startup
    /go/pkg/mod/github.com/ipfs/go-graphsync@v0.14.0/taskqueue/taskqueue.go:97 +0x3a

I guess this is the restarts problem suggested in https://github.com/filecoin-project/go-data-transfer/pull/362

This probably isn't an ideal thing to ship a lassie daemon with.

davidd8 commented 1 year ago

How often does this occur? Is it easy to reproduce outside of autoretrieve?

rvagg commented 1 year ago

Spotted once in bedrock-dev autoretrieve logs, haven't seen it outside. Haven't really been looking for it in autoretrieve as we've been assuming that the autoretrieve restarts are mostly just OOM problems, but we probably should look more carefully. As per discussion during the week it's probably not related to restarts, and therefore can't be easily fixed, but is a real runtime bug that may happen occasionally while in operation so we ought to figure it out.

davidd8 commented 1 year ago

Marking as a P2 for now.