NethermindEth / juno

Starknet client implementation.
https://juno.nethermind.io
Apache License 2.0

Panic in p2p sync #1719

Closed. eliotstock closed this issue 2 months ago

eliotstock commented 7 months ago

Docker command line:

docker run -d \
  --name juno_p2p \
  -p 6060:6060 \
  -p 7777:7777 \
  -v /data/juno/sepolia:/var/lib/juno \
  nethermind/juno:v0.10.0 \
  --db-path "/var/lib/juno" \
  --network "sepolia" \
  --log-level "debug" \
  --http \
  --http-host "0.0.0.0" \
  --http-port "6060" \
  --p2p \
  --p2p-addr /ip4/0.0.0.0/tcp/7777 \
  --p2p-peers=/ip4/34.138.100.215/tcp/7777/p2p/12D3KooWR8ikUDiinyE5wgdYiqsdLfJRsBDYKGii6L3oyoipVEaV

Console log:

19:21:44.929 22/02/2024 +00:00  DEBUG   p2p/sync.go:86  Continuous iteration    {"i": 10623}
19:21:44.929 22/02/2024 +00:00  DEBUG   p2p/sync.go:630 Number of peers {"len": 1}
19:21:44.929 22/02/2024 +00:00  DEBUG   p2p/sync.go:631 Random chosen peer's info   {"peerInfo": "{12D3KooWR8ikUDiinyE5wgdYiqsdLfJRsBDYKGii6L3oyoipVEaV: []}"}
19:21:48.401 22/02/2024 +00:00  DEBUG   starknet/client.go:79   Error while reading from stream {"err": "stream reset"}
19:21:48.402 22/02/2024 +00:00  INFO    node/node.go:357    Shutting down Juno...
panic: panic: runtime error: invalid memory address or nil pointer dereference
stacktrace:
goroutine 203 [running]:
runtime/debug.Stack()
    /usr/lib/go-1.21/src/runtime/debug/stack.go:24 +0x5e
github.com/sourcegraph/conc/panics.NewRecovered(0x0?, {0x3dd39e0, 0x5d73670})
    /root/go/pkg/mod/github.com/sourcegraph/conc@v0.3.0/panics/panics.go:59 +0x7d
github.com/sourcegraph/conc/panics.(*Catcher).tryRecover(0xc00016d7b0)
    /root/go/pkg/mod/github.com/sourcegraph/conc@v0.3.0/panics/panics.go:28 +0x65
panic({0x3dd39e0?, 0x5d73670?})
    /usr/lib/go-1.21/src/runtime/panic.go:914 +0x21f
github.com/NethermindEth/juno/p2p.(*syncService).randomNodeHeight(0xc000305130, {0x44efca0, 0xc00211a870})
    /app/p2p/sync.go:68 +0xb3
github.com/NethermindEth/juno/p2p.(*syncService).start(0xc000305130, {0x44efca0?, 0xc000305400?})
    /app/p2p/sync.go:91 +0x276
github.com/NethermindEth/juno/p2p.(*Service).Run(0xc0002e92d0, {0x44efca0, 0xc000305400})
    /app/p2p/p2p.go:212 +0x35b
github.com/NethermindEth/juno/node.(*Node).Run.func3()
    /app/node/node.go:350 +0x63
github.com/sourcegraph/conc/panics.(*Catcher).Try(0xc0004e3980?, 0x0?)
    /root/go/pkg/mod/github.com/sourcegraph/conc@v0.3.0/panics/panics.go:23 +0x48
github.com/sourcegraph/conc.(*WaitGroup).Go.func1()
    /root/go/pkg/mod/github.com/sourcegraph/conc@v0.3.0/waitgroup.go:32 +0x56
created by github.com/sourcegraph/conc.(*WaitGroup).Go in goroutine 1
    /root/go/pkg/mod/github.com/sourcegraph/conc@v0.3.0/waitgroup.go:30 +0x73

goroutine 1 [running]:
github.com/sourcegraph/conc/panics.(*Catcher).Repanic(...)
    /root/go/pkg/mod/github.com/sourcegraph/conc@v0.3.0/panics/panics.go:38
github.com/sourcegraph/conc.(*WaitGroup).Wait(0xc00016d7a0)
    /root/go/pkg/mod/github.com/sourcegraph/conc@v0.3.0/waitgroup.go:42 +0x4a
github.com/NethermindEth/juno/node.(*Node).Run(0xc0002e9420, {0x44efca0, 0xc00046a230})
    /app/node/node.go:358 +0x7f5
main.main.func2(0xc00055af00, {0x4037bab?, 0x7?, 0x402dbe9?})
    /app/cmd/juno/juno.go:197 +0xbb
github.com/spf13/cobra.(*Command).execute(0xc00055af00, {0xc000040110, 0xf, 0xf})
    /root/go/pkg/mod/github.com/spf13/cobra@v1.8.0/command.go:983 +0xabc
github.com/spf13/cobra.(*Command).ExecuteC(0xc00055af00)
    /root/go/pkg/mod/github.com/spf13/cobra@v1.8.0/command.go:1115 +0x3ff
github.com/spf13/cobra.(*Command).Execute(...)
    /root/go/pkg/mod/github.com/spf13/cobra@v1.8.0/command.go:1039
github.com/spf13/cobra.(*Command).ExecuteContext(...)
    /root/go/pkg/mod/github.com/spf13/cobra@v1.8.0/command.go:1032
main.main()
    /app/cmd/juno/juno.go:201 +0x1ee

eliotstock commented 7 months ago

Steps to reproduce:

  1. Get the latest Sepolia snapshot link from the README in this repo and untar it (to /data/juno/sepolia in my case).
  2. Run with the Docker command line above.
  3. Wait around 12 to 24 hours.

As discussed on Slack, we think this may be caused by my sole remaining peer having disappeared. Even so, losing a peer should not cause a panic.

kirugan commented 2 months ago

This panic no longer occurs. We removed the randomNodeHeight method.