Open wouterokkerse opened 6 months ago
Thanks for reporting. Did this happen after a fresh sync or on long-running nodes? Can you share the flags you use to run the node? Is it a snap or full sync node?
We will try to reproduce it and report back.
It seems to come back even after fresh syncs. It's in snap sync mode.
Start file
# Set the path to the configuration file (assuming it's in the
# current directory)
CONFIG_PATH="./config.toml"
# Check if the configuration file exists
if [ ! -f "$CONFIG_PATH" ]; then
  echo "Configuration file not found at $CONFIG_PATH."
  echo "Please make sure the file exists and try again."
  exit 1
fi
# Start the Ethereum Classic Geth node with the specified
# configurations
./geth --classic \
--config "$CONFIG_PATH" \
--cache 1024 \
--metrics \
--identity "ETCMCgethNode" \
--datadir "gethDataDirFastNode" \
--port "30329" \
console
And config toml
# Ethereum Classic Node Configuration
[Eth]
# Network ID for Ethereum Classic
NetworkId = 61
# Sync mode (snap is faster for initial sync)
SyncMode = "snap"
# Other optimization parameters
NoPruning = false
NoPrefetch = false
LightPeers = 50
UltraLightFraction = 75
# Cache size for database in MB
DatabaseCache = 512
I think it may have something to do with running in a container, but of course it could also be a hardware fault on my PC.
We are still not able to reproduce it. We will try again using the config you gave above.
We are not able to test on "proxmox lxc containers on a minisforum um480", which might be related.
I tested snap sync on 3 different machines using v1.12.19 and your provided TOML config file, and it went smoothly. It probably has to do with the Proxmox LXC containers.
If you provide more info on how to set up the same config, I will try to reproduce it.
Just wanted to add to this issue:
Several users from our community report the same thing. With no special config or anything, after a while they run into
goroutine 5575765 [running]:
github.com/ethereum/go-ethereum/p2p/netutil.(*DistinctNetSet).AddAddr(0xc000600e40, {{0x7ff6ef15382a?, 0xc000124078?}, 0xc000124078?})
C:/projects/core-geth/p2p/netutil/net.go:266 +0x6a
github.com/ethereum/go-ethereum/p2p/discover.(*Table).addIP(0xc000600d88, 0xc03b5381e0, {{0x0, 0xffff8399eec4}, 0xc000124078})
C:/projects/core-geth/p2p/discover/table.go:484 +0xc5
github.com/ethereum/go-ethereum/p2p/discover.(*Table).addReplacement(0xc000600d88, 0xc03b5381e0, 0xc00058dcb0)
C:/projects/core-geth/p2p/discover/table.go:550 +0x128
github.com/ethereum/go-ethereum/p2p/discover.(*Table).handleAddNode(0xc000600d88, {0xc00058dcb0?, 0x60?, 0xc1?})
C:/projects/core-geth/p2p/discover/table.go:524 +0x31a
github.com/ethereum/go-ethereum/p2p/discover.(*Table).loadSeedNodes(0xc000600d88)
C:/projects/core-geth/p2p/discover/table.go:455 +0x105
github.com/ethereum/go-ethereum/p2p/discover.(*Table).doRefresh(0xc000600d88, 0xc014aee9c0?)
C:/projects/core-geth/p2p/discover/table.go:429 +0x4e
created by github.com/ethereum/go-ethereum/p2p/discover.(*Table).loop in goroutine 4339
C:/projects/core-geth/p2p/discover/table.go:390 +0x669
goroutine 1 [chan receive, 2774 minutes]:
github.com/ethereum/go-ethereum/node.(*Node).Wait(...)
C:/projects/core-geth/node/node.go:639
main.geth(0xc0005c6180)
C:/projects/core-geth/cmd/geth/main.go:407 +0x17c
github.com/urfave/cli/v2.(*Command).Run(0xc0000be2c0, 0xc0005c6180, {0xc00009e000, 0xa, 0x10})
C:/Users/appveyor/go/pkg/mod/github.com/urfave/cli/v2@v2.25.7/command.go:274 +0x93f
github.com/urfave/cli/v2.(*App).RunContext(0xc0004e85a0, {0x7ff6f0d33498, 0x7ff6f1aac160}, {0xc00009e000, 0xa, 0x10})
C:/Users/appveyor/go/pkg/mod/github.com/urfave/cli/v2@v2.25.7/app.go:332 +0x566
github.com/urfave/cli/v2.(*App).Run(...)
C:/Users/appveyor/go/pkg/mod/github.com/urfave/cli/v2@v2.25.7/app.go:309
Here is another user. Some report that they started to run into it after a recent Windows update. It did not happen on Persiphone (v1.12.19) (https://github.com/etclabscore/core-geth/releases/tag/v1.12.19) or earlier releases.
So far only Windows users are affected.
Windows Defender off: these were not affected. Nodes 5-8, Defender still on: these were all affected.
Thanks for your report @Nowalski.
Does this happen on geth start or later? Nice finding that it relates to Windows Defender.
We will have a look.
It happens later, after some time. Some report it after 10 hours of running, some after a few hours, but never on start.
After continued investigation and feedback from the community, we've decided to revert to v1.12.19 in the next two weeks. The issue we're facing seems to be worsening, and despite our efforts to identify the root cause, it remains elusive. Some users have suggested that Windows Defender might be involved, but even after excluding it and disabling other third-party applications, the problem persists at random. Maybe you have found the root cause? We have over 6500 nodes running it, so if you need any testing, let us know.
Are these machines running pre-compiled binaries, or building geth independently?
Can you please include the output of geth version?
Your original trace indicates using Go v1.21.6. Have you tried upgrading to v1.21.12 or later (v1.22.x)?
All 3 stack traces you've shared point to different code locations, which suggests to me that if there is a single underlying issue for them, it's likely environmental.
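To compare crash locations across many community reports, it may help to reduce each trace to its crashing frame. The following is a minimal sketch in Go, not part of core-geth: `Frame` and `crashFrame` are hypothetical names, and it assumes the standard Go runtime dump layout, where the goroutine marked `[running]` is followed by a function line and then a `file:line +offset` line. The sample is abbreviated from the first trace quoted in this thread.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// Frame is one stack entry: the function name and its "file:line" location.
type Frame struct {
	Func     string
	Location string
}

// sample is an abbreviated copy of the first trace reported in this thread.
const sample = `goroutine 5575765 [running]:
github.com/ethereum/go-ethereum/p2p/netutil.(*DistinctNetSet).AddAddr(0xc000600e40, {{0x7ff6ef15382a?, 0xc000124078?}, 0xc000124078?})
	C:/projects/core-geth/p2p/netutil/net.go:266 +0x6a
github.com/ethereum/go-ethereum/p2p/discover.(*Table).addIP(0xc000600d88, 0xc03b5381e0, {{0x0, 0xffff8399eec4}, 0xc000124078})
	C:/projects/core-geth/p2p/discover/table.go:484 +0xc5
`

// crashFrame returns the top frame of the first goroutine in the
// "running" state. ok is false when the dump contains no such goroutine.
func crashFrame(trace string) (f Frame, ok bool) {
	sc := bufio.NewScanner(strings.NewReader(trace))
	inRunning := false
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		switch {
		case strings.HasPrefix(line, "goroutine ") && strings.Contains(line, "[running"):
			inRunning = true
		case inRunning && f.Func == "" && line != "":
			// Drop the argument list: keep everything before the last "(".
			if i := strings.LastIndex(line, "("); i > 0 {
				line = line[:i]
			}
			f.Func = line
		case inRunning && f.Func != "" && line != "":
			// Location line looks like "path/file.go:266 +0x6a".
			f.Location = strings.Fields(line)[0]
			return f, true
		}
	}
	return f, false
}

func main() {
	if f, ok := crashFrame(sample); ok {
		fmt.Printf("%s\n  at %s\n", f.Func, f.Location)
	}
}
```

Grouping reports by the returned `Func` and `Location` would show quickly whether the crashes cluster in one package or are scattered, which is what the environmental hypothesis predicts.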
Pre-compiled, from https://github.com/etclabscore/core-geth/releases/tag/v1.12.20
I will get you the output in the next few days when I get home, but I assure you it's the latest version, both mine and the community's.
I did not manually update Go. Since the binaries are pre-compiled, I don't see why I would need to, but I could try that and compile it myself.
On the last point: yes, the errors are all a little different, but each one surfaces somewhere deeper down the line, and nobody had any issue before this release. Since a lot of users report it, I can't pinpoint exactly what is causing it. I had it happen randomly myself, as have others, out of nowhere after running for a while. Killing geth.exe and starting it up again was the only solution.
I am running a few ETC nodes in Proxmox LXC containers on a Minisforum UM480 and they all seem to fail with the same error at some point. I already checked the RAM with memtest, even replaced the RAM and resynced the database, but it keeps happening. The core-geth version is the latest release. I wonder if it has to do with the chain reorg detected a few times before the crash.