The node crashes with the following exception (as seen from error reporting):
monitor.ml.Error (Failure "no persistent root identifier found (should have been written already)")
Raised at Stdlib.failwith in file "stdlib.ml", line 29, characters 17-33
Called from Transition_frontier.load_from_persistence_and_start.(fun) in file "src/lib/transition_frontier/transition_frontier.ml", line 108, characters 8-112
Called from Transition_frontier.load_with_max_length.(fun).continue in file "src/lib/transition_frontier/transition_frontier.ml", line 225, characters 10-297
Called from Transition_frontier.load_with_max_length.(fun) in file "src/lib/transition_frontier/transition_frontier.ml", line 341, characters 8-111
Called from Base__Result.try_with in file "src/result.ml", line 195, characters 9-15
Caught by monitor coda
Raised at Stdlib.failwith in file "stdlib.ml", line 29, characters 17-33
Called from O1trace.exec_thread in file "src/lib/o1trace/o1trace.ml", line 77, characters 6-27
Called from Transition_router.load_frontier.(fun) in file "src/lib/transition_router/transition_router.ml", line 274, characters 4-160
Called from Transition_router.initialize.(fun) in file "src/lib/transition_router/transition_router.ml", line 355, characters 6-160
Called from Async_kernel__Deferred0.bind.(fun) in file "src/deferred0.ml", line 54, characters 64-69
Called from Async_kernel__Job_queue.run_job in file "src/job_queue.ml" (inlined), line 128, characters 2-5
Called from Async_kernel__Job_queue.run_jobs in file "src/job_queue.ml", line 169, characters 6-47
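For context, here is a minimal OCaml sketch of the failure shape the trace suggests: a load step reads a persistent root identifier from disk and calls failwith when it is absent, which surfaces as the monitor.ml.Error above. This is not the actual Mina/Transition_frontier code; every name and path below is hypothetical.

```ocaml
(* Minimal sketch of the failure shape suggested by the trace; this is not
   the actual Mina code, and every name below is hypothetical. *)

(* Stand-in for reading the root identifier from the on-disk frontier
   persistence; returning [None] models the state seen on restart. *)
let read_persistent_root_identifier ~root_dir:(_ : string) : string option =
  None

let load_from_persistence ~root_dir =
  match read_persistent_root_identifier ~root_dir with
  | Some id -> id
  | None ->
      (* The branch the node appears to hit: the identifier should have been
         written on the previous run but is not found on restart. *)
      failwith
        "no persistent root identifier found (should have been written already)"

let () =
  match load_from_persistence ~root_dir:"/tmp/frontier" with
  | id -> print_endline ("loaded root identifier: " ^ id)
  | exception Failure msg -> prerr_endline ("node crashes with: " ^ msg)
```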
Steps to Reproduce
Unclear at the moment. The node was restarted and then crashed.
Expected Result
The node should be able to load the persisted frontier and root.
Actual Result
The node crashed. We are seeing quite a few occurrences of this error. The impact is medium, as the node can be restarted, but it then needs to bootstrap.
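A minimal sketch of the expected behaviour, assuming hypothetical names rather than the actual Mina API: when the persistent root identifier is missing, the load step could report that persistence is unusable so the node falls back to bootstrap instead of crashing.

```ocaml
(* Hedged sketch of the expected behaviour, not the actual Mina API; all
   names below are hypothetical. Instead of crashing when the persistent
   root identifier is missing, the load step reports that persistence is
   unusable so the caller falls back to bootstrap. *)

type load_result =
  | Loaded of string (* persisted root identifier was found *)
  | Needs_bootstrap  (* persistence missing or unreadable *)

let read_persistent_root_identifier ~root_dir:(_ : string) : string option =
  None

let load_or_bootstrap ~root_dir =
  match read_persistent_root_identifier ~root_dir with
  | Some id -> Loaded id
  | None -> Needs_bootstrap

let () =
  match load_or_bootstrap ~root_dir:"/tmp/frontier" with
  | Loaded id -> print_endline ("loaded persisted frontier root: " ^ id)
  | Needs_bootstrap ->
      print_endline "persistence missing; bootstrapping from the network instead"
```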
How frequently do you see this issue?
Frequently
What is the impact of this issue on your ability to run a node?
Medium
Status
From the error report:
data.sync_status: Listening
data.timestamp: Oct 17, 2023 @ 15:57:54.000
data.uptime_of_node: 2.793m
Similar problem. For around 20 hours, the second BP node has kept crashing.
Server specs: AMD Ryzen 7 3700X, storage: 2x 1 TB M.2 NVMe SSD, RAM: 64 GB DDR4.
My report:
coda_crash_report_2023-10-27_14-00-24.514966.tar.gz