matter-labs / zksync-era

zkSync era
Apache License 2.0

Bug: witness_generator fails with "Could not find a protocol version for my commitments. Is gateway running?" #984

Closed SuccinctPaul closed 8 months ago

SuccinctPaul commented 8 months ago

πŸ› Bug Report

πŸ“ Description

I'm trying to run the server locally by following:

https://github.com/matter-labs/zksync-era/blob/9b701e70a4046cf9a84ce264d71e4ae0e8835f88/prover/prover_fri/README.md#L90-L114

After starting the server and zksync_prover_fri_gateway, starting witness_generator fails with an error.

🔄 Reproduction Steps

Follow https://github.com/matter-labs/zksync-era/blob/9b701e70a4046cf9a84ce264d71e4ae0e8835f88/prover/prover_fri/README.md to run the whole node and

Because the stock zk init fails with some errors, you can use my bugfix branch to run it:

https://github.com/ChengYueJia/zksync-era/tree/feat/prover 583e56771275b257ff00724278849eff4fd01676

🤔 Expected Behavior

Everything works fine.

😯 Current Behavior

I've added more detailed logging around the point of failure. It shows that batch_size is None and protocol_versions is empty, so there are two things to debug.

    // If batch_size is none, it means that the job is 'looping forever' (this is the usual setup in local network).
    // At the same time, we're reading the protocol_version only once at startup - so if there is no protocol version
    // read (this is often due to the fact, that the gateway was started too late, and it didn't put the updated protocol
    // versions into the database) - then the job will simply 'hang forever' and not pick any tasks.
    if opt.batch_size.is_none() && protocol_versions.is_empty() {
        tracing::error!("batch_size is empty: {:?}", opt.batch_size.is_none());
        tracing::error!("protocol_versions is empty: {:?}", protocol_versions.is_empty());
        panic!(
            "Could not find a protocol version for my commitments. Is gateway running?  Maybe you started this job before gateway updated the database? Commitments: {:?}",
            vk_commitments
        );
    }
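To make the condition above concrete, here is a minimal, self-contained sketch of the same guard as a pure function. The name check_startup and the simplified types (Option<usize> for batch_size, a slice of u16 ids for protocol_versions) are assumptions for illustration, not the actual witness generator types.

```rust
/// Hypothetical stand-in for the guard shown above (not the real code).
/// Returns Err when the job would loop forever without ever picking up work.
fn check_startup(batch_size: Option<usize>, protocol_versions: &[u16]) -> Result<(), String> {
    if batch_size.is_none() && protocol_versions.is_empty() {
        return Err("no protocol version matches the local commitments".to_string());
    }
    Ok(())
}

fn main() {
    // The case from this report: no batch size and no matching protocol version.
    println!("{:?}", check_startup(None, &[]));
    // A bounded run (batch_size set) passes the guard even with no versions.
    println!("{:?}", check_startup(Some(1), &[]));
    // A matching protocol version passes the guard.
    println!("{:?}", check_startup(None, &[2]));
}
```

Only the first call returns an error, which corresponds to the two "is empty: true" lines in the log output.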
Debugging protocol_versions:

I've queried the prover_fri_protocol_versions table in the DB. The result is shown below, and it appears to match the query condition in the VerifierParams from the error log. (Note: the representations differ. The log prints the values as H256, while the DB stores them as H256::as_bytes.)

 id |                 recursion_scheduler_level_vk_hash                  |                    recursion_node_level_vk_hash                    |                    recursion_leaf_level_vk_hash                    |                  recursion_circuits_set_vks_hash                   |         created_at
----+--------------------------------------------------------------------+--------------------------------------------------------------------+--------------------------------------------------------------------+--------------------------------------------------------------------+----------------------------
  2 | \x750d8e21be7555a6841472a5cacd24c75a7ceb34261aea61e72bb7423a7d30fc | \x5a3ef282b21e12fe1f4438e5bb158fc5060b160559c5158c6389d62d9fe3d080 | \x14628525c227822148e718ca1138acfc6d25e759e19452455d89f7f610c3dcb8 | \x0000000000000000000000000000000000000000000000000000000000000000 | 2024-01-31 09:30:27.259594
(1 row)
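Because the DB prints bytea values with a \x prefix and the log prints H256 values with a 0x prefix, comparing them by eye is error-prone. A small sketch of a field-by-field comparison, using the values copied from the row above and from the panic message in the log output (the names normalize, mismatched_fields and ROWS are made up for this example):

```rust
/// (field name, value from the DB row, value from the panic message).
const ROWS: [(&str, &str, &str); 4] = [
    ("recursion_scheduler_level_vk_hash",
     "\\x750d8e21be7555a6841472a5cacd24c75a7ceb34261aea61e72bb7423a7d30fc",
     "0x750d8e21be7555a6841472a5cacd24c75a7ceb34261aea61e72bb7423a7d30fc"),
    ("recursion_node_level_vk_hash",
     "\\x5a3ef282b21e12fe1f4438e5bb158fc5060b160559c5158c6389d62d9fe3d080",
     "0x5a3ef282b21e12fe1f4438e5bb158fc5060b160559c5158c6389d62d9fe3d080"),
    ("recursion_leaf_level_vk_hash",
     "\\x14628525c227822148e718ca1138acfc6d25e759e19452455d89f7f610c3dcb8",
     "0x062362cb3eaf1f631406cbe19bf2a2c5d0d9ea69d069309a6003addae9f387be"),
    ("recursion_circuits_set_vks_hash",
     "\\x0000000000000000000000000000000000000000000000000000000000000000",
     "0x0000000000000000000000000000000000000000000000000000000000000000"),
];

/// Strip the `\x` (psql bytea) or `0x` (H256 debug) prefix and lowercase the hex.
fn normalize(hash: &str) -> String {
    let hex = hash
        .strip_prefix("\\x")
        .or_else(|| hash.strip_prefix("0x"))
        .unwrap_or(hash);
    hex.to_ascii_lowercase()
}

/// Names of the fields whose DB and log values differ after normalization.
fn mismatched_fields<'a>(rows: &[(&'a str, &str, &str)]) -> Vec<&'a str> {
    rows.iter()
        .filter(|(_, db, log)| normalize(db) != normalize(log))
        .map(|(name, _, _)| *name)
        .collect()
}

fn main() {
    for name in mismatched_fields(&ROWS) {
        println!("mismatch: {}", name);
    }
    // prints "mismatch: recursion_leaf_level_vk_hash"
}
```

With these values, only recursion_leaf_level_vk_hash differs between the DB row and the local commitments, which would be consistent with the lookup returning no protocol version.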
Debugging batch_size:

🖥️ Environment

📋 Additional Context

📎 Log Output

warning: `zksync_witness_generator` (bin "zksync_witness_generator") generated 1 warning
    Finished release [optimized] target(s) in 3m 03s
     Running `target/release/zksync_witness_generator --all_rounds`
2024-01-31T14:09:31.089275Z  INFO zksync_witness_generator: No sentry URL was provided
2024-01-31T14:09:31.093017Z  INFO zksync_dal::connection: Created pool with 50 max connections and None statement timeout
2024-01-31T14:09:31.095859Z  INFO zksync_dal::connection: Created pool with 1 max connections and None statement timeout
2024-01-31T14:09:47.034616Z ERROR zksync_witness_generator: batch_size is empty: true
2024-01-31T14:09:47.034636Z ERROR zksync_witness_generator: protocol_versions is empty: true
thread 'main' panicked at witness_generator/src/main.rs:123:9:
Could not find a protocol version for my commitments. Is gateway running?  Maybe you started this job before gateway updated the database? Commitments: L1VerifierConfig { params: VerifierParams { recursion_node_level_vk_hash: 0x5a3ef282b21e12fe1f4438e5bb158fc5060b160559c5158c6389d62d9fe3d080, recursion_leaf_level_vk_hash: 0x062362cb3eaf1f631406cbe19bf2a2c5d0d9ea69d069309a6003addae9f387be, recursion_circuits_set_vks_hash: 0x0000000000000000000000000000000000000000000000000000000000000000 }, recursion_scheduler_level_vk_hash: 0x750d8e21be7555a6841472a5cacd24c75a7ceb34261aea61e72bb7423a7d30fc }
stack backtrace:
   0:     0x55a4264a751c - std::backtrace_rs::backtrace::libunwind::trace::h68b3f03db065435f
                               at /rustc/5c6a7e71cd66705c31c9af94077901a220f0870c/library/std/src/../../backtrace/src/backtrace/libunwind.rs:93:5
   1:     0x55a4264a751c - std::backtrace_rs::backtrace::trace_unsynchronized::hcc29d957a519c467
                               at /rustc/5c6a7e71cd66705c31c9af94077901a220f0870c/library/std/src/../../backtrace/src/backtrace/mod.rs:66:5
   2:     0x55a4264a751c - std::sys_common::backtrace::_print_fmt::h51e5567c7446b80e
                               at /rustc/5c6a7e71cd66705c31c9af94077901a220f0870c/library/std/src/sys_common/backtrace.rs:67:5
   3:     0x55a4264a751c - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::h54890bb4d0b3c330
                               at /rustc/5c6a7e71cd66705c31c9af94077901a220f0870c/library/std/src/sys_common/backtrace.rs:44:22
   4:     0x55a4264d4e7c - core::fmt::rt::Argument::fmt::heae7cd8ccc900aeb
                               at /rustc/5c6a7e71cd66705c31c9af94077901a220f0870c/library/core/src/fmt/rt.rs:138:9
   5:     0x55a4264d4e7c - core::fmt::write::hc761729f1163b685
                               at /rustc/5c6a7e71cd66705c31c9af94077901a220f0870c/library/core/src/fmt/mod.rs:1094:21
   6:     0x55a4264a352e - std::io::Write::write_fmt::hd8de790e313b58ee
                               at /rustc/5c6a7e71cd66705c31c9af94077901a220f0870c/library/std/src/io/mod.rs:1714:15
   7:     0x55a4264a7304 - std::sys_common::backtrace::_print::h1b2adaff5c947b3b
                               at /rustc/5c6a7e71cd66705c31c9af94077901a220f0870c/library/std/src/sys_common/backtrace.rs:47:5
   8:     0x55a4264a7304 - std::sys_common::backtrace::print::hf6a9b81f93e62890
                               at /rustc/5c6a7e71cd66705c31c9af94077901a220f0870c/library/std/src/sys_common/backtrace.rs:34:9
   9:     0x55a4264a913a - std::panicking::panic_hook_with_disk_dump::{{closure}}::h663629ab1a73f13e
                               at /rustc/5c6a7e71cd66705c31c9af94077901a220f0870c/library/std/src/panicking.rs:278:22
  10:     0x55a4264a8e27 - std::panicking::panic_hook_with_disk_dump::h7dc7628c0cba33b1
                               at /rustc/5c6a7e71cd66705c31c9af94077901a220f0870c/library/std/src/panicking.rs:312:9
  11:     0x55a4264a973b - std::panicking::default_hook::h4cb3ec769d1db09c
                               at /rustc/5c6a7e71cd66705c31c9af94077901a220f0870c/library/std/src/panicking.rs:239:5
  12:     0x55a4264a973b - std::panicking::rust_panic_with_hook::hdd1521f6ef2543b7
                               at /rustc/5c6a7e71cd66705c31c9af94077901a220f0870c/library/std/src/panicking.rs:729:13
  13:     0x55a4264a9637 - std::panicking::begin_panic_handler::{{closure}}::h7c4ede115dee30b0
                               at /rustc/5c6a7e71cd66705c31c9af94077901a220f0870c/library/std/src/panicking.rs:621:13
  14:     0x55a4264a7a46 - std::sys_common::backtrace::__rust_end_short_backtrace::habd7cadfbd319a50
                               at /rustc/5c6a7e71cd66705c31c9af94077901a220f0870c/library/std/src/sys_common/backtrace.rs:170:18
  15:     0x55a4264a9382 - rust_begin_unwind
                               at /rustc/5c6a7e71cd66705c31c9af94077901a220f0870c/library/std/src/panickiubuntu@ip-172-31-26-32:/mnt/paul/zksync-era$ make login_db
c/rt.rs:148:20
  30:     0x55a42567a885 - main
  31:     0x7f72ddc29d90 - <unknown>
  32:     0x7f72ddc29e40 - __libc_start_main
  33:     0x55a4255ee945 - _start
  34:                0x0 - <unknown>
(node:58756) [DEP0040] DeprecationWarning: The `punycode` module is deprecated. Please use a userland alternative instead.
(Use `node --trace-deprecation ...` to show where the warning was created)
SuccinctPaul commented 8 months ago

It seems this has been fixed in a newer version: 0759fb76c4445da394c7236ac17d998deff12d93