MG CI issues - Githubissues

plebhash commented 4 months ago

even though we merged https://github.com/stratum-mining/stratum/pull/905, we still haven't fixed all issues on MG CI

@Fi3 has some suggestions here: https://github.com/stratum-mining/stratum/pull/905#issuecomment-2122146609

cc @lorbax

plebhash commented 4 months ago

while working on https://github.com/stratum-mining/stratum/pull/927, I also noticed some CI issues. While running bad-pool-config-test:

EXECUTING ../../test/message-generator/test/bad-pool-config-test/bad-pool-config-test.json

STD ERR:     Finished dev [unoptimized + debuginfo] target(s) in 0.06s
STD ERR:      Running `target/debug/message_generator_sv2 ../../test/message-generator/mock/template-provider-mock0.json`
STD OUT: 
STD OUT: EXECUTING ../../test/message-generator/mock/template-provider-mock0.json
STD OUT: 
STD ERR: info: cargo-llvm-cov currently setting cfg(coverage); you can opt-out it by passing --no-cfg-coverage
STD ERR:    Compiling pool_sv2 v0.1.1 (/home/runner/work/stratum/stratum/roles/pool)
STD ERR:     Finished dev [optimized + debuginfo] target(s) in 15.71s
STD ERR:      Running `target/llvm-cov-target/debug/pool_sv2 -c ../test/config/pool-mock-tp-bad-coinbase.toml`
STD OUT: 2024-05-23T18:17:30.334654Z  INFO pool_sv2: Pool INITIALIZING with config: "../test/config/pool-mock-tp-bad-coinbase.toml"
STD OUT: 2024-05-23T18:17:30.335399Z ERROR pool_sv2: Failed to get coinbase output: InvalidOutputScript
STD ERR: info: cargo-llvm-cov currently setting cfg(coverage); you can opt-out it by passing --no-cfg-coverage
STD ERR:     Finished dev [optimized + debuginfo] target(s) in 0.11s
STD ERR:      Running `target/llvm-cov-target/debug/pool_sv2 -h -c ../test/config/pool-mock-tp.toml`
STD OUT: 2024-05-23T18:17:30.679278Z ERROR pool_sv2: Usage: -h/--help, -c/--config <path|default pool-config.toml>
STD ERR: info: cargo-llvm-cov currently setting cfg(coverage); you can opt-out it by passing --no-cfg-coverage
STD ERR:     Finished dev [optimized + debuginfo] target(s) in 0.11s
STD ERR:      Running `target/llvm-cov-target/debug/pool_sv2 -c ../test/config/pool-mock-tp-bad-config.toml`
STD OUT: 2024-05-23T18:17:31.020502Z ERROR pool_sv2: Failed to parse config: missing field `listen_address` at line 1 column 1
STD ERR: info: cargo-llvm-cov currently setting cfg(coverage); you can opt-out it by passing --no-cfg-coverage
STD ERR:     Finished dev [optimized + debuginfo] target(s) in 0.11s
STD ERR:      Running `target/llvm-cov-target/debug/pool_sv2 -c ../test/config/no-config-at-this-path.toml`
STD OUT: 2024-05-23T18:17:31.362164Z ERROR pool_sv2: Failed to read config: No such file or directory (os error 2)
STD ERR: info: cargo-llvm-cov currently setting cfg(coverage); you can opt-out it by passing --no-cfg-coverage
STD ERR:     Finished dev [optimized + debuginfo] target(s) in 0.11s
STD ERR:      Running `target/llvm-cov-target/debug/pool_sv2 -c ../test/config/pool-mock-tp.toml`
STD OUT: 2024-05-23T18:17:31.702206Z  INFO pool_sv2: Pool INITIALIZING with config: "../test/config/pool-mock-tp.toml"
STD OUT: 2024-05-23T18:17:31.702740Z  INFO pool_sv2::lib::template_receiver: Connected to template distribution server at 127.0.0.1:8442
STD OUT: 2024-05-23T18:17:31.703177Z DEBUG network_helpers_sv2::noise_connection_tokio: Initializing as downstream for - 127.0.0.1:8442
STD OUT: Initializing as upstream for - 127.0.0.1:4[490](https://github.com/stratum-mining/stratum/actions/runs/9212808881/job/25345564538?pr=927#step:6:491)6
STD OUT: 2024-05-23T18:17:31.709196Z DEBUG network_helpers_sv2::noise_connection_tokio: Noise handshake complete - 127.0.0.1:8442
STD OUT: Noise handshake complete - 127.0.0.1:44906
STD OUT: actiondoc: "This action checks that a Setupconnection message is received"
STD OUT: Working on result 1/1: MatchMessageType: 0 (0x0)
PanicInfo {
    payload: Any { .. },
    message: Some(
        explicit panic,
    ),
    location: Location {
        file: "src/external_commands.rs",
        line: 286,
        col: 21,
    },
    can_unwind: true,
    force_no_backtrace: false,
}
thread 'main' panicked at src/main.rs:438:13:
TEST FAILED
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
mg test failed
Error: Process completed with exit code 1.

plebhash commented 4 months ago

another one while running translation-proxy on #929:

EXECUTING ../../test/message-generator/test/translation-proxy/translation-proxy.json

...
STD OUT: 2024-05-23T19:44:16.766948Z DEBUG roles_logic_sv2::channel_logic::channel_factory: Upstream target: "000002cbd3e84cb8c95196d545017a21b956ff20c06c583a8aaebeee82979711"
STD OUT: 2024-05-23T19:44:16.766954Z DEBUG roles_logic_sv2::channel_logic::channel_factory: Hash: "000062117926ff023b81b40081470d60629b7122e89a344de62e863e97f56a17"
STD OUT: 2024-05-23T19:44:16.766959Z DEBUG translator_sv2::lib::proxy::bridge: SHARE MEETS DOWNSTREAM TARGET
STD OUT: 2024-05-23T19:44:16.766965Z DEBUG translator_sv2::lib::downstream_sv1::downstream: Sending to Mining Device: 127.0.0.1:52420 - "{\"id\":0,\"error\":null,\"result\":true}\n"
RECV Sv2Frame {
    header: Header {
        extension_type: 32768,
        msg_type: 22,
        msg_length: U24(
            40,
        ),
    },
    payload: None,
    serialized: Some(
        Slice {
            offset: 0x00007fbc2a97a010,
            len: 46,
            index: 1,
            shared_state: SharedState(
                128,
            ),
            owned: None,
        },
    ),
}
WRONG MESSAGE TYPE expected: 27 received: 22
STD OUT: 2024-05-23T19:44:16.819583Z ERROR translator_sv2::lib::status: Error: Io(Custom { kind: ConnectionAborted, error: "Connection closed by client" })
STD OUT: 2024-05-23T19:44:16.819617Z  WARN translator_sv2::lib::downstream_sv1::downstream: Downstream: Shutting down sv1 downstream reader
STD OUT: 2024-05-23T19:44:16.819646Z  INFO translator_sv2: HEALTHY message: I/O error: `Custom { kind: ConnectionAborted, error: "Connection closed by client" }
STD OUT: 2024-05-23T19:44:16.819714Z  WARN translator_sv2::lib::downstream_sv1::downstream: Downstream: Shutting down sv1 downstream job notifier for 127.0.0.1:52420
STD OUT: 2024-05-23T19:44:16.819725Z ERROR translator_sv2::lib::status: Error: ChannelErrorReceiver(RecvError)
STD OUT: 2024-05-23T19:44:16.819737Z  WARN translator_sv2::lib::downstream_sv1::downstream: Downstream: Shutting down sv1 downstream writer: 127.0.0.1:52420
STD OUT: 2024-05-23T19:44:16.819742Z  INFO translator_sv2: HEALTHY message: Channel receive error: `RecvError`
STD OUT: 2024-05-23T19:44:16.826333Z  INFO translator_sv2: Interrupt received
Disconnected from client while reading : early eof - 127.0.0.1:53086
PanicInfo {
    payload: Any { .. },
    message: Some(
        test failed!!!,
    ),
    location: Location {
        file: "src/executor.rs",
        line: 786,
        col: 13,
    },
    can_unwind: true,
    force_no_backtrace: false,
}
Disconnecting from client due to error receiving: receiving from an empty and closed channel - 127.0.0.1:53086
thread 'main' panicked at src/main.rs:438:13:
TEST FAILED
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
mg test failed

lorbax commented 4 months ago

I am aware of the two errors. The first actually I don't have any explaination. Perhaps the second it is not an error, we should instead add to the MG the capability to match a message with a type A or a type B.

lorbax commented 4 months ago

The second error (that comes from translation-proxy.json) is just the TProxy that sends UpdateChannel. This is used to update the channel nominal hashrate. If the hashrate of the channel is different from the hashrate in the TProxy config, this message is likely to be sent. This is why we see it often. On the otherhand, translation-proxy.json fails with this error, it is just a false negative. @Fi3 correct me if I am wrong I am going to add the capability above to the MG.

rrybarczyk commented 3 months ago

From the root directory, I ran % source test/message-generator/test/bad-pool-config-test/bad-pool-config-test.sh. I did get the following errors:

EXECUTING ../../test/message-generator/test/bad-pool-config-test/bad-pool-config-test.json

STD ERR:     Finished dev [unoptimized + debuginfo] target(s) in 0.09s
STD ERR:      Running `target/debug/message_generator_sv2 ../../test/message-generator/mock/template-provider-mock0.json`
STD ERR: info: cargo-llvm-cov currently setting cfg(coverage); you can opt-out it by passing --no-cfg-coverage
STD ERR:     Finished dev [optimized + debuginfo] target(s) in 0.12s
STD ERR:      Running `target/llvm-cov-target/debug/pool_sv2 -c ../test/config/pool-mock-tp-bad-coinbase.toml`
STD OUT: 2024-07-02T16:31:56.485043Z  INFO pool_sv2: Pool INITIALIZING with config: "../test/config/pool-mock-tp-bad-coinbase.toml"
STD OUT: 2024-07-02T16:31:56.485121Z ERROR pool_sv2: Failed to get coinbase output: InvalidOutputScript
STD OUT:
STD OUT: EXECUTING ../../test/message-generator/mock/template-provider-mock0.json
STD OUT:
STD ERR: info: cargo-llvm-cov currently setting cfg(coverage); you can opt-out it by passing --no-cfg-coverage
STD ERR:     Finished dev [optimized + debuginfo] target(s) in 0.10s
STD ERR:      Running `target/llvm-cov-target/debug/pool_sv2 -h -c ../test/config/pool-mock-tp.toml`
STD OUT: 2024-07-02T16:31:57.303725Z ERROR pool_sv2: Usage: -h/--help, -c/--config <path|default pool-config.toml>
STD ERR: info: cargo-llvm-cov currently setting cfg(coverage); you can opt-out it by passing --no-cfg-coverage
STD ERR:     Finished dev [optimized + debuginfo] target(s) in 0.11s
STD ERR:      Running `target/llvm-cov-target/debug/pool_sv2 -c ../test/config/pool-mock-tp-bad-config.toml`
STD OUT: 2024-07-02T16:31:58.142590Z ERROR pool_sv2: Failed to parse config: missing field `listen_address` at line 1 column 1
STD ERR: info: cargo-llvm-cov currently setting cfg(coverage); you can opt-out it by passing --no-cfg-coverage
STD ERR:     Finished dev [optimized + debuginfo] target(s) in 0.11s
STD ERR:      Running `target/llvm-cov-target/debug/pool_sv2 -c ../test/config/no-config-at-this-path.toml`
STD OUT: 2024-07-02T16:31:58.996300Z ERROR pool_sv2: Failed to read config: No such file or directory (os error 2)
STD ERR: info: cargo-llvm-cov currently setting cfg(coverage); you can opt-out it by passing --no-cfg-coverage
STD ERR:     Finished dev [optimized + debuginfo] target(s) in 0.10s
STD ERR:      Running `target/llvm-cov-target/debug/pool_sv2 -c ../test/config/pool-mock-tp.toml`
STD OUT: 2024-07-02T16:31:59.821938Z  INFO pool_sv2: Pool INITIALIZING with config: "../test/config/pool-mock-tp.toml"
STD OUT: 2024-07-02T16:31:59.822558Z  INFO pool_sv2::lib::template_receiver: Connected to template distribution server at 127.0.0.1:8442
STD OUT: 2024-07-02T16:31:59.823032Z DEBUG network_helpers_sv2::noise_connection_tokio: Initializing as downstream for - 127.0.0.1:8442
STD OUT: Initializing as upstream for - 127.0.0.1:59322
STD OUT: Noise handshake complete - 127.0.0.1:59322
STD OUT: actiondoc: "This action checks that a Setupconnection message is received"
STD OUT: Working on result 1/1: MatchMessageType: 0 (0x0)

However, despite these errors, my test passes:

Initializing as downstream for - 127.0.0.1:34254
STD OUT: 2024-07-02T16:31:59.830867Z DEBUG network_helpers_sv2::noise_connection_tokio: Initializing as upstream for - 127.0.0.1:59323
STD OUT: 2024-07-02T16:31:59.831802Z DEBUG network_helpers_sv2::noise_connection_tokio: Noise handshake complete - 127.0.0.1:59323
Noise handshake complete - 127.0.0.1:34254
Disconnected from client while reading : early eof - 127.0.0.1:34254
TEST OK
Disconnecting from client due to error receiving: receiving from an empty and closed channel - 127.0.0.1:34254

Which is weird. Here is the full logs: mg-test-bad-pool-config.txt.

average-gary commented 3 months ago

Sampling 10 failed tests I get Run translation-proxy (4 times) and Run bad-pool-config-test (5) as top failures.

plebhash commented 3 months ago

From the root directory, I ran % source test/message-generator/test/bad-pool-config-test/bad-pool-config-test.sh. I did get the following errors:

EXECUTING ../../test/message-generator/test/bad-pool-config-test/bad-pool-config-test.json

STD ERR:     Finished dev [unoptimized + debuginfo] target(s) in 0.09s
STD ERR:      Running `target/debug/message_generator_sv2 ../../test/message-generator/mock/template-provider-mock0.json`
STD ERR: info: cargo-llvm-cov currently setting cfg(coverage); you can opt-out it by passing --no-cfg-coverage
STD ERR:     Finished dev [optimized + debuginfo] target(s) in 0.12s
STD ERR:      Running `target/llvm-cov-target/debug/pool_sv2 -c ../test/config/pool-mock-tp-bad-coinbase.toml`
STD OUT: 2024-07-02T16:31:56.485043Z  INFO pool_sv2: Pool INITIALIZING with config: "../test/config/pool-mock-tp-bad-coinbase.toml"
STD OUT: 2024-07-02T16:31:56.485121Z ERROR pool_sv2: Failed to get coinbase output: InvalidOutputScript
STD OUT:
STD OUT: EXECUTING ../../test/message-generator/mock/template-provider-mock0.json
STD OUT:
STD ERR: info: cargo-llvm-cov currently setting cfg(coverage); you can opt-out it by passing --no-cfg-coverage
STD ERR:     Finished dev [optimized + debuginfo] target(s) in 0.10s
STD ERR:      Running `target/llvm-cov-target/debug/pool_sv2 -h -c ../test/config/pool-mock-tp.toml`
STD OUT: 2024-07-02T16:31:57.303725Z ERROR pool_sv2: Usage: -h/--help, -c/--config <path|default pool-config.toml>
STD ERR: info: cargo-llvm-cov currently setting cfg(coverage); you can opt-out it by passing --no-cfg-coverage
STD ERR:     Finished dev [optimized + debuginfo] target(s) in 0.11s
STD ERR:      Running `target/llvm-cov-target/debug/pool_sv2 -c ../test/config/pool-mock-tp-bad-config.toml`
STD OUT: 2024-07-02T16:31:58.142590Z ERROR pool_sv2: Failed to parse config: missing field `listen_address` at line 1 column 1
STD ERR: info: cargo-llvm-cov currently setting cfg(coverage); you can opt-out it by passing --no-cfg-coverage
STD ERR:     Finished dev [optimized + debuginfo] target(s) in 0.11s
STD ERR:      Running `target/llvm-cov-target/debug/pool_sv2 -c ../test/config/no-config-at-this-path.toml`
STD OUT: 2024-07-02T16:31:58.996300Z ERROR pool_sv2: Failed to read config: No such file or directory (os error 2)
STD ERR: info: cargo-llvm-cov currently setting cfg(coverage); you can opt-out it by passing --no-cfg-coverage
STD ERR:     Finished dev [optimized + debuginfo] target(s) in 0.10s
STD ERR:      Running `target/llvm-cov-target/debug/pool_sv2 -c ../test/config/pool-mock-tp.toml`
STD OUT: 2024-07-02T16:31:59.821938Z  INFO pool_sv2: Pool INITIALIZING with config: "../test/config/pool-mock-tp.toml"
STD OUT: 2024-07-02T16:31:59.822558Z  INFO pool_sv2::lib::template_receiver: Connected to template distribution server at 127.0.0.1:8442
STD OUT: 2024-07-02T16:31:59.823032Z DEBUG network_helpers_sv2::noise_connection_tokio: Initializing as downstream for - 127.0.0.1:8442
STD OUT: Initializing as upstream for - 127.0.0.1:59322
STD OUT: Noise handshake complete - 127.0.0.1:59322
STD OUT: actiondoc: "This action checks that a Setupconnection message is received"
STD OUT: Working on result 1/1: MatchMessageType: 0 (0x0)

However, despite these errors, my test passes:

Initializing as downstream for - 127.0.0.1:34254
STD OUT: 2024-07-02T16:31:59.830867Z DEBUG network_helpers_sv2::noise_connection_tokio: Initializing as upstream for - 127.0.0.1:59323
STD OUT: 2024-07-02T16:31:59.831802Z DEBUG network_helpers_sv2::noise_connection_tokio: Noise handshake complete - 127.0.0.1:59323
Noise handshake complete - 127.0.0.1:34254
Disconnected from client while reading : early eof - 127.0.0.1:34254
TEST OK
Disconnecting from client due to error receiving: receiving from an empty and closed channel - 127.0.0.1:34254

Which is weird. Here is the full logs: mg-test-bad-pool-config.txt.

There's nothing wrong with these logs, as far as I can tell.

I think you're just misinterpreting them. This specific test is called bad-pool-config-test because it is testing a scenario where errors are indeed expected.

Note that all errors logs are coming from STD ERR and STD OUT, which is how MG is internally relaying the logs from the roles executable to the user. So in this specific case, we have a pool_sv executable being loaded with a bad config file. The pool's executable will spit out a bunch of errors over STD ERR (which is the expected behavior that this test is asserting for), and that's what you are seeing there.

The strings TEST OK vs TEST FAILED will indicate if the test passed or not. In this case, the pool was loaded with bad configs, it broke as expected, and test is ok.

pavlenex commented 2 months ago

Closing in favor of its big-brother #1028 where will will keep a track of it

stratum-mining / stratum

MG CI issues #928