near / near-workspaces-js

Write tests once, run them both on NEAR TestNet and a controlled NEAR Sandbox local environment
https://near.github.io/near-workspaces-js/
GNU General Public License v3.0

feat-issue-215: increase randomness to decrease the rate of port colli… #219

Closed fospring closed 10 months ago

fospring commented 10 months ago

https://github.com/near/workspaces-js/issues/215

fospring commented 10 months ago

While debugging, I found this error log:

  Error (TypedError) {
    context: undefined,
    type: 'KeyNotFound',
    message: 'Can not sign transactions for account test.near on network sandbox, no matching key pair found in InMemorySigner(UnencryptedFileSystemKeyStore(/private/var/folders/gq/vqxz79yn511772w0bmy58ylc0000gn/T/sandbox/fe2348f4-406d-4ebd-a94a-e50030da3931)).',
  }

  TypeError {
    message: 'Cannot read properties of undefined (reading \'tearDown\')',
  }

So I checked the sandbox server log:

cd /private/var/folders/gq/vqxz79yn511772w0bmy58ylc0000gn/T/sandbox/fe2348f4-406d-4ebd-a94a-e50030da3931
cat sandboxServer.log


2023-09-09T07:31:23.539383Z  INFO neard: version="trunk" build="dbb72eb" latest_protocol=56
2023-09-09T07:31:23.539955Z  INFO near: Creating new RocksDB database path=/private/var/folders/gq/vqxz79yn511772w0bmy58ylc0000gn/T/sandbox/157d1397-8f81-4813-a141-e046e35705db/data
2023-09-09T07:31:23.985613Z  INFO db: Created a new RocksDB instance. num_instances=1
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Os { code: 48, kind: AddrInUse, message: "Address already in use" }', chain/jsonrpc/src/lib.rs:1370:6
stack backtrace:
   0:        0x102fa0aa0 - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::ha9a1a8bf7485458d
   1:        0x1025bb24c - core::fmt::write::h9ee5b099821ae5e1
   2:        0x102f7b658 - std::io::Write::write_fmt::h240449cbf2a536ee
   3:        0x102fa49ec - std::panicking::default_hook::{{closure}}::h321649ccf26de565
   4:        0x102fa5690 - std::panicking::rust_panic_with_hook::hd083a3aa5c934ce6
   5:        0x102fa5284 - std::panicking::begin_panic_handler::{{closure}}::hb5be8aaa10a229ca
   6:        0x102fa51fc - std::sys_common::backtrace::__rust_end_short_backtrace::h68e10e4f00198298
   7:        0x102fa51c8 - _rust_begin_unwind
   8:        0x10362a3b8 - core::panicking::panic_fmt::hffc63a015c61fdde
   9:        0x10362a570 - core::result::unwrap_failed::hd24fd75dfe8b9563
  10:        0x102d8ef60 - nearcore::start_with_config_and_synchronization::ha556b02c47cfb518
  11:        0x102502758 - <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll::he512b48d7158df0f
  12:        0x10250b680 - neard::cli::RunCmd::run::h08af4af5cfc2eb4f
  13:        0x102506c58 - neard::cli::NeardCmd::parse_and_run::h1e7ae7805519f4ea
  14:        0x10250fcc4 - neard::main::h7f7d173bfd47ba8f
  15:        0x1024e7090 - std::sys_common::backtrace::__rust_begin_short_backtrace::hf04232258bd82a1c
  16:        0x102519e00 - _main

The log shows a panic with "Address already in use". The error does not occur when I run a single test file; it only occurs, intermittently, when many test files run concurrently. So the initRpcPort may conflict between those test cases, or may already be in use by another program. We can widen the range of the initial port number to reduce potential port conflicts.
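The failure mode itself is easy to reproduce outside the sandbox. The following is a minimal sketch using only Node's built-in `net` module; the helper names `listen` and `demoAddrInUse` are hypothetical and not part of near-workspaces-js. It binds one listener to an OS-assigned port, then tries to bind a second listener to the same port and reports the resulting error code.

```typescript
import * as net from 'net';

// Hypothetical helper: resolve with a server once it is listening on the port,
// or reject with the bind error.
function listen(port: number): Promise<net.Server> {
  return new Promise((resolve, reject) => {
    const server = net.createServer();
    server.once('error', reject);
    server.listen(port, '0.0.0.0', () => resolve(server));
  });
}

// Hypothetical demo: bind the same port twice and return the error code
// raised by the second attempt (undefined if it unexpectedly succeeds).
async function demoAddrInUse(): Promise<string | undefined> {
  const first = await listen(0); // port 0 asks the OS for any free port
  const {port} = first.address() as net.AddressInfo;
  try {
    const second = await listen(port);
    second.close();
    return undefined;
  } catch (error: unknown) {
    return (error as NodeJS.ErrnoException).code;
  } finally {
    first.close();
  }
}

demoAddrInUse().then(code => console.log('second bind failed with:', code));
```

The second bind is expected to fail with `EADDRINUSE`, the Node counterpart of the `AddrInUse` error in the Rust panic above.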

The next port is allocated by this code:


 static async nextPort(): Promise<number> {
    this.lastPort = await portCheck.nextAvailable(this.lastPort + 1, '0.0.0.0');
    return this.lastPort;
  }

So change the initial port to reduce potential port conflicts:

// 5001-60000, increase the range of initialPort to decrease the possibility of port conflict
function initialPort(): number {
--   return Math.max(1024, Math.floor(Math.random() * 10_000));
++  return Math.max(5001, Math.floor(Math.random() * 60_000));
}

  static async nextPort(): Promise<number> {
    this.lastPort = await portCheck.nextAvailable(this.lastPort + 1, '0.0.0.0');
    return this.lastPort;
  }
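To get a rough feel for how much the wider range helps, here is a small sketch that computes the probability that two test processes draw the same initial port. It is a simplified model, not a measurement: it assumes `Math.random()` is uniform, looks only at two processes, and ignores the additional ports each server takes through `nextPort()`. The function names are hypothetical. Note that `Math.max(lo, ...)` maps every draw below `lo` onto `lo` itself, so that single port is picked far more often than any other.

```typescript
// Probability that two independent draws of
// Math.max(lo, Math.floor(Math.random() * hi)) yield the same port.
function clampedCollisionProbability(lo: number, hi: number): number {
  const pLo = (lo + 1) / hi; // draws 0..lo are all clamped to lo
  const otherPorts = hi - 1 - lo; // ports lo+1..hi-1, each with probability 1/hi
  return pLo * pLo + otherPorts / (hi * hi);
}

// Same question for an unclamped, uniform draw:
// lo + Math.floor(Math.random() * (hi - lo)).
function uniformCollisionProbability(lo: number, hi: number): number {
  return 1 / (hi - lo);
}

const before = clampedCollisionProbability(1024, 10_000);
const after = clampedCollisionProbability(5001, 60_000);
const uniform = uniformCollisionProbability(5001, 60_000);
console.log({before, after, uniform});
```

Under these assumptions the pairwise collision probability drops from about 1.06% to about 0.70% with the new range, and most of what remains comes from the clamped value. A uniform draw over the same range would bring it down to roughly 0.002%, which may be worth considering as a follow-up.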

If I change initialPort to return a constant, the error above occurs every time I run two or more test files concurrently.

function initialPort(): number {
  return 5000;
}

fospring commented 10 months ago

Every test case that calls Worker.init() starts a NEAR RPC node via SandboxServer.start(). If the port is already in use, the RPC node fails to start: https://github.com/near/workspaces-js/blob/main/packages/js/src/server/server.ts#L139C18-L139C18

  async start(): Promise<SandboxServer> {
    debug('Lifecycle.SandboxServer.start()');
    const args = [
      '--home',
      this.homeDir,
      'run',
      '--rpc-addr',
      `0.0.0.0:${this.port}`,
      '--network-addr',
      `0.0.0.0:${await SandboxServer.nextPort()}`,
    ];
    if (process.env.NEAR_WORKSPACES_DEBUG) {
      const filePath = join(this.homeDir, 'sandboxServer.log');
      debug(`near-sandbox logs writing to file: ${filePath}`);
      const fd = await open(filePath, 'a');
      this.subprocess = spawn(SandboxServer.binPath, args, {
        env: {RUST_BACKTRACE: 'full'},
        // @ts-expect-error FileHandle not assignable to Stream | IOType
        stdio: ['ignore', 'ignore', fd],
      });
      this.subprocess.on('exit', async () => {
        await fd.close();
      });
    } else {
      this.subprocess = spawn(SandboxServer.binPath, args, {
        stdio: ['ignore', 'ignore', 'ignore'],
      });
    }

ailisp commented 10 months ago

Excellent analysis and fix! Thank you! Is it possible to completely prevent this error? By making a more robust version of nextAvailable, or pre-calculating all ports before tests, or whatever simpler fix.

fospring commented 10 months ago

Excellent analysis and fix! Thank you! Is it possible to completely prevent this error? By making a more robust version of nextAvailable, or pre-calculating all ports before tests, or whatever simpler fix.

Yep, we need to check whether this port is in use by other programs or by other parallel test cases that call Worker.init().
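One common way to approach such a check is to let the operating system pick the port instead of guessing one. The sketch below is an illustration only, not the library's implementation: `getFreePort` is a hypothetical helper built on Node's `net` module. It binds a throwaway listener to port 0, reads back the port the OS assigned, and releases it.

```typescript
import * as net from 'net';

// Hypothetical helper: ask the OS for a currently unused TCP port on `host`.
function getFreePort(host = '0.0.0.0'): Promise<number> {
  return new Promise((resolve, reject) => {
    const probe = net.createServer();
    probe.once('error', reject);
    // Port 0 tells the OS to choose any free port.
    probe.listen(0, host, () => {
      const address = probe.address();
      if (address === null || typeof address === 'string') {
        probe.close(() => reject(new Error('unexpected address type')));
        return;
      }
      const {port} = address;
      // Release the port before handing it to the caller.
      probe.close(() => resolve(port));
    });
  });
}

getFreePort().then(port => console.log('free port:', port));
```

This does not remove the race entirely: another process can still grab the port between the probe closing and the sandbox binding it, so retrying the server start with a fresh port when it fails would likely still be needed.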