mimblewimble / grin-wallet

Grin Wallet
Apache License 2.0

wallet randomly loses access to the local node #561

Open marekyggdrasil opened 3 years ago

marekyggdrasil commented 3 years ago

Describe the bug

During the grinnode.live Winter 2020 Bug Bash Challenge, multiple testers (@mojitoo, @chandrashekar10 and myself) observed that at some point the wallet is no longer able to reach the local node.

Reports: @chandrashekar10 report, @mojitoo report, our reproduction report

The only reason the affected test cases could be completed was the ability to use grinnode.live as a remote node; otherwise the wallets could not sync.

To Reproduce

We were not able to determine what caused the local node to become unreachable. For 3 of 9 testers it simply started to occur at some point. We cannot provide detailed steps, but in our test about a third of the people who installed the node and wallet and used them long enough lost access to the locally running node.

Expected behavior

We would expect the locally running node to always be reachable as long as it is properly configured.

Screenshots

N/A

Desktop (please complete the following information):

@mojitoo environment

Darwin MacBook-Pro-de-Workstation.local 19.5.0 Darwin Kernel Version 19.5.0: Tue May 26 20:41:44 PDT 2020; root:xnu-6153.121.2~2/RELEASE_X86_64 x86_64

@chandrashekar10 environment

Computer Model:   MacBook Air 2019 model
Operating system: macOS Mojave version 10.14.6
Rustup version:   rustc 1.48.0 (7eac88abb 2020-11-16)
Clang version:    clang version 11.0.0
                  Target: x86_64-apple-darwin18.7.0
                  Thread model: posix
Open SSL:         LibreSSL 2.6.5

Our environment

Linux 4.19.0-12-amd64 #1 SMP Debian 4.19.152-1 (2020-10-18) x86_64 unknown unknown GNU/Linux

Additional context

Reports: @chandrashekar10 report, @mojitoo report, our reproduction report

mojitoo commented 3 years ago

Wallet version: 5.0.1
Node version: grin-v5.0.0-rc.2-macos
OS: macOS Catalina

Still getting this error when the wallet tries to connect to the node:

20210113 16:16:20.732 ERROR grin_wallet_impls::node_clients::http - Error calling get_version: ResponseError error: Cannot parse response
20210113 16:16:20.732 ERROR grin_wallet_impls::node_clients::http - Unable to contact Node to get version info: Client Callback Error: Error calling get_version: ResponseError error: Cannot parse response
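To see what the node actually returns for the failing call, one option is to send the same request by hand and print the raw body whenever it is not valid JSON. Below is a minimal Python sketch of that idea. The URL (port 3413, path /v2/foreign), the JSON-RPC shape of the get_version call, and the helper names build_request, describe_response and probe are all assumptions made for illustration; none of them come from this thread. If the node has an API secret configured, the request would also need HTTP basic auth, which this sketch leaves out.

```python
import json
import urllib.request

# Assumed default address of the node's foreign API; adjust to your grin-server.toml.
NODE_URL = "http://127.0.0.1:3413/v2/foreign"


def build_request(method, params=None, req_id=1):
    """Return a JSON-RPC 2.0 request body as bytes."""
    payload = {
        "jsonrpc": "2.0",
        "method": method,
        "params": params or [],
        "id": req_id,
    }
    return json.dumps(payload).encode("utf-8")


def describe_response(raw):
    """Classify a raw HTTP body.

    Returns ("json", parsed_value) when the body is valid JSON,
    otherwise ("unparseable", first 500 bytes of the body).
    """
    try:
        return ("json", json.loads(raw.decode("utf-8")))
    except (UnicodeDecodeError, json.JSONDecodeError):
        return ("unparseable", raw[:500])


def probe(url=NODE_URL, timeout=10):
    """POST get_version to the node and report the status and the body."""
    req = urllib.request.Request(
        url,
        data=build_request("get_version"),
        headers={"Content-Type": "application/json"},
    )
    # urlopen raises URLError if the node is unreachable and
    # HTTPError for 4xx/5xx statuses (e.g. missing credentials).
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.status, describe_response(resp.read())
```

Running `print(probe())` on an affected machine would show either the parsed reply or the first bytes of whatever the wallet failed to parse, which could then be attached to this issue.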

jaspervdm commented 3 years ago

Unfortunately, I am not able to reproduce this.

marekyggdrasil commented 3 years ago

Understandable, @jaspervdm, it is a weird problem and not easy to reproduce... I have an idea: maybe we can use some simple tool to proxy local traffic between the wallet and the node, and record the exact request from the wallet and the response from the node? The error says "Cannot parse response", so I'm curious what response it receives that cannot be parsed.

If we set the log level to info, would the exact response received appear in the wallet logs? Maybe that would be worth trying?
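The proxy idea could be sketched roughly as below: a small TCP relay in Python that sits between the wallet and the node and records every chunk it forwards in each direction. This is an illustrative sketch, not an existing tool; the function names pump, handle and serve and the labels "wallet->node" and "node->wallet" are made up for the example. It assumes the wallet talks plain HTTP to the node and that the wallet can be pointed at the proxy's port instead of the node's (for example through the node address setting in the wallet configuration).

```python
import socket
import threading


def pump(src, dst, label, log):
    """Copy bytes from src to dst, recording each chunk under the given label."""
    try:
        while True:
            chunk = src.recv(4096)
            if not chunk:
                break  # src closed its sending side
            log.append((label, chunk))
            dst.sendall(chunk)
    except OSError:
        pass  # one side went away; stop relaying in this direction
    finally:
        try:
            dst.shutdown(socket.SHUT_WR)  # propagate end-of-stream
        except OSError:
            pass


def handle(client, upstream_addr, log):
    """Relay one client connection to the upstream node in both directions."""
    upstream = socket.create_connection(upstream_addr)
    forward = threading.Thread(
        target=pump, args=(client, upstream, "wallet->node", log), daemon=True
    )
    forward.start()
    pump(upstream, client, "node->wallet", log)
    forward.join()
    client.close()
    upstream.close()


def serve(listen_addr, upstream_addr, log):
    """Start relaying in background threads and return the listening socket."""
    server = socket.create_server(listen_addr)

    def accept_loop():
        while True:
            try:
                client, _ = server.accept()
            except OSError:
                break  # listening socket was closed
            threading.Thread(
                target=handle, args=(client, upstream_addr, log), daemon=True
            ).start()

    threading.Thread(target=accept_loop, daemon=True).start()
    return server
```

For instance, `serve(("127.0.0.1", 13413), ("127.0.0.1", 3413), log)` with `log = []` would listen on a hypothetical port 13413 and forward to a node assumed to be on 3413; after the wallet fails, `log` holds the exact request and response bytes in order.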

bladedoyle commented 3 years ago

I also see the node become unresponsive. I think there are several different causes.

One definite cause: when the node goes into a "rewind" loop, it becomes unresponsive to the APIs (and everything else too). Example:
20210114 19:16:33.528 WARN grin_chain::txhashset::txhashset - rewind_single_block: 1 output_pos entries missing for: 000317af4605 at 860660

This issue is tracked at https://github.com/mimblewimble/grin/issues/3483 and was partly, but not entirely, fixed in 5.x.
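To check whether the rewind case applies on an affected machine, the node log can be scanned for repeated rewind_single_block warnings on the same block. The Python sketch below does that; the regular expression is derived only from the single log line quoted above, so other variants of the message may not match, and the function name count_rewinds is made up for the example.

```python
import re
from collections import Counter

# Pattern based on the one warning quoted in this thread (an assumption
# about the general format of these messages).
REWIND = re.compile(
    r"rewind_single_block: (\d+) output_pos entries missing for: "
    r"([0-9a-f]+) at (\d+)"
)


def count_rewinds(lines):
    """Count rewind warnings per (block hash prefix, height)."""
    counts = Counter()
    for line in lines:
        match = REWIND.search(line)
        if match:
            counts[(match.group(2), int(match.group(3)))] += 1
    return counts
```

Passing an open log file to `count_rewinds` (a file object iterates line by line) gives one counter entry per block; the same block showing up many times would be consistent with the loop described above.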

bladedoyle commented 3 years ago

Here is an additional possible/likely cause: https://github.com/mimblewimble/grin/issues/3550

mojitoo commented 3 years ago

Since that bug, I have been unable to use the wallet on macOS (I tried all possible solutions).

davidtavarez commented 2 years ago

Maybe not related, but I also noticed that the node very often fails to send the initial Hand message to perform the handshake.

davidtavarez commented 2 years ago

Maybe not related, but I also noticed that the node very often fails to send the initial Hand message to perform the handshake.

After more testing, what I noticed is that the API became unresponsive.

20220516 02:20:23.407 INFO grin_util::logger - log4rs is initialized, file level: Info, stdout level: Warn, min. level: Info
20220516 02:20:23.407 INFO grin - Using configuration file at /root/.grin/main/grin-server.toml
20220516 02:20:23.407 INFO grin - This is Grin version 5.1.1 (git v5.1.1), built for x86_64-unknown-linux-gnu by rustc 1.54.0 (a178d0322 2021-07-26).
20220516 02:20:23.407 INFO grin - Chain: Mainnet
20220516 02:20:23.407 INFO grin - Accept Fee Base: 500000
20220516 02:20:23.407 INFO grin - Future Time Limit: 300
20220516 02:20:23.407 INFO grin - Feature: NRD kernel enabled: false
20220516 02:20:23.407 WARN grin::cmd::server - Starting GRIN w/o UI...
20220516 02:20:23.407 INFO grin_servers::grin::server - Starting server, genesis block: 40adad0aec27
20220516 02:20:27.483 ERROR grin_util::logger -
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Chain(Error { inner: Other Error: failed to find head hash })': src/bin/cmd/server.rs:77
   0: grin_util::logger::send_panic_to_log::{{closure}}
   1: std::panicking::rust_panic_with_hook
             at rustc/a178d0322ce20e33eac124758e837cbd80a6f633/library/std/src/panicking.rs:626
   2: std::panicking::begin_panic_handler::{{closure}}
             at rustc/a178d0322ce20e33eac124758e837cbd80a6f633/library/std/src/panicking.rs:519
   3: std::sys_common::backtrace::__rust_end_short_backtrace
             at rustc/a178d0322ce20e33eac124758e837cbd80a6f633/library/std/src/sys_common/backtrace.rs:141
   4: rust_begin_unwind
             at rustc/a178d0322ce20e33eac124758e837cbd80a6f633/library/std/src/panicking.rs:515
   5: core::panicking::panic_fmt
             at rustc/a178d0322ce20e33eac124758e837cbd80a6f633/library/core/src/panicking.rs:92
   6: core::result::unwrap_failed
             at rustc/a178d0322ce20e33eac124758e837cbd80a6f633/library/core/src/result.rs:1355
   7: grin::cmd::server::start_server_tui
   8: grin::cmd::server::server_command
   9: grin::real_main
  10: grin::main
  11: std::sys_common::backtrace::__rust_begin_short_backtrace
  12: std::rt::lang_start::{{closure}}
  13: core::ops::function::impls::<impl core::ops::function::FnOnce<A> for &F>::call_once
             at rustc/a178d0322ce20e33eac124758e837cbd80a6f633/library/core/src/ops/function.rs:259
      std::panicking::try::do_call
             at rustc/a178d0322ce20e33eac124758e837cbd80a6f633/library/std/src/panicking.rs:401
      std::panicking::try
             at rustc/a178d0322ce20e33eac124758e837cbd80a6f633/library/std/src/panicking.rs:365
      std::panic::catch_unwind
             at rustc/a178d0322ce20e33eac124758e837cbd80a6f633/library/std/src/panic.rs:434
      std::rt::lang_start_internal
             at rustc/a178d0322ce20e33eac124758e837cbd80a6f633/library/std/src/rt.rs:34
  14: main
  15: __libc_start_main
  16: _start