skycoin / skywire

Skywire Node implementation
73 stars 45 forks source link

dmsghttp log server `i/o deadline reached` #1804

Open 0pcom opened 2 months ago

0pcom commented 2 months ago

when performing a similar test as the one described in https://github.com/skycoin/skywire/issues/1803 ; it became apparent that the dmsghttp log server is not an accurate indicator of the visor being connected to dmsg.

Here is the test which was performed

$ _mypk="$(skywire cli visor pk)" ; skywire cli proxy list -c US -v v1.3.19 | while read _pk ; do echo $_pk ;  curl -sLx socks5h://127.0.0.1:4445 http://${_pk}.dmsg/health ; echo ; skywire svc tps add -p -t dmsg -1 $_pk -2 ${_mypk} ; _tpid="$(skywire svc tps list -1 $_pk | jq '.[].ID' | tr -d '"' | head -n1)" ; [[ ${_tpid} != "" ]] && skywire svc tps rm -p -1 $_pk -i $_tpid ; done

All the public keys checked should be of the same version, v1.3.19 and all should respond over dmsghttp to /health checks. However, occasionally results were returned such as this

031f16af33f3674cc2830dce745f8c3213a124d1d1d4938d4558752c8939a8780d
Failed to connect to HTTP server
{
  "ID": "b93fb0ef-0283-04af-b2ad-fac20d881ef8",
  "Local": "031f16af33f3674cc2830dce745f8c3213a124d1d1d4938d4558752c8939a8780d",
  "Remote": "0323272a60895f56aad82cb767fb5c413807adcf7c9fb0578b1b1c5807c7f29d4c",
  "Type": "dmsg"
}
{
  "Result": true
}

As should be apparent, that visor is connected to dmsg because it responded correctly to the transport setup and takedown request. The dmsghttp log server is, at the same time, unresponsive.

Another check of that visor's /health with dmsgcurl

 skywire dmsg curl -l debug dmsg://031f16af33f3674cc2830dce745f8c3213a124d1d1d4938d4558752c8939a8780d:80/health
[2024-04-09T08:57:39.29633081-05:00] DEBUG disc.NewHTTP [dmsgcurl]: Created HTTP client. addr="http://dmsgd.skywire.skycoin.com"
[2024-04-09T08:57:39.296413808-05:00] DEBUG [dmsgcurl]: Connecting to dmsg network... dmsg_disc="http://dmsgd.skywire.skycoin.com" public_key="03dc0cc11fb022f96d20962e89f1871736a3690f32c90706a48e8008681d7fdc88"
[2024-04-09T08:57:39.296508226-05:00] DEBUG [dmsg_client]: Discovering dmsg servers...
[2024-04-09T08:57:39.8061407-05:00] DEBUG [dmsg_client]: Dialing session... remote_pk=0326978f5a53aff537dbb47fed58b1f123af3b00132d365f1309a14db4168dcff7
[2024-04-09T08:57:40.307592762-05:00] DEBUG [dmsg_client]: Serving session. remote_pk=0326978f5a53aff537dbb47fed58b1f123af3b00132d365f1309a14db4168dcff7
[2024-04-09T08:57:40.307637338-05:00] DEBUG [dmsgcurl]: Dmsg network ready.
[2024-04-09T08:58:00.843082584-05:00] DEBUG ClientSession.DialStream [dmsg_client]: Stream closed on failure. close_error=<nil> dst_addr=031f16af33f3674cc2830dce745f8c3213a124d1d1d4938d4558752c8939a8780d:80 error="i/o deadline reached" session=0326978f5a53aff537dbb47fed58b1f123af3b00132d365f1309a14db4168dcff7
[2024-04-09T08:58:00.843188591-05:00] ERROR [dmsgcurl]: error="failed to connect to HTTP server: Get "dmsg://031f16af33f3674cc2830dce745f8c3213a124d1d1d4938d4558752c8939a8780d:80/health": i/o deadline reached"
[2024-04-09T08:58:00.843338565-05:00] DEBUG [dmsg_client]: Stopped serving client!
[2024-04-09T08:58:00.843397403-05:00] DEBUG [dmsg_client]: Stopped accepting streams. error="session shutdown" session=0326978f5a53aff537dbb47fed58b1f123af3b00132d365f1309a14db4168dcff7
[2024-04-09T08:58:00.843546798-05:00] DEBUG [dmsg_client]: Session closed. error=<nil>
[2024-04-09T08:58:00.843582033-05:00] DEBUG [dmsg_client]: All sessions closed.
[2024-04-09T08:58:01.085427795-05:00] DEBUG [dmsg_client]: Deleting entry. entry=   version: 0.0.1
    sequence: 0
    registered at: 1712671060064820069
    static public key: 03dc0cc11fb022f96d20962e89f1871736a3690f32c90706a48e8008681d7fdc88
    signature: e906163ed29264c133d95e2d07597b4208ebab39fb36c6924fb6349bf95043c903eec2d606beb371c5568da7d4ec941dd6fff789abd5f0334df23879f4d3f5c400
    entry is registered as client. Related info: 
        delegated servers: 
            0326978f5a53aff537dbb47fed58b1f123af3b00132d365f1309a14db4168dcff7

[2024-04-09T08:58:01.327718526-05:00] DEBUG [dmsg_client]: Entry Deleted successfully.
[2024-04-09T08:58:01.327757387-05:00] DEBUG [dmsgcurl]: Disconnected from dmsg network. error=<nil>

The error from above:

[2024-04-09T08:58:00.843188591-05:00] ERROR [dmsgcurl]: error="failed to connect to HTTP server: Get "dmsg://031f16af33f3674cc2830dce745f8c3213a124d1d1d4938d4558752c8939a8780d:80/health": i/o deadline reached"

Is not an uncommon one to the transport bandwidth and survey collection process, many of such errors can be seen at https://fiber.skywire.dev/log-collection