get10101 / itchysats

CFDs on Bitcoin.
https://itchysats.network
MIT License
61 stars 17 forks source link

Establishing connection to maker hangs #759

Closed bonomat closed 2 years ago

bonomat commented 2 years ago

The maker seems to be running normally.

Starting the taker, the UI is up but it does not connect to the maker. No logs on either side show that the taker tried to connect

2021-11-30 01:40:46  INFO Lock transaction published with txid c96d8ac6ebed953d885c416b0f7295d27c115f21b340d94a8999ef27e21b1515
2021-11-30 01:40:47 DEBUG Taker actor system ready
2021-11-30 01:40:48  INFO Connected to BitMex realtime API
2021-11-30 01:40:48 DEBUG Resolved umbrel.local:9999 to [192.168.1.76:9999,[fe80::829c:39f3:c31e:35c2]:9999]
2021-11-30 01:40:48  INFO No connection to the maker, attempting to connect:
2021-11-30 01:40:48  WARN 🔧 Configured for release.    
2021-11-30 01:40:48  WARN address: 0.0.0.0    
2021-11-30 01:40:48  WARN port: 8000    
2021-11-30 01:40:48  WARN workers: 4    
2021-11-30 01:40:48  WARN ident: Rocket    
2021-11-30 01:40:48  WARN limits: bytes = 8KiB, data-form = 2MiB, file = 1MiB, form = 32KiB, json = 1MiB, msgpack = 1MiB, string = 8KiB    
2021-11-30 01:40:48  WARN temp dir: /tmp    
2021-11-30 01:40:48  WARN keep-alive: 5s    
2021-11-30 01:40:48  WARN tls: disabled    
2021-11-30 01:40:48  WARN shutdown: ctrlc = true, force = true, signals = [SIGTERM], grace = 2s, mercy = 3s    
2021-11-30 01:40:48  WARN log level: critical    
2021-11-30 01:40:48  WARN cli colors: true    
2021-11-30 01:40:48  WARN 📬 Routes:    
2021-11-30 01:40:48  WARN (index) GET /<_paths..> text/html    
2021-11-30 01:40:48  WARN (dist) GET /assets/<file..>    
2021-11-30 01:40:48  WARN (feed) GET /api/feed    
2021-11-30 01:40:48  WARN (get_health_check) GET /api/alive    
2021-11-30 01:40:48  WARN (post_withdraw_request) POST /api/withdraw    
2021-11-30 01:40:48  WARN (post_order_request) POST /api/cfd/order    
2021-11-30 01:40:48  WARN (margin_calc) POST /api/calculate/margin    
2021-11-30 01:40:48  WARN (post_cfd_action) POST /api/cfd/<id>/<action>    
2021-11-30 01:40:48  WARN 📡 Fairings:    
2021-11-30 01:40:48  WARN Shield (liftoff, response, singleton)    
2021-11-30 01:40:48  INFO 🛡️ Shield:    
2021-11-30 01:40:48  WARN 🚀 Rocket has launched from http://0.0.0.0:8000    
2021-11-30 01:40:52 DEBUG Fetching announcement for /x/BitMEX/BXBT/2021-11-30T03:00:00.price?n=20
2021-11-30 01:40:52 DEBUG Fetching announcement for /x/BitMEX/BXBT/2021-12-01T02:00:00.price?n=20
2021-11-30 01:40:52 DEBUG Fetching announcement for /x/BitMEX/BXBT/2021-11-30T20:00:00.price?n=20
2021-11-30 01:40:52 DEBUG Fetching announcement for /x/BitMEX/BXBT/2021-11-30T08:00:00.price?n=20
2021-11-30 01:40:52 DEBUG Fetching announcement for /x/BitMEX/BXBT/2021-11-30T14:00:00.price?n=20
2021-11-30 01:40:52 DEBUG Fetching announcement for /x/BitMEX/BXBT/2021-11-30T15:00:00.price?n=20
2021-11-30 01:40:52 DEBUG Fetching announcement for /x/BitMEX/BXBT/2021-11-30T21:00:00.price?n=20
2021-11-30 01:40:52 DEBUG Fetching announcement for /x/BitMEX/BXBT/2021-11-30T09:00:00.price?n=20
2021-11-30 01:40:52 DEBUG Fetching announcement for /x/BitMEX/BXBT/2021-11-30T10:00:00.price?n=20
2021-11-30 01:40:52 DEBUG Fetching announcement for /x/BitMEX/BXBT/2021-11-30T04:00:00.price?n=20
2021-11-30 01:40:52 DEBUG Fetching announcement for /x/BitMEX/BXBT/2021-11-30T22:00:00.price?n=20
2021-11-30 01:40:52 DEBUG Fetching announcement for /x/BitMEX/BXBT/2021-11-30T16:00:00.price?n=20
2021-11-30 01:40:52 DEBUG Fetching announcement for /x/BitMEX/BXBT/2021-11-30T11:00:00.price?n=20
2021-11-30 01:40:52 DEBUG Fetching announcement for /x/BitMEX/BXBT/2021-11-30T23:00:00.price?n=20
2021-11-30 01:40:52 DEBUG Fetching announcement for /x/BitMEX/BXBT/2021-11-30T05:00:00.price?n=20
2021-11-30 01:40:52 DEBUG Fetching announcement for /x/BitMEX/BXBT/2021-12-01T00:00:00.price?n=20
2021-11-30 01:40:52 DEBUG Fetching announcement for /x/BitMEX/BXBT/2021-12-01T01:00:00.price?n=20
2021-11-30 01:40:52 DEBUG Fetching announcement for /x/BitMEX/BXBT/2021-11-30T12:00:00.price?n=20
2021-11-30 01:40:52 DEBUG Fetching announcement for /x/BitMEX/BXBT/2021-11-30T13:00:00.price?n=20
2021-11-30 01:40:52 DEBUG Fetching announcement for /x/BitMEX/BXBT/2021-11-30T17:00:00.price?n=20
2021-11-30 01:40:52 DEBUG Fetching announcement for /x/BitMEX/BXBT/2021-11-30T06:00:00.price?n=20
2021-11-30 01:40:52 DEBUG Fetching announcement for /x/BitMEX/BXBT/2021-11-30T07:00:00.price?n=20
2021-11-30 01:40:52 DEBUG Fetching announcement for /x/BitMEX/BXBT/2021-11-30T18:00:00.price?n=20
2021-11-30 01:40:52 DEBUG Fetching announcement for /x/BitMEX/BXBT/2021-11-30T19:00:00.price?n=20
2021-11-30 01:41:06 DEBUG Found relevant Bitcoin transaction txid=c96d8ac6ebed953d885c416b0f7295d27c115f21b340d94a8999ef27e21b1515 status=in mempool
2021-11-30 01:41:45  INFO GET /alive:    
2021-11-30 01:42:15  WARN Error on websocket stream: WebSocket protocol error: Connection reset without closing handshake
2021-11-30 01:42:15  WARN Failed to read quote from websocket
2021-11-30 01:43:20  INFO GET /alive:    
2021-11-30 01:43:45  INFO GET /alive:    
^C2021-11-30 01:44:02  WARN Received SIGINT. Requesting shutdown.    
2021-11-30 01:44:02  INFO Received shutdown request. Waiting for pending I/O...   

maker log

2021-11-30 12:26:36  INFO Initialized logger
2021-11-30 12:26:36  INFO Running version: 0.1.2
2021-11-30 12:26:39  INFO Authentication details: username='maker' password='61e098d3791aaa2c64b90cc67dfceadcaa7100e39a57426a2aa5b23034fb5906', noise_public_key='69a42aa90da8b065b9532b62bff940a3ba07dbbb11d4482c7db83a7e049a9f1e'
2021-11-30 12:26:39  INFO Listening on 0.0.0.0:9999
2021-11-30 12:26:40 ERROR Broadcasting transaction failed. Txid: 83fed9c0677fb30324494598bd5bf6e337f6a737d217893984bb6a4a5ff1ffd3. Raw transaction: 020000000001028d96dd5c90f512f36dc0e00da71b0f31a4ffbffd627b2ed5a17c34d75ef561d90000000000ffffffff92191b86ad41d9c25dafa276331debcd9bf3782f6b2cc6ac9a009c27669fe33f0100000000ffffffff03df65000000000000220020e00980a75447cfeb32059a78c7e5457fae209a2b88bc1055f5dae52c2d6c3491d151020000000000160014f32ec9844f11a0c09e491d8073a35f462234f8e955270100000000001600147ff6eac06f3db0e0db09d6458f88fba22dc6cd9a0247304402200610bc3a508f0e79b69725ce33c8508d750ad7b7a13ed4c1fcf2add622efe5d2022054c974f56d77beae0258305938809a90dbf3fcd97152348df99217e6dfe886820121027a317b986bceb60b5a65a71e11294ab96cdf10f0ae73998d7962b2370838be820247304402205fadab625736750a0da5174efe61866e3886b8c5ae6bbdb119810c55a2035c6b02201cee69e7c9470845c3c3da8c23471174cd1f69035a77b8d3ea9e483d653d4758012102be4bfaa2abac6a187c6d5432da5720a447b315928d3b56c8a9c335db9b9f7ee400000000: Electrum(Protocol(String("sendrawtransaction RPC error: {\"code\":-25,\"message\":\"bad-txns-inputs-missingorspent\"}")))
2021-11-30 12:26:41 DEBUG Fetching attestation for /x/BitMEX/BXBT/2021-11-30T01:00:00.price?n=20
2021-11-30 12:26:41 DEBUG Fetching announcement for /x/BitMEX/BXBT/2021-11-30T03:00:00.price?n=20
2021-11-30 12:26:41 DEBUG Fetching announcement for /x/BitMEX/BXBT/2021-11-30T04:00:00.price?n=20
2021-11-30 12:26:41 DEBUG Fetching announcement for /x/BitMEX/BXBT/2021-11-30T05:00:00.price?n=20
2021-11-30 12:26:41 DEBUG Fetching announcement for /x/BitMEX/BXBT/2021-11-30T06:00:00.price?n=20
2021-11-30 12:26:41 DEBUG Fetching announcement for /x/BitMEX/BXBT/2021-11-30T16:00:00.price?n=20
2021-11-30 12:26:41 DEBUG Fetching announcement for /x/BitMEX/BXBT/2021-11-30T11:00:00.price?n=20
2021-11-30 12:26:41 DEBUG Fetching announcement for /x/BitMEX/BXBT/2021-11-30T07:00:00.price?n=20
2021-11-30 12:26:41 DEBUG Fetching announcement for /x/BitMEX/BXBT/2021-11-30T08:00:00.price?n=20
2021-11-30 12:26:41 DEBUG Fetching announcement for /x/BitMEX/BXBT/2021-11-30T09:00:00.price?n=20
2021-11-30 12:26:41 DEBUG Fetching announcement for /x/BitMEX/BXBT/2021-11-30T21:00:00.price?n=20
2021-11-30 12:26:41 DEBUG Fetching announcement for /x/BitMEX/BXBT/2021-11-30T17:00:00.price?n=20
2021-11-30 12:26:41 DEBUG Fetching announcement for /x/BitMEX/BXBT/2021-11-30T12:00:00.price?n=20
2021-11-30 12:26:41 DEBUG Fetching announcement for /x/BitMEX/BXBT/2021-11-30T10:00:00.price?n=20
2021-11-30 12:26:41 DEBUG Fetching announcement for /x/BitMEX/BXBT/2021-11-30T18:00:00.price?n=20
2021-11-30 12:26:41 DEBUG Fetching announcement for /x/BitMEX/BXBT/2021-11-30T19:00:00.price?n=20
2021-11-30 12:26:41 DEBUG Fetching announcement for /x/BitMEX/BXBT/2021-11-30T20:00:00.price?n=20
2021-11-30 12:26:41 DEBUG Fetching announcement for /x/BitMEX/BXBT/2021-11-30T22:00:00.price?n=20
2021-11-30 12:26:41 DEBUG Fetching announcement for /x/BitMEX/BXBT/2021-11-30T14:00:00.price?n=20
2021-11-30 12:26:41 DEBUG Maker actor system ready
2021-11-30 12:26:41 DEBUG Fetching announcement for /x/BitMEX/BXBT/2021-11-30T23:00:00.price?n=20
2021-11-30 12:26:41 DEBUG Fetching announcement for /x/BitMEX/BXBT/2021-11-30T15:00:00.price?n=20
2021-11-30 12:26:41 DEBUG Fetching announcement for /x/BitMEX/BXBT/2021-12-01T01:00:00.price?n=20
2021-11-30 12:26:41 DEBUG Fetching announcement for /x/BitMEX/BXBT/2021-11-30T13:00:00.price?n=20
2021-11-30 12:26:41 DEBUG Fetching announcement for /x/BitMEX/BXBT/2021-12-01T00:00:00.price?n=20
2021-11-30 12:26:41 DEBUG Fetching announcement for /x/BitMEX/BXBT/2021-12-01T02:00:00.price?n=20
2021-11-30 12:26:42  INFO Fetched new attestation for /x/BitMEX/BXBT/2021-11-30T01:00:00.price?n=20
2021-11-30 12:26:42 DEBUG Learnt latest oracle attestation for event: /x/BitMEX/BXBT/2021-11-30T01:00:00.price?n=20
2021-11-30 12:26:43  INFO Connected to BitMex realtime API
2021-11-30 12:26:43  WARN 🔧 Configured for release.
2021-11-30 12:26:43  WARN address: 0.0.0.0
2021-11-30 12:26:43  WARN port: 8000
2021-11-30 12:26:43  WARN workers: 4
2021-11-30 12:26:43  WARN ident: Rocket
2021-11-30 12:26:43  WARN limits: bytes = 8KiB, data-form = 2MiB, file = 1MiB, form = 32KiB, json = 1MiB, msgpack = 1MiB, string = 8KiB
2021-11-30 12:26:43  WARN temp dir: /tmp
2021-11-30 12:26:43  WARN keep-alive: 5s
2021-11-30 12:26:43  WARN tls: disabled
2021-11-30 12:26:43  WARN shutdown: ctrlc = true, force = true, signals = [SIGTERM], grace = 2s, mercy = 3s
2021-11-30 12:26:43  WARN log level: critical
2021-11-30 12:26:43  WARN cli colors: true
2021-11-30 12:26:43  WARN 📬 Routes:
2021-11-30 12:26:43  WARN (index) GET /<_paths..> text/html
2021-11-30 12:26:43  WARN (dist) GET /assets/<file..>
2021-11-30 12:26:43  WARN (maker_feed) GET /api/feed
2021-11-30 12:26:43  WARN (get_health_check) GET /api/alive
2021-11-30 12:26:43  WARN (post_withdraw_request) POST /api/withdraw
2021-11-30 12:26:43  WARN (post_sell_order) POST /api/order/sell
2021-11-30 12:26:43  WARN (post_cfd_action) POST /api/cfd/<id>/<action>
2021-11-30 12:26:43  WARN 🥅 Catchers:
2021-11-30 12:26:43  WARN (unauthorized) 401
2021-11-30 12:26:43  WARN (unauthorized) /api 401
2021-11-30 12:26:43  WARN 📡 Fairings:
2021-11-30 12:26:43  WARN Shield (liftoff, response, singleton)
2021-11-30 12:26:43  INFO 🛡️ Shield:
2021-11-30 12:26:43  WARN 🚀 Rocket has launched from http://0.0.0.0:8000
2021-11-30 12:26:43 ERROR Error when trying to publish CET: Cannot publish CET in state Pending Open
2021-11-30 12:26:43 ERROR Cannot transition to OpenCommitted because of unexpected state Open Committed
2021-11-30 12:26:43  INFO GET /api/feed text/event-stream:
2021-11-30 12:26:44  INFO POST /api/order/sell application/json:
2021-11-30 12:26:46 DEBUG Noise protocol responder handshake is complete
2021-11-30 12:26:46  INFO New taker connected taker_id=7b451067e235fdf35b758fad8842281c039e7914c5e880e8d608f2f1e09fbb45 address=192.168.1.63:54851
2021-11-30 12:26:52 DEBUG Noise protocol responder handshake is complete
2021-11-30 12:26:52  INFO New taker connected taker_id=2d9ef5b81d303c9bba1715815a7cc5fa5c91af0542b691d02ab31a46bfa40d7a address=116.240.228.91:54971
2021-11-30 12:27:01 DEBUG Found relevant Bitcoin transaction txid=d43db0eb8badcb72b1a3c9e354bb9242e167b03672dd74265ab6f615fa2bf769 status=confirmed with 2 blocks
2021-11-30 12:27:15  INFO POST /api/order/sell application/json:
2021-11-30 12:27:36 ERROR Closing connection to taker 7b451067e235fdf35b758fad8842281c039e7914c5e880e8d608f2f1e09fbb45
2021-11-30 12:27:46 DEBUG Noise protocol responder handshake is complete
2021-11-30 12:27:46  INFO New taker connected taker_id=7b451067e235fdf35b758fad8842281c039e7914c5e880e8d608f2f1e09fbb45 address=192.168.1.63:54858
2021-11-30 12:28:03 DEBUG Noise protocol responder handshake is complete
2021-11-30 12:28:03  INFO New taker connected taker_id=4d6457ba4f39267b08b02d5f03f17dcaed7ddefedd8a39c3636b1e46e1018e06 address=141.168.172.35:61215
2021-11-30 12:28:15  INFO POST /api/order/sell application/json:
2021-11-30 12:28:25 ERROR Closing connection to taker 7b451067e235fdf35b758fad8842281c039e7914c5e880e8d608f2f1e09fbb45
2021-11-30 12:29:01 DEBUG Got notification for new block block_height=2105766
2021-11-30 12:29:01 DEBUG Bitcoin transaction status changed txid=d43db0eb8badcb72b1a3c9e354bb9242e167b03672dd74265ab6f615fa2bf769 new_status=confirmed with 3 blocks old_status=confirmed with 2 blocks
2021-11-30 12:29:07 DEBUG Noise protocol responder handshake is complete
2021-11-30 12:29:07  INFO New taker connected taker_id=ba4a7001bb1debc4e355f971e7e87409281865691ae64cbd9c8cfac1a07da442 address=111.220.66.103:42612
2021-11-30 12:29:15  INFO POST /api/order/sell application/json:
2021-11-30 12:29:15  INFO Failed to send CurrentOrder to taker, removing connection taker_id=7b451067e235fdf35b758fad8842281c039e7914c5e880e8d608f2f1e09fbb45
2021-11-30 12:30:15  WARN Error on websocket stream: WebSocket protocol error: Connection reset without closing handshake
2021-11-30 12:30:15  WARN Failed to read quote from websocket
2021-11-30 12:33:42  INFO GET / text/html:
2021-11-30 12:33:42  INFO GET /assets/index.08fbfe68.css text/css:
2021-11-30 12:33:42  INFO GET /assets/index.478e7f1a.js:
2021-11-30 12:33:42  INFO GET /assets/vendor.68f88f6e.js:
2021-11-30 12:33:42  INFO GET /assets/vendor.68f88f6e.js:
2021-11-30 12:33:42  INFO GET /api/feed text/event-stream:
2021-11-30 12:33:42  INFO GET /favicon.svg image/avif:
2021-11-30 12:33:42 ERROR No matching routes for GET /favicon.svg image/avif.
2021-11-30 12:35:03 ERROR Failed to write message Heartbeat to socket: Broken pipe (os error 32)

2021-11-30 12:36:39  INFO GET / text/html:
2021-11-30 12:36:39  INFO GET /assets/index.478e7f1a.js:
2021-11-30 12:36:39  INFO GET /assets/index.08fbfe68.css text/css:
2021-11-30 12:36:39  INFO GET /assets/vendor.68f88f6e.js:
2021-11-30 12:36:39  INFO GET /assets/vendor.68f88f6e.js:
2021-11-30 12:36:39  INFO GET /api/feed text/event-stream:
2021-11-30 12:36:39  INFO GET /favicon.svg image/avif:
2021-11-30 12:36:39 ERROR No matching routes for GET /favicon.svg image/avif.
2021-11-30 12:38:03  INFO POST /api/order/sell application/json:
2021-11-30 12:38:06  INFO GET / text/html:
2021-11-30 12:38:06  INFO GET /assets/index.478e7f1a.js:
2021-11-30 12:38:06  INFO GET /assets/index.08fbfe68.css text/css:
2021-11-30 12:38:06  INFO GET /assets/vendor.68f88f6e.js:
2021-11-30 12:38:07  INFO GET /assets/vendor.68f88f6e.js:
2021-11-30 12:38:07  INFO GET /api/feed text/event-stream:
2021-11-30 12:38:08  INFO GET /favicon.svg image/avif:
2021-11-30 12:38:08 ERROR No matching routes for GET /favicon.svg image/avif.
2021-11-30 12:38:10  INFO POST /api/order/sell application/json:
2021-11-30 12:40:22  INFO GET / text/html:
2021-11-30 12:40:22  INFO GET /assets/index.478e7f1a.js:
2021-11-30 12:40:22  INFO GET /assets/index.08fbfe68.css text/css:
2021-11-30 12:40:22  INFO GET /assets/vendor.68f88f6e.js:
2021-11-30 12:40:22  INFO GET /assets/vendor.68f88f6e.js:
2021-11-30 12:40:22  INFO GET /api/feed text/event-stream:
2021-11-30 12:40:23  INFO GET /favicon.svg image/avif:
2021-11-30 12:40:23 ERROR No matching routes for GET /favicon.svg image/avif.
2021-11-30 12:40:26  INFO POST /api/order/sell application/json:
2021-11-30 12:40:32  INFO GET / text/html:
2021-11-30 12:40:32  INFO GET /assets/index.478e7f1a.js:
2021-11-30 12:40:32  INFO GET /assets/index.08fbfe68.css text/css:
2021-11-30 12:40:32  INFO GET /assets/vendor.68f88f6e.js:
2021-11-30 12:40:33  INFO GET /assets/vendor.68f88f6e.js:
2021-11-30 12:40:33  INFO GET /api/feed text/event-stream:
2021-11-30 12:40:33  INFO GET /favicon.svg image/avif:
2021-11-30 12:40:33 ERROR No matching routes for GET /favicon.svg image/avif.
2021-11-30 12:43:35  INFO GET / text/html:
2021-11-30 12:43:35  INFO GET /assets/index.478e7f1a.js:
2021-11-30 12:43:35  INFO GET /assets/index.08fbfe68.css text/css:
2021-11-30 12:43:35  INFO GET /assets/vendor.68f88f6e.js:
2021-11-30 12:43:36  INFO GET /assets/vendor.68f88f6e.js:
2021-11-30 12:43:36  INFO GET /api/feed text/event-stream:
2021-11-30 12:43:36  INFO GET /favicon.svg image/avif:
2021-11-30 12:43:36 ERROR No matching routes for GET /favicon.svg image/avif.
thomaseizinger commented 2 years ago

I can see that the taker has been running for only about a minute in your logs. Can you verify that it actually never connected or is it simply just in the process of connecting and has not yet run into a timeout?

thomaseizinger commented 2 years ago

I've updated the title to more closely reflect problem. There is no error reported meaning we don't actually know for sure whether or not it "cannot establish" the connection.

bonomat commented 2 years ago

I can see that the taker has been running for only about a minute in your logs. Can you verify that it actually never connected or is it simply just in the process of connecting and has not yet run into a timeout?

I've updated the log with the last few lines I had before terminating it. In this case it was running just shy of 4 minutes. Maybe @da-kami has some more logs from his taker?

bonomat commented 2 years ago

I've updated the title to more closely reflect problem. There is no error reported meaning we don't actually know for sure whether or not it "cannot establish" the connection.

sounds reasonable.

da-kami commented 2 years ago

Tried to reproduce this, but was unable to at the moment. Stopping here for now.

bonomat commented 2 years ago

wups, didn't want to close this .

It just happened again. The maker seem to have died, maybe a deadlock situation. This is the last log which might be relevant:

2021-11-30 14:57:15  INFO Failed to send CurrentOrder to taker, removing connection taker_id=7b451067e235fdf35b758fad8842281c039e7914c5e880e8d608f2f1e09fbb45
2021-11-30 14:57:16  INFO POST /api/order/sell application/json:
2021-11-30 15:00:04 DEBUG Fetching announcement for /x/BitMEX/BXBT/2021-12-01T05:00:00.price?n=20
2021-11-30 15:00:13 DEBUG Got notification for new block block_height=2105780
2021-11-30 15:20:13 DEBUG Got notification for new block block_height=2105781
2021-11-30 15:29:13 DEBUG Got notification for new block block_height=2105782
2021-11-30 15:49:13 DEBUG Got notification for new block block_height=2105783

afterwards nothing.

da-kami commented 2 years ago

I am having trouble establishing a connection to the maker again. I never see a Connection Established message. This is weird because connecting is wrapped in a timeout now.

@thomaseizinger I think we have a deadlock somewhere, where in some cases messages are just not being passed on. I could not track down where yet.

2021-12-02 22:16:30 DEBUG daemon: Taker actor system ready
2021-12-02 22:16:30 DEBUG daemon::bitmex_price_feed: Connecting to BitMex realtime API
2021-12-02 22:16:31  INFO daemon::bitmex_price_feed: Connected to BitMex realtime API
2021-12-02 22:16:31 DEBUG taker: Resolved 159.196.154.67:9999 to [159.196.154.67:9999]
2021-12-02 22:16:31 DEBUG daemon::connection: No connection to the maker
2021-12-02 22:16:31 DEBUG daemon::connection: Connecting to maker address=159.196.154.67:9999
2021-12-02 22:16:31  WARN rocket::launch: 🔧 Configured for release.
2021-12-02 22:16:31  WARN rocket::launch_: address: 0.0.0.0
2021-12-02 22:16:31  WARN rocket::launch_: port: 8000
2021-12-02 22:16:31  WARN rocket::launch_: workers: 4
2021-12-02 22:16:31  WARN rocket::launch_: ident: Rocket
2021-12-02 22:16:31  WARN rocket::launch_: limits: bytes = 8KiB, data-form = 2MiB, file = 1MiB, form = 32KiB, json = 1MiB, msgpack = 1MiB, string = 8KiB
2021-12-02 22:16:31  WARN rocket::launch_: temp dir: /tmp
2021-12-02 22:16:31  WARN rocket::launch_: keep-alive: 5s
2021-12-02 22:16:31  WARN rocket::launch_: tls: disabled
2021-12-02 22:16:31  WARN rocket::launch_: shutdown: ctrlc = true, force = true, signals = [SIGTERM], grace = 2s, mercy = 3s
2021-12-02 22:16:31  WARN rocket::launch_: log level: critical
2021-12-02 22:16:31  WARN rocket::launch_: cli colors: true
2021-12-02 22:16:31  WARN rocket::launch: 📬 Routes:
2021-12-02 22:16:31  WARN rocket::launch_: (index) GET /<_paths..> text/html
2021-12-02 22:16:31  WARN rocket::launch_: (dist) GET /assets/<file..>
2021-12-02 22:16:31  WARN rocket::launch_: (feed) GET /api/feed
2021-12-02 22:16:31  WARN rocket::launch_: (get_health_check) GET /api/alive
2021-12-02 22:16:31  WARN rocket::launch_: (post_withdraw_request) POST /api/withdraw
2021-12-02 22:16:31  WARN rocket::launch_: (post_order_request) POST /api/cfd/order
2021-12-02 22:16:31  WARN rocket::launch_: (margin_calc) POST /api/calculate/margin
2021-12-02 22:16:31  WARN rocket::launch_: (post_cfd_action) POST /api/cfd/<id>/<action>
2021-12-02 22:16:31  WARN rocket::launch: 📡 Fairings:
2021-12-02 22:16:31  WARN rocket::launch_: Shield (liftoff, response, singleton)
2021-12-02 22:16:31  INFO rocket::shield::shield: 🛡️ Shield:
2021-12-02 22:16:31  WARN rocket::launch: 🚀 Rocket has launched from http://0.0.0.0:8000
2021-12-02 22:16:35  INFO rocket::server: GET /alive:
2021-12-02 22:16:35 DEBUG daemon::oracle: Fetching announcement for /x/BitMEX/BXBT/2021-12-03T03:00:00.price?n=20
2021-12-02 22:16:35 DEBUG daemon::oracle: Fetching announcement for /x/BitMEX/BXBT/2021-12-03T23:00:00.price?n=20
2021-12-02 22:16:35 DEBUG daemon::oracle: Fetching announcement for /x/BitMEX/BXBT/2021-12-03T13:00:00.price?n=20
2021-12-02 22:16:35 DEBUG daemon::oracle: Fetching announcement for /x/BitMEX/BXBT/2021-12-03T14:00:00.price?n=20
2021-12-02 22:16:35 DEBUG daemon::oracle: Fetching announcement for /x/BitMEX/BXBT/2021-12-03T08:00:00.price?n=20
2021-12-02 22:16:35 DEBUG daemon::oracle: Fetching announcement for /x/BitMEX/BXBT/2021-12-03T00:00:00.price?n=20
2021-12-02 22:16:35 DEBUG daemon::oracle: Fetching announcement for /x/BitMEX/BXBT/2021-12-03T09:00:00.price?n=20
2021-12-02 22:16:35 DEBUG daemon::oracle: Fetching announcement for /x/BitMEX/BXBT/2021-12-03T10:00:00.price?n=20
2021-12-02 22:16:35 DEBUG daemon::oracle: Fetching announcement for /x/BitMEX/BXBT/2021-12-03T15:00:00.price?n=20
2021-12-02 22:16:35 DEBUG daemon::oracle: Fetching announcement for /x/BitMEX/BXBT/2021-12-03T11:00:00.price?n=20
2021-12-02 22:16:35 DEBUG daemon::oracle: Fetching announcement for /x/BitMEX/BXBT/2021-12-03T01:00:00.price?n=20
2021-12-02 22:16:35 DEBUG daemon::oracle: Fetching announcement for /x/BitMEX/BXBT/2021-12-03T16:00:00.price?n=20
2021-12-02 22:16:35 DEBUG daemon::oracle: Fetching announcement for /x/BitMEX/BXBT/2021-12-03T02:00:00.price?n=20
2021-12-02 22:16:35 DEBUG daemon::oracle: Fetching announcement for /x/BitMEX/BXBT/2021-12-03T12:00:00.price?n=20
2021-12-02 22:16:35 DEBUG daemon::oracle: Fetching announcement for /x/BitMEX/BXBT/2021-12-03T19:00:00.price?n=20
2021-12-02 22:16:35 DEBUG daemon::oracle: Fetching announcement for /x/BitMEX/BXBT/2021-12-03T20:00:00.price?n=20
2021-12-02 22:16:35 DEBUG daemon::oracle: Fetching announcement for /x/BitMEX/BXBT/2021-12-03T17:00:00.price?n=20
2021-12-02 22:16:35 DEBUG daemon::oracle: Fetching announcement for /x/BitMEX/BXBT/2021-12-03T18:00:00.price?n=20
2021-12-02 22:16:35 DEBUG daemon::oracle: Fetching announcement for /x/BitMEX/BXBT/2021-12-03T21:00:00.price?n=20
2021-12-02 22:16:35 DEBUG daemon::oracle: Fetching announcement for /x/BitMEX/BXBT/2021-12-03T04:00:00.price?n=20
2021-12-02 22:16:35 DEBUG daemon::oracle: Fetching announcement for /x/BitMEX/BXBT/2021-12-03T06:00:00.price?n=20
2021-12-02 22:16:35 DEBUG daemon::oracle: Fetching announcement for /x/BitMEX/BXBT/2021-12-03T07:00:00.price?n=20
2021-12-02 22:16:35 DEBUG daemon::oracle: Fetching announcement for /x/BitMEX/BXBT/2021-12-03T22:00:00.price?n=20
2021-12-02 22:16:35 DEBUG daemon::oracle: Fetching announcement for /x/BitMEX/BXBT/2021-12-03T05:00:00.price?n=20
2021-12-02 22:16:40  INFO rocket::server: GET /alive:
2021-12-02 22:16:45  INFO rocket::server: GET /alive:
2021-12-02 22:16:48  INFO rocket::server: GET / text/html:
2021-12-02 22:16:48  INFO rocket::server: GET /assets/index.151ea18f.js:
2021-12-02 22:16:48  INFO rocket::server: GET /assets/index.08fbfe68.css text/css:
2021-12-02 22:16:48  INFO rocket::server: GET /assets/vendor.fa58af1e.js:
2021-12-02 22:16:48  INFO rocket::server: GET /assets/vendor.fa58af1e.js:
2021-12-02 22:16:48  INFO rocket::server: GET /api/feed text/event-stream:
2021-12-02 22:16:48  INFO rocket::server: GET /assets/logo_nav_bar_black.c442b25d.svg image/avif:
2021-12-02 22:16:48  INFO rocket::server: GET /assets/logo.a37cce62.svg image/avif:
2021-12-02 22:16:55  INFO rocket::server: GET /alive:
...
2021-12-02 22:18:00  INFO rocket::server: GET /alive:
2021-12-02 22:18:15  WARN daemon::bitmex_price_feed: Connection to BitMex realtime API failed: WebSocket protocol error: Connection reset without closing handshake: Connection reset without closing handshake
2021-12-02 22:18:15 DEBUG daemon::bitmex_price_feed: Connecting to BitMex realtime API
2021-12-02 22:18:16  INFO daemon::bitmex_price_feed: Connected to BitMex realtime API
2021-12-02 22:18:35  INFO rocket::server: GET /alive:
...
2021-12-02 22:20:10  INFO rocket::server: GET /alive:
2021-12-02 22:20:15  WARN daemon::bitmex_price_feed: Connection to BitMex realtime API failed: WebSocket protocol error: Connection reset without closing handshake: Connection reset without closing handshake
2021-12-02 22:20:15 DEBUG daemon::bitmex_price_feed: Connecting to BitMex realtime API
2021-12-02 22:20:16  INFO daemon::bitmex_price_feed: Connected to BitMex realtime API
2021-12-02 22:20:20  INFO rocket::server: GET /alive:
...
2021-12-02 22:20:55  INFO rocket::server: GET /alive:

Update:

I attached myself to the maker logs. The maker never wittnessed a taker trying to connect. I restarted the maker. Then the taker was able to connect. So far unable to understand / reproduce what would cause the maker side to not accept connections anymore.

After going through the maker logs I was unable to find a specific error that would help me understand what happened. There are some errors in the logs that could be relevant, e.g. :

Dec 02 12:34:55 umbrel d6fd68146e9a[772]: 2021-12-02 12:34:55 ERROR daemon::send_to_socket: Failed to write message Heartbeat to socket: Broken pipe (os error 32)

but when going through the logs prior to this error there does not seem to be a specific trigger:

Dec 02 12:08:54 umbrel d6fd68146e9a[772]: 2021-12-02 12:08:54 DEBUG daemon::noise: Noise protocol responder handshake is complete
Dec 02 12:08:54 umbrel d6fd68146e9a[772]: 2021-12-02 12:08:54  INFO daemon::maker_inc_connections: New taker connected taker_id=8eed738193d5c3354d023521ac40c35d85333d5a32eb6ad04cd0979d8dcbf642 address=141.168.172.35:53109
Dec 02 12:08:54 umbrel d6fd68146e9a[772]: 2021-12-02 12:08:54  WARN daemon::maker_cfd: Taker already connected: Identity(PublicKey(MontgomeryPoint([142, 237, 115, 129, 147, 213, 195, 53, 77, 2, 53, 33, 172, 64, 195, 93, 133, 51, 61, 90, 50, 235, 106, 208, 76, 208, 151, 157, 141, 203, 246, 66])))
Dec 02 12:09:15 umbrel d6fd68146e9a[772]: 2021-12-02 12:09:15  WARN daemon::bitmex_price_feed: Connection to BitMex realtime API failed: WebSocket protocol error: Connection reset without closing handshake: Connection reset without closing handshake
Dec 02 12:09:15 umbrel d6fd68146e9a[772]: 2021-12-02 12:09:15 DEBUG daemon::bitmex_price_feed: Connecting to BitMex realtime API
Dec 02 12:09:16 umbrel d6fd68146e9a[772]: 2021-12-02 12:09:16  INFO daemon::bitmex_price_feed: Connected to BitMex realtime API
Dec 02 12:10:15 umbrel d6fd68146e9a[772]: 2021-12-02 12:10:15  INFO rocket::server: POST /api/order/sell application/json:
Dec 02 12:10:29 umbrel d6fd68146e9a[772]: 2021-12-02 12:10:29 ERROR daemon::maker_inc_connections: Failed to read incoming messages from taker taker_id=ba4a7001bb1debc4e355f971e7e87409281865691ae64cbd9c8cfac1a07da442
Dec 02 12:10:29 umbrel d6fd68146e9a[772]: 2021-12-02 12:10:29  INFO daemon::maker_inc_connections: Dropping connection taker_id=ba4a7001bb1debc4e355f971e7e87409281865691ae64cbd9c8cfac1a07da442
Dec 02 12:11:15 umbrel d6fd68146e9a[772]: 2021-12-02 12:11:15  INFO rocket::server: POST /api/order/sell application/json:
Dec 02 12:12:15 umbrel d6fd68146e9a[772]: 2021-12-02 12:12:15  WARN daemon::bitmex_price_feed: Connection to BitMex realtime API failed: WebSocket protocol error: Connection reset without closing handshake: Connection reset without closing handshake
Dec 02 12:12:15 umbrel d6fd68146e9a[772]: 2021-12-02 12:12:15 DEBUG daemon::bitmex_price_feed: Connecting to BitMex realtime API
Dec 02 12:12:16 umbrel d6fd68146e9a[772]: 2021-12-02 12:12:16  INFO daemon::bitmex_price_feed: Connected to BitMex realtime API
Dec 02 12:13:15 umbrel d6fd68146e9a[772]: 2021-12-02 12:13:15  INFO rocket::server: POST /api/order/sell application/json:
Dec 02 12:14:15 umbrel d6fd68146e9a[772]: 2021-12-02 12:14:15  INFO rocket::server: POST /api/order/sell application/json:
Dec 02 12:15:15 umbrel d6fd68146e9a[772]: 2021-12-02 12:15:15  WARN daemon::bitmex_price_feed: Connection to BitMex realtime API failed: WebSocket protocol error: Connection reset without closing handshake: Connection reset without closing handshake
Dec 02 12:15:15 umbrel d6fd68146e9a[772]: 2021-12-02 12:15:15 DEBUG daemon::bitmex_price_feed: Connecting to BitMex realtime API
Dec 02 12:15:16 umbrel d6fd68146e9a[772]: 2021-12-02 12:15:16  INFO daemon::bitmex_price_feed: Connected to BitMex realtime API
Dec 02 12:15:58 umbrel d6fd68146e9a[772]: 2021-12-02 12:15:58 DEBUG daemon::monitor: Got notification for new block block_height=2105987
Dec 02 12:16:15 umbrel d6fd68146e9a[772]: 2021-12-02 12:16:15  INFO rocket::server: POST /api/order/sell application/json:
Dec 02 12:17:15 umbrel d6fd68146e9a[772]: 2021-12-02 12:17:15  INFO rocket::server: POST /api/order/sell application/json:
Dec 02 12:18:15 umbrel d6fd68146e9a[772]: 2021-12-02 12:18:15  WARN daemon::bitmex_price_feed: Connection to BitMex realtime API failed: WebSocket protocol error: Connection reset without closing handshake: Connection reset without closing handshake
Dec 02 12:18:15 umbrel d6fd68146e9a[772]: 2021-12-02 12:18:15 DEBUG daemon::bitmex_price_feed: Connecting to BitMex realtime API
Dec 02 12:18:16 umbrel d6fd68146e9a[772]: 2021-12-02 12:18:16  INFO daemon::bitmex_price_feed: Connected to BitMex realtime API
Dec 02 12:19:15 umbrel d6fd68146e9a[772]: 2021-12-02 12:19:15  INFO rocket::server: POST /api/order/sell application/json:
Dec 02 12:20:15 umbrel d6fd68146e9a[772]: 2021-12-02 12:20:15  WARN daemon::bitmex_price_feed: Connection to BitMex realtime API failed: WebSocket protocol error: Connection reset without closing handshake: Connection reset without closing handshake
Dec 02 12:20:15 umbrel d6fd68146e9a[772]: 2021-12-02 12:20:15 DEBUG daemon::bitmex_price_feed: Connecting to BitMex realtime API
Dec 02 12:20:16 umbrel d6fd68146e9a[772]: 2021-12-02 12:20:16  INFO daemon::bitmex_price_feed: Connected to BitMex realtime API
Dec 02 12:21:15 umbrel d6fd68146e9a[772]: 2021-12-02 12:21:15  INFO rocket::server: POST /api/order/sell application/json:
Dec 02 12:22:15 umbrel d6fd68146e9a[772]: 2021-12-02 12:22:15  INFO rocket::server: POST /api/order/sell application/json:
Dec 02 12:23:15 umbrel d6fd68146e9a[772]: 2021-12-02 12:23:15  INFO rocket::server: POST /api/order/sell application/json:
Dec 02 12:24:15 umbrel d6fd68146e9a[772]: 2021-12-02 12:24:15  WARN daemon::bitmex_price_feed: Connection to BitMex realtime API failed: WebSocket protocol error: Connection reset without closing handshake: Connection reset without closing handshake
Dec 02 12:24:15 umbrel d6fd68146e9a[772]: 2021-12-02 12:24:15 DEBUG daemon::bitmex_price_feed: Connecting to BitMex realtime API
Dec 02 12:24:16 umbrel d6fd68146e9a[772]: 2021-12-02 12:24:16  INFO daemon::bitmex_price_feed: Connected to BitMex realtime API
Dec 02 12:25:15 umbrel d6fd68146e9a[772]: 2021-12-02 12:25:15  INFO rocket::server: POST /api/order/sell application/json:
Dec 02 12:26:15 umbrel d6fd68146e9a[772]: 2021-12-02 12:26:15  WARN daemon::bitmex_price_feed: Connection to BitMex realtime API failed: WebSocket protocol error: Connection reset without closing handshake: Connection reset without closing handshake
Dec 02 12:26:15 umbrel d6fd68146e9a[772]: 2021-12-02 12:26:15 DEBUG daemon::bitmex_price_feed: Connecting to BitMex realtime API
Dec 02 12:26:16 umbrel d6fd68146e9a[772]: 2021-12-02 12:26:16  INFO daemon::bitmex_price_feed: Connected to BitMex realtime API
Dec 02 12:27:15 umbrel d6fd68146e9a[772]: 2021-12-02 12:27:15  INFO rocket::server: POST /api/order/sell application/json:
Dec 02 12:28:15 umbrel d6fd68146e9a[772]: 2021-12-02 12:28:15  WARN daemon::bitmex_price_feed: Connection to BitMex realtime API failed: WebSocket protocol error: Connection reset without closing handshake: Connection reset without closing handshake
Dec 02 12:28:15 umbrel d6fd68146e9a[772]: 2021-12-02 12:28:15 DEBUG daemon::bitmex_price_feed: Connecting to BitMex realtime API
Dec 02 12:28:16 umbrel d6fd68146e9a[772]: 2021-12-02 12:28:16  INFO daemon::bitmex_price_feed: Connected to BitMex realtime API
Dec 02 12:28:18 umbrel d6fd68146e9a[772]: 2021-12-02 12:28:18 DEBUG daemon::monitor: Got notification for new block block_height=2105988
Dec 02 12:28:59 umbrel d6fd68146e9a[772]: 2021-12-02 12:28:59 ERROR daemon::send_to_socket: Failed to write message Heartbeat to socket: Broken pipe (os error 32)
Dec 02 12:29:15 umbrel d6fd68146e9a[772]: 2021-12-02 12:29:15  INFO rocket::server: POST /api/order/sell application/json:
Dec 02 12:30:15 umbrel d6fd68146e9a[772]: 2021-12-02 12:30:15  WARN daemon::bitmex_price_feed: Connection to BitMex realtime API failed: WebSocket protocol error: Connection reset without closing handshake: Connection reset without closing handshake
Dec 02 12:30:15 umbrel d6fd68146e9a[772]: 2021-12-02 12:30:15 DEBUG daemon::bitmex_price_feed: Connecting to BitMex realtime API
Dec 02 12:30:16 umbrel d6fd68146e9a[772]: 2021-12-02 12:30:16  INFO daemon::bitmex_price_feed: Connected to BitMex realtime API
Dec 02 12:31:15 umbrel d6fd68146e9a[772]: 2021-12-02 12:31:15  INFO rocket::server: POST /api/order/sell application/json:
Dec 02 12:32:15 umbrel d6fd68146e9a[772]: 2021-12-02 12:32:15  INFO rocket::server: POST /api/order/sell application/json:
Dec 02 12:33:15 umbrel d6fd68146e9a[772]: 2021-12-02 12:33:15  WARN daemon::bitmex_price_feed: Connection to BitMex realtime API failed: WebSocket protocol error: Connection reset without closing handshake: Connection reset without closing handshake
Dec 02 12:33:15 umbrel d6fd68146e9a[772]: 2021-12-02 12:33:15 DEBUG daemon::bitmex_price_feed: Connecting to BitMex realtime API
Dec 02 12:33:16 umbrel d6fd68146e9a[772]: 2021-12-02 12:33:16  INFO daemon::bitmex_price_feed: Connected to BitMex realtime API
Dec 02 12:34:15 umbrel d6fd68146e9a[772]: 2021-12-02 12:34:15  INFO rocket::server: POST /api/order/sell application/json:
Dec 02 12:34:55 umbrel d6fd68146e9a[772]: 2021-12-02 12:34:55 ERROR daemon::send_to_socket: Failed to write message Heartbeat to socket: Broken pipe (os error 32)
Dec 02 12:35:15 umbrel d6fd68146e9a[772]: 2021-12-02 12:35:15  WARN daemon::bitmex_price_feed: Connection to BitMex realtime API failed: WebSocket protocol error: Connection reset without closing handshake: Connection reset without closing handshake
da-kami commented 2 years ago

Ideas for steps forward:

thomaseizinger commented 2 years ago
  • Create and setup a simple application that connects to the maker and records the maker's heartbeats. If the maker cannot be reached then ping us (or ping some task automated to restart the maker).

You could run that tool locally in a cronjob and if it fails to connect, restart the maker as a result. Should be fairly easy to wire together using systemd services :)

Edit: I forgot that we are releasing via docker now. Shouldn't be too hard to also restart the docker container.

da-kami commented 2 years ago
  • Create and setup a simple application that connects to the maker and records the maker's heartbeats. If the maker cannot be reached then ping us (or ping some task automated to restart the maker).

You could run that tool locally in a cronjob and if it fails to connect, restart the maker as a result. Should be fairly easy to wire together using systemd services :)

Edit: I forgot that we are releasing via docker now. Shouldn't be too hard to also restart the docker container.

This is under the assumption that the local connection the maker is broken as well, but I think that is a fair assumption for now :)

da-kami commented 2 years ago

We can't really do anything here at the moment, because the issue is not reproduceable. Removing it from the Umbrel milestone but keeping the ticket open for now.

bonomat commented 2 years ago

@thomaseizinger : this turns out to be a critical bug. We ran into this again where the maker seemly hangs.

It's hard to say if this is the problem but the maker log says that a taker tried to connect twice and keeps connecting

Dec 14 23:48:00 mainnet-maker 40c318dfaba8[358]: 2021-12-15 10:48:00 DEBUG daemon::monitor: Got notification for new block block_height=714184
Dec 14 23:48:15 mainnet-maker 40c318dfaba8[358]: 2021-12-15 10:48:15  INFO rocket::server: POST /api/order/sell application/json:
Dec 14 23:48:45 mainnet-maker 40c318dfaba8[358]: 2021-12-15 10:48:45 DEBUG daemon::noise: Noise protocol responder handshake is complete
Dec 14 23:48:45 mainnet-maker 40c318dfaba8[358]: 2021-12-15 10:48:45  INFO daemon::maker_inc_connections: New taker connected taker_id=c9fcad805164d53e1cd7e3aaab616b30cbe98eb4c279441530bfe28ed9be5375 address=116.240.228.91:33602
Dec 14 23:48:45 mainnet-maker 40c318dfaba8[358]: 2021-12-15 10:48:45  WARN daemon::maker_cfd: Taker already connected: Identity(PublicKey(MontgomeryPoint([201, 252, 173, 128, 81, 100, 213, 62, 28, 215, 227, 170, 171, 97, 107, 48, 203, 233, >
Dec 14 23:49:15 mainnet-maker 40c318dfaba8[358]: 2021-12-15 10:49:15  INFO rocket::server: POST /api/order/sell application/json:
Dec 14 23:50:05 mainnet-maker 40c318dfaba8[358]: 2021-12-15 10:50:05 DEBUG daemon::noise: Noise protocol responder handshake is complete
Dec 14 23:50:05 mainnet-maker 40c318dfaba8[358]: 2021-12-15 10:50:05  INFO daemon::maker_inc_connections: New taker connected taker_id=c9fcad805164d53e1cd7e3aaab616b30cbe98eb4c279441530bfe28ed9be5375 address=x.x.x.x:36862
Dec 14 23:50:05 mainnet-maker 40c318dfaba8[358]: 2021-12-15 10:50:05  WARN daemon::maker_cfd: Taker already connected: Identity(PublicKey(MontgomeryPoint([201, 252, 173, 128, 81, 100, 213, 62, 28, 215, 227, 170, 171, 97, 107, 48, 203, 233, >
Dec 14 23:50:15 mainnet-maker 40c318dfaba8[358]: 2021-12-15 10:50:15  INFO rocket::server: POST /api/order/sell application/json:
D
da-kami commented 2 years ago

@thomaseizinger : this turns out to be a critical bug. We ran into this again where the maker seemly hangs.

It's hard to say if this is the problem but the maker log says that a taker tried to connect twice and keeps connecting

Dec 14 23:48:00 mainnet-maker 40c318dfaba8[358]: 2021-12-15 10:48:00 DEBUG daemon::monitor: Got notification for new block block_height=714184
Dec 14 23:48:15 mainnet-maker 40c318dfaba8[358]: 2021-12-15 10:48:15  INFO rocket::server: POST /api/order/sell application/json:
Dec 14 23:48:45 mainnet-maker 40c318dfaba8[358]: 2021-12-15 10:48:45 DEBUG daemon::noise: Noise protocol responder handshake is complete
Dec 14 23:48:45 mainnet-maker 40c318dfaba8[358]: 2021-12-15 10:48:45  INFO daemon::maker_inc_connections: New taker connected taker_id=c9fcad805164d53e1cd7e3aaab616b30cbe98eb4c279441530bfe28ed9be5375 address=116.240.228.91:33602
Dec 14 23:48:45 mainnet-maker 40c318dfaba8[358]: 2021-12-15 10:48:45  WARN daemon::maker_cfd: Taker already connected: Identity(PublicKey(MontgomeryPoint([201, 252, 173, 128, 81, 100, 213, 62, 28, 215, 227, 170, 171, 97, 107, 48, 203, 233, >
Dec 14 23:49:15 mainnet-maker 40c318dfaba8[358]: 2021-12-15 10:49:15  INFO rocket::server: POST /api/order/sell application/json:
Dec 14 23:50:05 mainnet-maker 40c318dfaba8[358]: 2021-12-15 10:50:05 DEBUG daemon::noise: Noise protocol responder handshake is complete
Dec 14 23:50:05 mainnet-maker 40c318dfaba8[358]: 2021-12-15 10:50:05  INFO daemon::maker_inc_connections: New taker connected taker_id=c9fcad805164d53e1cd7e3aaab616b30cbe98eb4c279441530bfe28ed9be5375 address=x.x.x.x:36862
Dec 14 23:50:05 mainnet-maker 40c318dfaba8[358]: 2021-12-15 10:50:05  WARN daemon::maker_cfd: Taker already connected: Identity(PublicKey(MontgomeryPoint([201, 252, 173, 128, 81, 100, 213, 62, 28, 215, 227, 170, 171, 97, 107, 48, 203, 233, >
Dec 14 23:50:15 mainnet-maker 40c318dfaba8[358]: 2021-12-15 10:50:15  INFO rocket::server: POST /api/order/sell application/json:
D

To add a bit more context from the taker side: Both maker and taker containers had been running without a restart for several days. My taker was running within Umbrel. I brought up the UI and everything seemed fine (in retrospect we noticed that the price to the maker seemed outdated), we did not see any errors. The UI did not report that we are not connected to a maker.

We restarted taker and maker containers. I had one faile

I grepped the logs for errors. I don't see any relevant errors besides one that I had upon restarting (but I think that one is related to a maker restart):

2021-12-15 22:16:11 DEBUG daemon::oracle: Fetching announcement for /x/BitMEX/BXBT/2021-12-22T10:00:00.price?n=20
2021-12-15 22:16:15  WARN daemon::connection: Failed to establish connection: Connection reset by peer (os error 104) address=35.244.124.30:9999
2021-12-15 22:16:15  WARN daemon::connection: Tried connecting to 1 addresses without success, retrying in 5 seconds
2021-12-15 22:16:20 DEBUG daemon::connection: Connecting to maker address=35.244.124.30:9999

There are no error messages concerning a maker disconnect (from 09.12. till 16.12.):

2021-12-09 05:41:51  WARN daemon::bitmex_price_feed: Connection to BitMex realtime API failed: WebSocket protocol error: Connection reset without closing handshake: Connection reset without closing handshake
2021-12-09 05:42:55  WARN daemon::bitmex_price_feed: Connection to BitMex realtime API failed: WebSocket protocol error: Connection reset without closing handshake: Connection reset without closing handshake
2021-12-09 09:11:20  WARN daemon::bitmex_price_feed: Connection to BitMex realtime API failed: WebSocket protocol error: Connection reset without closing handshake: Connection reset without closing handshake
2021-12-09 09:41:52  WARN daemon::bitmex_price_feed: Connection to BitMex realtime API failed: WebSocket protocol error: Connection reset without closing handshake: Connection reset without closing handshake
2021-12-09 14:39:53  WARN daemon::bitmex_price_feed: Connection to BitMex realtime API failed: WebSocket protocol error: Connection reset without closing handshake: Connection reset without closing handshake
2021-12-09 15:12:09  WARN daemon::bitmex_price_feed: Connection to BitMex realtime API failed: WebSocket protocol error: Connection reset without closing handshake: Connection reset without closing handshake
2021-12-09 22:03:57  WARN daemon::bitmex_price_feed: Connection to BitMex realtime API failed: WebSocket protocol error: Connection reset without closing handshake: Connection reset without closing handshake
2021-12-09 23:30:52  WARN daemon::bitmex_price_feed: Connection to BitMex realtime API failed: WebSocket protocol error: Connection reset without closing handshake: Connection reset without closing handshake
2021-12-10 00:03:16  WARN daemon::bitmex_price_feed: Connection to BitMex realtime API failed: WebSocket protocol error: Connection reset without closing handshake: Connection reset without closing handshake
2021-12-10 13:14:25  WARN daemon::bitmex_price_feed: Connection to BitMex realtime API failed: WebSocket protocol error: Connection reset without closing handshake: Connection reset without closing handshake
2021-12-11 14:26:14 ERROR daemon::monitor: Message handler failed: Failed to subscribe to header notifications: Made one or multiple attempts, all errored:
2021-12-13 18:45:53  WARN daemon::bitmex_price_feed: Connection to BitMex realtime API failed: WebSocket protocol error: Connection reset without closing handshake: Connection reset without closing handshake
2021-12-13 19:56:04  WARN daemon::bitmex_price_feed: Connection to BitMex realtime API failed: WebSocket protocol error: Connection reset without closing handshake: Connection reset without closing handshake
2021-12-13 20:06:59  WARN daemon::bitmex_price_feed: Connection to BitMex realtime API failed: WebSocket protocol error: Connection reset without closing handshake: Connection reset without closing handshake
2021-12-14 01:48:31  WARN daemon::bitmex_price_feed: Connection to BitMex realtime API failed: WebSocket protocol error: Connection reset without closing handshake: Connection reset without closing handshake
2021-12-14 01:49:14  WARN daemon::bitmex_price_feed: Connection to BitMex realtime API failed: WebSocket protocol error: Connection reset without closing handshake: Connection reset without closing handshake
2021-12-14 01:49:46  WARN daemon::bitmex_price_feed: Connection to BitMex realtime API failed: WebSocket protocol error: Connection reset without closing handshake: Connection reset without closing handshake
2021-12-14 06:45:20 ERROR daemon::monitor: Message handler failed: Failed to subscribe to header notifications: Made one or multiple attempts, all errored:
2021-12-14 13:38:12  WARN daemon::bitmex_price_feed: Connection to BitMex realtime API failed: WebSocket protocol error: Connection reset without closing handshake: Connection reset without closing handshake
2021-12-14 18:47:21  WARN daemon::bitmex_price_feed: Connection to BitMex realtime API failed: WebSocket protocol error: Connection reset without closing handshake: Connection reset without closing handshake
2021-12-14 19:01:42  WARN daemon::bitmex_price_feed: Connection to BitMex realtime API failed: WebSocket protocol error: Connection reset without closing handshake: Connection reset without closing handshake
2021-12-15 22:16:15  WARN daemon::connection: Failed to establish connection: Connection reset by peer (os error 104) address=35.244.124.30:9999
2021-12-15 23:08:11  WARN daemon::bitmex_price_feed: Connection to BitMex realtime API failed: WebSocket protocol error: Connection reset without closing handshake: Connection reset without closing handshake
2021-12-16 00:48:57  WARN daemon::bitmex_price_feed: Connection to BitMex realtime API failed: WebSocket protocol error: Connection reset without closing handshake: Connection reset without closing handshake
2021-12-16 01:10:49  WARN daemon::bitmex_price_feed: Connection to BitMex realtime API failed: WebSocket protocol error: Connection reset without closing handshake: Connection reset without closing handshake

I still think that the problem is not related to the connection actually being broken, but the maker running into a deadlock somehow, so incoming messages are not processed correctly.

thomaseizinger commented 2 years ago

I still think that the problem is not related to the connection actually being broken, but the maker running into a deadlock somehow, so incoming messages are not processed correctly.

I verify this, we need more logs. None of the above logs show heartbeats for example. It would be good to see whether or not those are still sent or not.

da-kami commented 2 years ago

I still think that the problem is not related to the connection actually being broken, but the maker running into a deadlock somehow, so incoming messages are not processed correctly.

I verify this, we need more logs. None of the above logs show heartbeats for example. It would be good to see whether or not those are still sent or not.

I agree, that is why I run the taker on trace now within Umbrel. Hopefully that will help track it down. I suspect that the taker still receives heartbeats (otherwise I should see a disconnected warning in the UI...), but the maker does not propery process incoming messages. Hopefully I can prove it soon ;)

bonomat commented 2 years ago

For debugging purposes:

RUST_LOG="info,maker=info,rocket=info,daemon::send_to_socket=trace,daemon=info" will make the maker print all messages sent to a socket. We will keep it running for a while to see if there is anything obvious.

bonomat commented 2 years ago

I'm not sure it is related, but today during the demo the connection was dropped:

Maker log

2021-12-17 09:24:09 DEBUG daemon::maker_cfd: Maker accepts order order_id=77dd2f11-d70d-4d52-8f66-189a256baa36
2021-12-17 09:24:09  INFO daemon::setup_maker: Maker accepts an order order_id=77dd2f11-d70d-4d52-8f66-189a256baa36
2021-12-17 09:24:09 TRACE daemon::send_to_socket: Sending 'ConfirmOrder'
2021-12-17 09:24:10  INFO daemon::setup_contract: Exchanged setup parameters
2021-12-17 09:24:10  INFO rocket::server: POST /api/order/sell application/json:
2021-12-17 09:24:11  INFO daemon::setup_contract: Created CFD transactions
2021-12-17 09:24:11 TRACE daemon::send_to_socket: Sending 'Protocol'
2021-12-17 09:24:11 TRACE daemon::send_to_socket: Sending 'CurrentOrder'
2021-12-17 09:24:11 TRACE daemon::send_to_socket: Sending 'CurrentOrder'
2021-12-17 09:24:11 TRACE daemon::send_to_socket: Sending 'CurrentOrder'
2021-12-17 09:24:11 TRACE daemon::send_to_socket: Sending 'Protocol'
2021-12-17 09:24:12 TRACE daemon::send_to_socket: Sending 'Heartbeat'
2021-12-17 09:24:13 TRACE daemon::send_to_socket: Sending 'Heartbeat'
2021-12-17 09:24:13 TRACE daemon::send_to_socket: Sending 'Heartbeat'
2021-12-17 09:24:15  INFO rocket::server: POST /api/order/sell application/json:
2021-12-17 09:24:15 TRACE daemon::send_to_socket: Sending 'CurrentOrder'
2021-12-17 09:24:15 TRACE daemon::send_to_socket: Sending 'CurrentOrder'
2021-12-17 09:24:15 TRACE daemon::send_to_socket: Sending 'CurrentOrder'
2021-12-17 09:24:17 TRACE daemon::send_to_socket: Sending 'Heartbeat'
2021-12-17 09:24:18 TRACE daemon::send_to_socket: Sending 'Heartbeat'
2021-12-17 09:24:18 TRACE daemon::send_to_socket: Sending 'Heartbeat'
2021-12-17 09:24:22 TRACE daemon::send_to_socket: Sending 'Heartbeat'
2021-12-17 09:24:23 TRACE daemon::send_to_socket: Sending 'Heartbeat'
2021-12-17 09:24:23 TRACE daemon::send_to_socket: Sending 'Heartbeat'
2021-12-17 09:24:24 ERROR daemon::maker_inc_connections: Failed to read incoming messages from taker taker_id=9d4eb09b464ebda4c1e41e73b54f12bb8cb3ba8e9e1004af50a7dafcaeb6643a
2021-12-17 09:24:24  INFO daemon::maker_inc_connections: Dropping connection taker_id=9d4eb09b464ebda4c1e41e73b54f12bb8cb3ba8e9e1004af50a7dafcaeb6643a
2021-12-17 09:24:27 TRACE daemon::send_to_socket: Sending 'Heartbeat'

Taker log

2021-12-16 22:24:08  INFO daemon::taker_cfd: Taking current order: Order { id: OrderId(Hyphenated(77dd2f11-d70d-4d52-8f66-189a256baa36)), trading_pair: BtcUsd, position: Short, price: Price(48726.93), min_quantity: Usd(100), max_quantity: Usd(1000), leverage: Leverage(2), liquidation_price: Price(32484.62), creation_timestamp: Timestamp(1639693395), settlement_interval: Duration { seconds: 604800, nanoseconds: 0 }, origin: Theirs, oracle_event_id: BitMexPriceEventId { timestamp: OffsetDateTime { utc_datetime: PrimitiveDateTime { date: Date { year: 2021, ordinal: 357 }, time: Time { hour: 23, minute: 0, second: 0, nanosecond: 0 } }, offset: UtcOffset { hours: 0, minutes: 0, seconds: 0 } }, digits: 20 }, fee_rate: 11 }
2021-12-16 22:24:09  INFO daemon::setup_taker: Order got accepted order_id=77dd2f11-d70d-4d52-8f66-189a256baa36
2021-12-16 22:24:11  INFO daemon::setup_contract: Exchanged setup parameters
2021-12-16 22:24:12  INFO rocket::server: POST /api/calculate/margin application/json:
2021-12-16 22:24:16  INFO rocket::server: POST /api/calculate/margin application/json:
2021-12-16 22:24:17  INFO rocket::server: POST /api/calculate/margin application/json:
2021-12-16 22:24:17  INFO daemon::setup_contract: Created CFD transactions
2021-12-16 22:24:17  INFO daemon::setup_contract: Exchanged CFD transactions
2021-12-16 22:24:23  INFO daemon::setup_contract: Verified all signatures
2021-12-16 22:24:24 DEBUG daemon::connection: No connection to the maker
thread 'rocket-worker-thread' panicked at 'wire messages only to arrive in connected state', daemon/src/connection.rs:332:22
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
2021-12-16 22:24:25 ERROR daemon::taker_cfd: Contract setup failed: Error {
    context: "Failed to send Msg2",
    source: Disconnected,

afterwards, no connection anymore.

bonomat commented 2 years ago

so it happened again, unfortunately I cannot find when it started. The maker keeps sending heartbeats but internally something is broken. For example, I cannot post a new order: The endpoint gets called but no new order is being created (the uuid of the order I get on the feed does not change).

Maker log is pretty much useless, all it shows is the heartbeat:

2021-12-18 19:54:16 TRACE daemon::send_to_socket: Sending 'Heartbeat'
2021-12-18 19:54:20 TRACE daemon::send_to_socket: Sending 'Heartbeat'
2021-12-18 19:54:21 TRACE daemon::send_to_socket: Sending 'Heartbeat'
2021-12-18 19:54:25 TRACE daemon::send_to_socket: Sending 'Heartbeat'
2021-12-18 19:54:26 TRACE daemon::send_to_socket: Sending 'Heartbeat'
2021-12-18 19:54:30 TRACE daemon::send_to_socket: Sending 'Heartbeat'
2021-12-18 19:54:31 TRACE daemon::send_to_socket: Sending 'Heartbeat'

After a restart everthing works again.

bonomat commented 2 years ago

Here some more logs after a restart. It looks like it "just stopped working" after 20:12 . As you can see, Sending 'CurrentOrder' is not logged anymore.

2021-12-18 20:12:15  INFO rocket::server: POST /api/order/sell application/json:
2021-12-18 20:12:15 TRACE daemon::send_to_socket: Sending 'CurrentOrder'
2021-12-18 20:12:15 TRACE daemon::send_to_socket: Sending 'CurrentOrder'
2021-12-18 20:12:15 TRACE daemon::send_to_socket: Sending 'CurrentOrder'
2021-12-18 20:12:17 TRACE daemon::send_to_socket: Sending 'Heartbeat'
2021-12-18 20:12:19 TRACE daemon::send_to_socket: Sending 'Heartbeat'
2021-12-18 20:12:20 TRACE daemon::send_to_socket: Sending 'Heartbeat'
2021-12-18 20:12:22 TRACE daemon::send_to_socket: Sending 'Heartbeat'
2021-12-18 20:12:24 TRACE daemon::send_to_socket: Sending 'Heartbeat'
2021-12-18 20:12:25 TRACE daemon::send_to_socket: Sending 'Heartbeat'
2021-12-18 20:12:27 TRACE daemon::send_to_socket: Sending 'Heartbeat'
2021-12-18 20:12:29 TRACE daemon::send_to_socket: Sending 'Heartbeat'
2021-12-18 20:12:30 TRACE daemon::send_to_socket: Sending 'Heartbeat'
2021-12-18 20:12:32 TRACE daemon::send_to_socket: Sending 'Heartbeat'
2021-12-18 20:12:34 TRACE daemon::send_to_socket: Sending 'Heartbeat'
2021-12-18 20:12:35 TRACE daemon::send_to_socket: Sending 'Heartbeat'
2021-12-18 20:12:37 TRACE daemon::send_to_socket: Sending 'Heartbeat'
2021-12-18 20:12:39 TRACE daemon::send_to_socket: Sending 'Heartbeat'
2021-12-18 20:12:40 TRACE daemon::send_to_socket: Sending 'Heartbeat'
2021-12-18 20:12:42 TRACE daemon::send_to_socket: Sending 'Heartbeat'
2021-12-18 20:12:44 TRACE daemon::send_to_socket: Sending 'Heartbeat'
2021-12-18 20:12:45 TRACE daemon::send_to_socket: Sending 'Heartbeat'
2021-12-18 20:12:47 TRACE daemon::send_to_socket: Sending 'Heartbeat'
2021-12-18 20:12:49 TRACE daemon::send_to_socket: Sending 'Heartbeat'
2021-12-18 20:12:50 TRACE daemon::send_to_socket: Sending 'Heartbeat'
2021-12-18 20:12:52 TRACE daemon::send_to_socket: Sending 'Heartbeat'
2021-12-18 20:12:54 TRACE daemon::send_to_socket: Sending 'Heartbeat'
2021-12-18 20:12:55 DEBUG daemon::monitor: Got notification for new block block_height=714642
2021-12-18 20:12:55 TRACE daemon::send_to_socket: Sending 'Heartbeat'
2021-12-18 20:12:57 TRACE daemon::send_to_socket: Sending 'Heartbeat'
2021-12-18 20:12:59 TRACE daemon::send_to_socket: Sending 'Heartbeat'
2021-12-18 20:13:00 TRACE daemon::send_to_socket: Sending 'Heartbeat'
2021-12-18 20:13:02 TRACE daemon::send_to_socket: Sending 'Heartbeat'
2021-12-18 20:13:04 TRACE daemon::send_to_socket: Sending 'Heartbeat'
2021-12-18 20:13:05 TRACE daemon::send_to_socket: Sending 'Heartbeat'
2021-12-18 20:13:07 TRACE daemon::send_to_socket: Sending 'Heartbeat'
2021-12-18 20:13:09 TRACE daemon::send_to_socket: Sending 'Heartbeat'
2021-12-18 20:13:10 TRACE daemon::send_to_socket: Sending 'Heartbeat'
2021-12-18 20:13:12 TRACE daemon::send_to_socket: Sending 'Heartbeat'
2021-12-18 20:13:14 TRACE daemon::send_to_socket: Sending 'Heartbeat'
2021-12-18 20:13:15  INFO rocket::server: POST /api/order/sell application/json:
2021-12-18 20:13:15 TRACE daemon::send_to_socket: Sending 'Heartbeat'
2021-12-18 20:13:17 TRACE daemon::send_to_socket: Sending 'Heartbeat'
2021-12-18 20:13:19 TRACE daemon::send_to_socket: Sending 'Heartbeat'
2021-12-18 20:13:20 TRACE daemon::send_to_socket: Sending 'Heartbeat'
2021-12-18 20:13:22 TRACE daemon::send_to_socket: Sending 'Heartbeat'
2021-12-18 20:13:24 TRACE daemon::send_to_socket: Sending 'Heartbeat'
2021-12-18 20:13:25 TRACE daemon::send_to_socket: Sending 'Heartbeat'
2021-12-18 20:13:27 TRACE daemon::send_to_socket: Sending 'Heartbeat'
2021-12-18 20:13:29 TRACE daemon::send_to_socket: Sending 'Heartbeat'
2021-12-18 20:13:30 TRACE daemon::send_to_socket: Sending 'Heartbeat'
2021-12-18 20:13:32 TRACE daemon::send_to_socket: Sending 'Heartbeat'
2021-12-18 20:13:34 TRACE daemon::send_to_socket: Sending 'Heartbeat'
2021-12-18 20:13:35 TRACE daemon::send_to_socket: Sending 'Heartbeat'
2021-12-18 20:13:37 TRACE daemon::send_to_socket: Sending 'Heartbeat'
2021-12-18 20:13:39 TRACE daemon::send_to_socket: Sending 'Heartbeat'
2021-12-18 20:13:40 TRACE daemon::send_to_socket: Sending 'Heartbeat'
2021-12-18 20:13:42 TRACE daemon::send_to_socket: Sending 'Heartbeat'
2021-12-18 20:13:44 TRACE daemon::send_to_socket: Sending 'Heartbeat'
2021-12-18 20:13:45 TRACE daemon::send_to_socket: Sending 'Heartbeat'
2021-12-18 20:13:47 TRACE daemon::send_to_socket: Sending 'Heartbeat'
2021-12-18 20:13:49 TRACE daemon::send_to_socket: Sending 'Heartbeat'
2021-12-18 20:13:50 TRACE daemon::send_to_socket: Sending 'Heartbeat'
2021-12-18 20:13:52 TRACE daemon::send_to_socket: Sending 'Heartbeat'
2021-12-18 20:13:54 TRACE daemon::send_to_socket: Sending 'Heartbeat'
2021-12-18 20:13:55 TRACE daemon::send_to_socket: Sending 'Heartbeat'
2021-12-18 20:13:57 TRACE daemon::send_to_socket: Sending 'Heartbeat'
2021-12-18 20:13:59 TRACE daemon::send_to_socket: Sending 'Heartbeat'
2021-12-18 20:14:00 TRACE daemon::send_to_socket: Sending 'Heartbeat'
2021-12-18 20:14:02 TRACE daemon::send_to_socket: Sending 'Heartbeat'
2021-12-18 20:14:04 TRACE daemon::send_to_socket: Sending 'Heartbeat'
2021-12-18 20:14:05 TRACE daemon::send_to_socket: Sending 'Heartbeat'
2021-12-18 20:14:07 TRACE daemon::send_to_socket: Sending 'Heartbeat'
2021-12-18 20:14:09 TRACE daemon::send_to_socket: Sending 'Heartbeat'
2021-12-18 20:14:10 TRACE daemon::send_to_socket: Sending 'Heartbeat'
2021-12-18 20:14:12 TRACE daemon::send_to_socket: Sending 'Heartbeat'
2021-12-18 20:14:14 TRACE daemon::send_to_socket: Sending 'Heartbeat'
2021-12-18 20:14:15 DEBUG daemon::monitor: Got notification for new block block_height=714643
2021-12-18 20:14:15 TRACE daemon::send_to_socket: Sending 'Heartbeat'
2021-12-18 20:14:15  INFO rocket::server: POST /api/order/sell application/json:
2021-12-18 20:14:17 TRACE daemon::send_to_socket: Sending 'Heartbeat'
2021-12-18 20:14:19 TRACE daemon::send_to_socket: Sending 'Heartbeat'
2021-12-18 20:14:20 TRACE daemon::send_to_socket: Sending 'Heartbeat'

Note: uptime was less than 2h until it happened. Find the relevant logs until this happened here: 20211218195500-20211218210000(1).log

thomaseizinger commented 2 years ago

Can you compile the binary according to these instructions and attach tokio-console when it happens?

That should give us an idea if there is any task that is being blocked.

thomaseizinger commented 2 years ago

In the meantime, I suggest we just restart every hour or so.

bonomat commented 2 years ago

Can you compile the binary according to these instructions and attach tokio-console when it happens?

That should give us an idea if there is any task that is being blocked.

That's hard because this is happening on our server with the released version and I'm not sure if anyone was able to reproduce it locally yet.

thomaseizinger commented 2 years ago

That's hard because this is happening on our server with the released version and I'm not sure if anyone was able to reproduce it locally yet.

Can we not temporarily deploy a fork?

bonomat commented 2 years ago

That's hard because this is happening on our server with the released version and I'm not sure if anyone was able to reproduce it locally yet.

Can we not temporarily deploy a fork?

I'll see if I can reproduce it locally first.

bonomat commented 2 years ago

That's hard because this is happening on our server with the released version and I'm not sure if anyone was able to reproduce it locally yet.

Can we not temporarily deploy a fork?

I'll see if I can reproduce it locally first.

So I have the tokio-console running locally and gave all threads a name. I'll let it running locally to see if the problem happens locally as well but sofar I cannot say that I can learn a whole lot from this console 😬

image

thomaseizinger commented 2 years ago

So I have the tokio-console running locally and gave all threads a name. I'll let it running locally to see if the problem happens locally as well but sofar I cannot say that I can learn a whole lot from this console grimacing

Amazing! Let's see what it produces. I have never used it myself either but hopefully we can learn something from it!

bonomat commented 2 years ago

I had it running locally over night and the problem did not happen :/ My guess, maybe it has something todo with release mode, so I'll rebuild and restart in release mode to see if that helps. What's more worrying is that the quote did not update since Dec 19, 2021 8:49:00 PM, roughly 2h after start.: #925

bonomat commented 2 years ago

Latest change (for the 0.3.1 release) did not help unfortunately. I'll add the automated restart for now.

thomaseizinger commented 2 years ago

Latest change (for the 0.3.1 release) did not help unfortunately. I'll add the automated restart for now.

I'll backport #927 later.

bonomat commented 2 years ago

Uptime ~50 minutes and no more orders being send out :(

2021-12-22 16:23:15  INFO rocket::server: POST /api/order/sell application/json:
2021-12-22 16:23:15 TRACE daemon::maker_inc_connections: Sent new order: Some(OrderId(Hyphenated(fb9cba4f-9f67-4170-b5df-cbd4401704fc))) taker_id=c9fcad805164d53e1cd7e3aaab616b30cbe98eb4c279441530bfe28ed9be5375
2021-12-22 16:23:15 TRACE daemon::maker_inc_connections: Sent new order: Some(OrderId(Hyphenated(fb9cba4f-9f67-4170-b5df-cbd4401704fc))) taker_id=766ff734dad13ae9286969981eb69772f140d0f14bcc37c3fc3a5214e25b1b08
2021-12-22 16:23:15 TRACE daemon::maker_inc_connections: Sent new order: Some(OrderId(Hyphenated(fb9cba4f-9f67-4170-b5df-cbd4401704fc))) taker_id=9d4eb09b464ebda4c1e41e73b54f12bb8cb3ba8e9e1004af50a7dafcaeb6643a
2021-12-22 16:23:15 TRACE daemon::maker_inc_connections: Sent new order: Some(OrderId(Hyphenated(fb9cba4f-9f67-4170-b5df-cbd4401704fc))) taker_id=d4cb45f05d06bfdf277349490200b15dfa69dcff47aab1201836958a5db6163f
2021-12-22 16:24:15  INFO rocket::server: POST /api/order/sell application/json:
2021-12-22 16:24:17 DEBUG daemon::monitor: Got notification for new block block_height=715196
2021-12-22 16:25:15  INFO rocket::server: POST /api/order/sell application/json:
2021-12-22 16:26:15  INFO rocket::server: POST /api/order/sell application/json:
2021-12-22 16:26:57 DEBUG daemon::monitor: Got notification for new block block_height=715197
da-kami commented 2 years ago

This is probably unrelated, but in addition to the maker not reacting the taker does (sometimes) not reconnect (stops reconnecting timely):

2021-12-29 04:52:27  WARN daemon::connection: Failed to establish connection: Connection reset by peer (os error 104) address=35.244.124.30:9999
2021-12-29 04:52:27  WARN daemon::connection: Tried connecting to 1 addresses without success, retrying in 5 seconds
2021-12-29 04:52:32 DEBUG daemon::connection: Connecting to maker address=35.244.124.30:9999
2021-12-29 04:52:32 DEBUG daemon::noise: Noise protocol initiator handshake is complete
2021-12-29 04:52:32  INFO daemon::connection: Established connection to maker address=35.244.124.30:9999
2021-12-29 04:52:42 DEBUG daemon::connection: No connection to the maker
2021-12-29 04:52:42 DEBUG daemon::connection: Connecting to maker address=35.244.124.30:9999
2021-12-29 04:54:12  INFO rocket::server: GET / text/html:
2021-12-29 04:54:12  INFO rocket::server: GET /assets/index.1c44b86e.js:
2021-12-29 04:54:12  INFO rocket::server: GET /assets/index.08fbfe68.css text/css:
2021-12-29 04:54:12  INFO rocket::server: GET /assets/vendor.64f20ccd.js:
2021-12-29 04:54:12  INFO rocket::server: GET /assets/vendor.64f20ccd.js:
2021-12-29 04:54:12  INFO rocket::server: GET /api/feed text/event-stream:
2021-12-29 04:54:12  INFO rocket::server: GET /assets/logo_nav_bar_black.c442b25d.svg image/avif:
2021-12-29 04:54:13  INFO rocket::server: GET /assets/logo.a37cce62.svg image/avif:
2021-12-29 04:55:42 DEBUG daemon::monitor: Got notification for new block block_height=716220
2021-12-29 04:56:20  INFO rocket::server: GET /alive:
2021-12-29 04:56:40  INFO rocket::server: GET /alive:
2021-12-29 04:57:42 DEBUG daemon::monitor: Got notification for new block block_height=716221
2021-12-29 04:57:50  INFO rocket::server: GET /alive:
2021-12-29 04:58:39  INFO rocket::server: GET /alive:
2021-12-29 04:58:45  INFO rocket::server: GET /alive:
2021-12-29 04:58:53  INFO rocket::server: GET /alive:
2021-12-29 04:59:00  INFO rocket::server: GET /alive:
2021-12-29 05:00:02 DEBUG daemon::oracle: Fetching announcement for /x/BitMEX/BXBT/2022-01-05T06:00:00.price?n=20
2021-12-29 05:00:13  WARN daemon::connection: Failed to establish connection: Connection reset by peer (os error 104) address=35.244.124.30:9999
2021-12-29 05:00:13  WARN daemon::connection: Tried connecting to 1 addresses without success, retrying in 5 seconds
2021-12-29 05:00:18 DEBUG daemon::connection: Connecting to maker address=35.244.124.30:9999
2021-12-29 05:00:21 DEBUG daemon::noise: Noise protocol initiator handshake is complete
2021-12-29 05:00:21  INFO daemon::connection: Established connection to maker address=35.244.124.30:9999

Note: Eventually the taker did re-connect because the maker restarted and closed the connection it seems.

bonomat commented 2 years ago

Update: I have a good feeling about the changes introduced in https://github.com/itchysats/itchysats/commit/d0d1f0f835a9179b4029a964e7456ab42048106c. I've disabled the automated restart and it's been running for several hours now without issues: orders are being sent out to all connected takers.

Please monitor your takers and let me know if you see something odd. Otherwise, I think we can close this ticket 🥳

cc: @da-kami / @klochowicz / @thomaseizinger

thomaseizinger commented 2 years ago

Good news, lets optimistically close this then!