chrisleekr / binance-trading-bot

Automated Binance trading bot - Trade multiple cryptocurrencies. Buy low/sell high with Grid Trading. Integrated with TradingView technical analysis
MIT License
5.04k stars 1.09k forks source link

Websocket reconnect #619

Closed dasbts closed 1 year ago

dasbts commented 1 year ago

After looking into https://github.com/chrisleekr/binance-trading-bot/issues/618 I further investigated and found that when internet connection is lost, it seems like the sockets are not reconnecting.

Version

Running Version: v0.0.97 (unspecified) Unspecified is because I am building the images myself locally.

Description

I found in the code that you have looked into reconnecting-websocket NPM library, and tried it but without success. I'm still not sure, but investigating when I find time.

I found so far that ./app/server-binance.js is where the websockets are made, In this case ./app/binance/tickers.js is the issue, but I suppose it has to be changed at all places imported into the server-binance.js.

I found that there have been some updates to the Binance-API-Node NPM package 2018 to enable reconnecting-websocket, (https://github.com/ViewBlock/binance-api-node/pull/49) but I am not sure if it works. I did try to update this library to the latest version (knowing the risk of possible code migration requirements), and then it seemingly reconnected, but started printing in green in my terminal with what looked more like tradingview data. In the GUI I noticed there was only tradingview buy/neutral/sell signal counter updating after reconnecting to internet again.

To Reproduce

  1. docker-compose up
  2. Watch logs
  3. Disconnect internet
  4. Watch logs
  5. Reconnect internet
  6. Watch logs

Expected Behaviours

Reconnecting on step 5 and updating the GUI with new data.

Additional context

I kept getting pretty much only these logs after I get internet access again (on binance-api-node v0.11.43), which might confirm why the GUI works, but tickers don't after losing connectivity:

{"name":"binance-api","version":"0.0.97","hostname":"d349291fcdd1","pid":45,"gitHash":"unspecified","server":"cronjob","level":30,"jobName":"trailingTradeIndicator","msg":"Task is running, skip this tick","time":"2023-04-03T09:26:54.001Z","v":0} binance-bot | {"name":"binance-api","version":"0.0.97","hostname":"d349291fcdd1","pid":45,"gitHash":"unspecified","server":"websocket","level":30,"message":{"type":"Buffer","data":[123,34,99,111,109,109,97,110,100,34,58,34,108,97,116,101,115,116,34,44,34,97,117,116,104,84,111,107,101,110,34,58,34,101,121,74,104,98,71,99,105,79,105,74,73,85,122,73,49,78,105,73,115,73,110,82,53,99,67,73,54,73,107,112,88,86,67,74,57,46,101,121,74,104,100,88,82,111,90,87,53,48,97,87,78,104,100,71,86,107,81,88,81,105,79,105,73,121,77,68,73,122,76,84,65,48,76,84,65,122,86,68,65,52,79,106,81,51,79,106,77,119,76,106,99,51,79,70,111,105,76,67,74,112,89,88,81,105,79,106,69,50,79,68,65,49,77,84,69,50,78,84,65,115,73,109,86,52,99,67,73,54,77,84,89,52,77,68,85,120,79,68,103,49,77,72,48,46,107,97,113,74,66,99,114,104,78,109,80,102,100,51,105,72,77,89,81,104,105,55,115,119,98,115,110,102,117,113,111,111,117,55,72,100,67,119,72,57,70,70,111,34,44,34,100,97,116,97,34,58,123,34,112,97,103,101,34,58,49,44,34,115,101,97,114,99,104,75,101,121,119,111,114,100,34,58,34,34,44,34,115,111,114,116,66,121,34,58,34,100,101,102,97,117,108,116,34,44,34,115,111,114,116,66,121,68,101,115,99,34,58,102,97,108,115,101,44,34,104,105,100,101,73,110,97,99,116,105,118,101,34,58,102,97,108,115,101,125,125]},"msg":"received","time":"2023-04-03T09:26:54.160Z","v":0} binance-bot | {"name":"binance-api","version":"0.0.97","hostname":"d349291fcdd1","pid":45,"gitHash":"unspecified","server":"websocket","payload":{"command":"latest","authToken":"hiddenforGitHub","data":{"page":1,"searchKeyword":"","sortBy":"default","sortByDesc":false,"hideInactive":false}},"correlationId":"f3e96b4b-3020-430b-9e02-ac6c2fda4379","tag":"verifyAuthenticated","level":30,"authToken":"hiddenforGitHub","jwtSecret":"hiddenforGitHub","msg":"Verifying authentication","time":"2023-04-03T09:26:54.161Z","v":0} binance-bot | {"name":"binance-api","version":"0.0.97","hostname":"d349291fcdd1","pid":45,"gitHash":"unspecified","server":"websocket","payload":{"command":"latest","authToken":"hiddenforGitHub","data":{"page":1,"searchKeyword":"","sortBy":"default","sortByDesc":false,"hideInactive":false}},"correlationId":"f3e96b4b-3020-430b-9e02-ac6c2fda4379","tag":"verifyAuthenticated","level":30,"err":{"message":"invalid signature","name":"JsonWebTokenError","stack":"JsonWebTokenError: invalid signature\n at /srv/node_modules/jsonwebtoken/verify.js:133:19\n at getSecret (/srv/node_modules/jsonwebtoken/verify.js:90:14)\n at Object.module.exports [as verify] (/srv/node_modules/jsonwebtoken/verify.js:94:10)\n at verifyAuthenticated (/srv/dist/server.js:1:61933)\n at runMicrotasks ()\n at processTicksAndRejections (internal/process/task_queues.js:95:5)\n at async WebSocket. (/srv/dist/server.js:1:104923)"},"msg":"Failed authentication","time":"2023-04-03T09:26:54.162Z","v":0} binance-bot | {"name":"binance-api","version":"0.0.97","hostname":"d349291fcdd1","pid":45,"gitHash":"unspecified","server":"cronjob","level":30,"jobName":"trailingTradeIndicator","msg":"Task is running, skip this tick","time":"2023-04-03T09:26:55.001Z","v":0}

dasbts commented 1 year ago

So I tried again out of curiousity, letting it go for longer when doing docker-compose up before disconnecting internet. It seem the text being in green was just random and I could now see Binance data coming through again on reconnect. Thus, it seems like we could try to apply the same logic as was made in https://github.com/ViewBlock/binance-api-node/pull/49 for the frontend in, if that is possible.

So to clarify, it seems like the bot resumes after disconnect as data is coming in like we want again, but the GUI doesn't seem to understand that it has been reconnected, at least not for the Binance data. Tradingview still updating its signals after reconnect might have been an illusion, will keep monitoring.

EDIT: The Tradingview indicators and values (when expanding the TV indicators) are definitely resuming after reconnecting to internet, but nothing else is.

dasbts commented 1 year ago

I believe this might be the .js file responsible for socketing the data we are losing upon lost internet. ./app/server-binance.js This uses PubSub-js NPM library. I'm then wondering how to approach this best. PubSub doesn't seem to have for example ws.on('XYZ') which allows us to reconnect.

dasbts commented 1 year ago

I have managed to make this work now, but before making a PR and commit I will need to test a bit more as it is a very dirty solution, but the only way I see possible (at the moment at least) due to the modularity of the code. What I am doing is just utilizing the error-handler, and since we have uncaught exception when losing internet, we can just reload before sending the slack message.

However, this makes my server freeze 1-2 times for some minute due to it somehow being quite resource heavy when cleaning up all the sockets. At least if this works, this means quite a improvement in terms of safety mechanism as it would erase the requirement of making cronjobs for when the bot messes up. I think it might also be good to improve the handling by having this happen only on error messages received when losing connectivity such as "EAI_AGAIN", "connection" (for "WebSocket was closed before the connection was established"), "redlock" (as this rarely happens, but when it does, it starts doing it often), or even memory issues "ENOBUFS" and "memory". The latter if debugging and Javascript heap out of memory which can happen when debugging.

Now, I would think for this to work, it is required to not have infinite retries with Redlock, but for example 60 retries 1 second each, and if not, it would give uncaught exception for redlock not able to fully release the lock.

dasbts commented 1 year ago

Have been testing for some days and it's running stable now, previously since the v0.0.97 upgrade it broke every day sometimes multiple times either because of the server running at high capacity, or because internet was lost. So my changes which I think of making a PR for ASAP seem to have made it better, but still, it is not reconnecting to the tickers (says last update was more than a minute ago on all) after pulling the ethernet cable for more than a very brief moment.

My main changes other than updating redlock (and migrating the minor code change that was required to update) is in server-binance.js add after the await setupBinance(logger): ` process.on('unhandledRejection', err => { // we will throw it to handle it in the exception throw err; });

process.on('uncaughtException', async err => { logger.error({ err }); // For the redlock fail if (err.message.includes('quorum')) { // Simply ignore return; }

if (err.message.includes('WebSocket was closed before the connection was established')) {
  // Simply ignore
  return;
}
await new Promise(resolve => setTimeout(resolve, 60000));
await setupBinance(logger);
await Promise.all([
  runCronjob(logger)
]);

});`

Will keep digging on approach to get the frontend to also resync if it's not just a brief reconnect (which I think reconnecting-websocket is handling in the binance-api-node package) I have seen that PubSub can maybe be useful, but even when adding PubSub.cleanallsubscriptions() before it starts the setupBinance it still doesn't update upon reconnect.

I'm starting to wonder if the issue is that the /binance-trading-bot/public/js/CoinWrapperAction.js does not understand that we have a new websocket. The data is coming in as it's supposed to in terminal after reconnect.

Have also ofc. tried ws.onclose and ws.on.('close' ...) in the individual websocket files such as tickers.js, but without success.

dasbts commented 1 year ago

Also tried this without success. At connectWebSocket() { ... } I changed from const instance = new WebSocket to:

    const instance = new ReconnectingWebSocket(config.webSocketUrl, [], {
      WebSocket: WebSocket,
      connectionTimeout: 4e3,
      debug: false,
      maxReconnectionDelay: 10e3,
      maxRetries: Infinity,
      minReconnectionDelay: 4e3
    })

Also required importing it in index.html as: <script src="https://unpkg.com/reconnecting-websocket@4.4.0/dist/reconnecting-websocket-iife.min.js" crossorigin></script> Sadly didn't fix it.

dasbts commented 1 year ago

Currently testing with this approach: /binance-trading-bot/public/Index.html: <script src="https://unpkg.com/reconnecting-websocket@4.4.0/dist/reconnecting-websocket-iife.min.js" crossorigin></script> /binance-trading-bot/public/js/App.js:

  connectWebSocket() {
    const instance = new ReconnectingWebSocket(config.webSocketUrl, [], {
      WebSocket: WebSocket,
      connectionTimeout: 30e3,
      debug: false,
      maxReconnectionDelay: 30e3,
      maxRetries: Infinity,
      minReconnectionDelay: 1e4,
      maxEnqueuedMessages: Infinity
    })
    //const instance = new WebSocket(config.webSocketUrl);

    instance.onclose = () => {
      console.log('Socket is closed. Reconnect will be attempted in 1 second.');

      this.toast({
        type: 'info',
        title: 'Disconnected from the bot. Reconnecting...'
      });
      // self.setState(prevState => ({
      //   webSocket: {
      //     ...prevState.webSocket,
      //     connected: false
      //   }
      // }));

      setTimeout(function () {
        //self.connectWebSocket();
        instance.reconnect()
      }, 1000);
    };
  }

Will update if it works or not ASAP.

dasbts commented 1 year ago

Regarding the change above, it is supposed to be the other way around on the min and max values.

Either way, I couldn't make it work due to the modularity and how things are called. I did figure out tho, that the main issue is that /binance-trading-bot/public/js/CoinWrapperAction.js is not receiving any data after the connection is dropped, even if the websocket seemingly reconnects, but I have yet to find which part of the back-end is sending data to this CoinWrapperAction.js. If someone knows, please let me know and I will try on the evenings ASAP to make it retry the back-end -> front-end connection if there haven't been a change in a bit longer time (if that is even possible to achieve).

Something I did however figure out, is that /binance-trading-bot/app/error-handler.js did not have the ignores in the process.on('uncaughtException' ... ) I added the following as it was still getting errors even when ignored in the process.on('unhandledRejection' ...) and the const handleError.

if (err.message.includes('redlock')) {
  // Simply ignore
  return;
}
if (err.message.includes('quorum')) {
  // Simply ignore
  return;
}
if (err.message.includes('ENOBUFS')) {
  // Simply ignore
  return;
}
if (err.message.includes('WebSocket was closed before the connection was established')) {
  // Simply ignore
  return;
}
if (err.message.includes('connection') && err.message.includes('closed')) {
  // Your code here
  return;
}

This made it run much more smooth, especially after upgrading "ws", "binance-api-node", and "redlock" to the latest versions. However, it also led me to believe that something is causing an uncaughtException which seemingly breaks the front-end to the token websocket connection.

I did add a cronjob to restart the bot every 3 hours, but even then, I get some sockets closing after just minutes, which in my case is not that bad, but with like 50 positions that I am trying to break-even from previous crashes, some of those might miss out on opportunities to close. So it's a big problem that at absolutely random time, some symbols stop working, while others can keep on for days.

chrisleekr commented 1 year ago

@dasbts The issue you had originally was JsonWebTokenError: invalid signature

It means your browser has the wrong JWT signature, which is very unusual unless you try to manipulate it. All you need to do is just clearing the cookie.

I will close the issue for now, let me know if you have another issue.

dasbts commented 1 year ago

@chrisleekr well, it is never working after internet is lost no matter if clearing any cookie or not.

However, I have made my internet router more stable as a "solution" and cronjob to restart every 6h.

The issue still exists but I have given up after investigating the code and tried reconnecting websockets at all places where a socket is created.

chrisleekr commented 1 year ago

@dasbts Are you saying you keep getting the error JsonWebTokenError: invalid signature? Or a different error?

dasbts commented 1 year ago

I'm getting the issue that either my tickers (or most of them) and the "Action" timestamp stop working randomly up until next time I restart the bot, and now also from the tradingview signals I see the message that this data is older than 5 minutes. This is in the GUI tho, but it seem to not do any trades at all after this happens. The tradingview signals message seem to still allow selling tho, but no buy slack messages are coming after that happens.

The error you mention and the JSON I sent in the initial post is just something that shows after disconnecting the internet connection and waiting about a minute or two, then reconnecting, then sometimes I don't get any output at all other than these messages as well as something related to the mongo DB, or I get what seem like normal JSON from some ticker (usually colored in green text for some reason) but with nothing updated to the GUI.

dasbts commented 1 year ago

I have managed to check by adding a Date.now() in the miniticker at the tickers.js, and used the PubSub frontend notification to show at 2 minute interval when it was updated, and also I am seeing that it reconnects if there was no update for 10 minutes, this logic works well, but it seem to lose the frontend connection as this just stops whenever anything is sent to the error-handler.js (even if we just return/ignore).

I did also try to do this without success (or well, I saw it working by clearing the card, same for TradingView in separate approach, but it didn't help this problem):

        try {
          cache.hgetall('trailing-trade-symbols:', `trailing-trade-symbols:${monitoringSymbol}*`)
            .then(keys => {
              return Promise.all(
                Object.keys(keys).map(key => cache.hdel('trailing-trade-symbols', key))
              );
            });
        } catch {
          //
        }

I have made sure to clean when resetting the websocket with:

    intervalId[monitoringSymbol] = setInterval(() => {
      const minutesAgo = Date.now() - 10 * 60 * 1000;
      // Restart socket if 10 minutes passed since last update
      if (eventTimes[monitoringSymbol] < minutesAgo) {
        try {
          websocketTickersClean[monitoringSymbol]();
        } catch {
          //
        }
        PubSub.publish('reset-symbol-websockets', monitoringSymbol);
        console.log('Resetting websocket ' + monitoringSymbol);
        eventTimes[monitoringSymbol] = Date.now();
        try {
          clearInterval(intervalId[monitoringSymbol]);
          delete websocketTickersClean[monitoringSymbol];
          delete intervalId[monitoringSymbol];
        } catch {
          //
        }
      }
    }, 120000);
dasbts commented 1 year ago

I also added a restart with the same terminal command as in restore.sh, pkill -f node, which is available through a node imported plugin const { execSync } = require('child_process');

This allows when the following error codes happen: err.code === -1001 || err.code === -1021 || //This is recvwindow, if NTP client is correctly setup on the bot server it shouldn't give this error often, if at all. err.code === 'ECONNRESET' || err.code === 'ECONNREFUSED' || err.code === 'ENOBUFS'

ENOBUFS is memory issue error, which most of you might not have, but for me with 6 GB and zswap (definitely recommend, big performance boost on Ubuntu Debian) it triggered the same behaviour, and multiple times a day. If you encounter this, first try to set export NODE_OPTIONS="--max-old-space-size=5780" (or corresponding environment variable in Windows).

dasbts commented 1 year ago

I also had to add to mongo-state-storage.js first off a try-catch error handling for the connection itself, but also before that, do the following:

    try {
      // Close any pre-existing Mongo connections
      MongoClient.close();
    } catch (err) {
      // Ignore if already closed
    }

This is to ensure that if the whole bot restarts (which I could not do through PubSub, by the way, as it only restarted the sockets, not the frontend) when something more significant happens to the bot.

Restarting the sockets through the interval seems to work, but not always, which I'm currently testing and investigating. Hopefully, these heavier resets caused the edge cases when it continues the socket in the background (which I confirmed in the last two posts work as seen in PubSub notifications of updated timestamps and successful try of ) but not the frontend. Why I believe so is because I have observed different restarting behaviors depending on what caused it.

Still got one mystical error that I need to investigate, which started coming up after adding a try of closing the MongoDB before connecting: ReferenceError: job is not defined at process. (/srv/dist/server.js:1:92891) at process.emit (events.js:400:28) at process._fatalException (internal/process/execution.js:167:25)

Except for that last error, I believe it is way better than when I started looking into this. Will create a PR ASAP.

EDIT: LMAO, focusing so much on my daily job, obviously the error is because I copy pasted the slack code. But I'll come back with update on testing and evaluating as it likely impeded the restarts now.

dasbts commented 1 year ago

While allowing a full restart with the pkill -f node command helped A LOT, I'm still seeing that the GUI sometimes loses the connection to the backend Binance miniTicker, however, as confirmed previously, I'm getting it to reconnect in the background.

In my initial investigation I tried to use reconnecting websocket on the frontend sockets, and I think I got it to work by importing reconnecting-websockets and replacing the frontend websocket, so might look that way again. However, when I did PubSub to restart the websocket for that symbol (which was quite stupid, but I had to test it) in the front-end action timer, it actually restarted the GUI every time, but ofc. only on the page I had open, and as long as I had the GUI open.

paunandrei commented 11 months ago

@dasbts Hi, any news on this?Did you successfully tested it?