Open holgerpieta opened 3 years ago
Some more information I learned from the packet dump: from what I can see, there are up to three connections active at the same time. Probably one polling connection for each of the two bots I'm running and one used when messages are being sent to the bots. Usually something happens a couple of times per second, matching the default polling interval of 300 ms.
But: whenever the connection reset happens, all three connections have stopped transmitting data for something like 10 minutes or longer, sometimes hours. So I guess something somewhere went wrong and the polling stopped. Then, after a while, the Telegram server closes the connections because they idled for too long. If telegram-bot-api then tries to send a message using the hours-old connection, the server doesn't know where to put it and sends the reset. But that means the bot wouldn't have been receiving messages for a while anyway.
So I think I have to revise my analysis of the problem: it's not that sending reuses an old, long-forgotten connection, but that a polling fault is not detected and the bot stays offline without anyone noticing. But it may be that the problem is in the Node-RED bot part, which may ignore polling fault messages and fail to trigger a reconnection. Who would you say should be responsible for detecting and correcting such a problem? The API or the code using the API?
Erm, sorry about the noise, I just found the documentation about reacting to polling errors. From what I can see this should be done already in the code using the API, but maybe something is wrong there. We'll come back when we know if the problem is in the user code or in the API.
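For anyone following along, here is a minimal sketch of what reacting to polling errors in user code might look like. It assumes the bot object emits a polling_error event whose payload has code and message fields, as in the EFATAL log lines quoted in this thread. The helper names describePollingError and attachPollingLogger are made up for illustration and are not part of any library.

```javascript
// Hypothetical helper: turns a polling error into a one-line log entry.
// The codes EFATAL, EPARSE and ETELEGRAM are the ones node-telegram-bot-api
// documents; anything else is reported as UNKNOWN.
function describePollingError(error) {
  const code = error && error.code ? error.code : "UNKNOWN";
  const message = error && error.message ? error.message : String(error);
  const hints = {
    EFATAL: "network-level failure; the connection may be down",
    EPARSE: "response could not be parsed",
    ETELEGRAM: "Telegram API returned an error",
  };
  return `[polling_error] ${code}: ${message} (${hints[code] || "unclassified"})`;
}

// Hypothetical helper: subscribes a logger to the bot's polling_error event.
// `bot` only needs an on() method, so any EventEmitter works for testing.
function attachPollingLogger(bot, log = console.warn) {
  bot.on("polling_error", (error) => log(describePollingError(error)));
}
```

With a real bot this would be called once after construction, e.g. attachPollingLogger(bot). If nothing is ever logged while the bot is offline, that suggests the event is never emitted.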
OK, here we go again: we are by now quite sure that telegram-bot-api somehow misses the lost connection and never sends out the polling_error message.
Any hints how we best debug this?
any news about this?
Haven't been able to debug this any further, but it is still happening with node-red-contrib-telegrambot 9.4.3, which uses node-telegram-bot-api 0.52.0. You can see the code that should handle the polling error here: https://github.com/windkh/node-red-contrib-telegrambot/blob/a771b3489fe83a1901cc0c84e18657aba1c497f2/telegrambot/99-telegrambot.js#L219 As far as I can see, this should be working, but apparently it's not, because there are no traces of it visible in the log files. So my current conclusion is that the API for some reason never detects the polling error or never sends out the polling_error message. Then of course the node red bot never has a chance to do something. If you have any ideas how to debug this any further, let me know and I'll give it a try.
What's the workaround for this? Is there a way to reestablish connection? I tried creating a new instance of bot every time I need it, but for some reason I still get this error after around 2 hours of it running.
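One possible workaround, sketched under assumptions rather than tested against the real library: since the stalled polling apparently goes unreported, a watchdog can restart polling whenever no update has been seen for a while. The sketch assumes the bot exposes on(), stopPolling() and startPolling() (node-telegram-bot-api instances do). The helper name createIdleWatchdog and its options are hypothetical. It only listens for message and polling_error events, so a quiet chat will also trigger a restart, which should be harmless.

```javascript
// Hypothetical helper: restarts polling when nothing has been received for
// `maxIdleMs`. `now` is injectable so the logic can be tested with a fake clock.
function createIdleWatchdog(bot, { maxIdleMs = 10 * 60 * 1000, now = Date.now } = {}) {
  let lastActivity = now();
  let restarts = 0;

  // Any incoming message or reported polling error shows the loop is alive.
  const touch = () => {
    lastActivity = now();
  };
  bot.on("message", touch);
  bot.on("polling_error", touch);

  // Call check() from a timer. Resolves to true if polling was restarted.
  async function check() {
    if (now() - lastActivity < maxIdleMs) {
      return false;
    }
    await bot.stopPolling();
    await bot.startPolling();
    restarts += 1;
    lastActivity = now();
    return true;
  }

  return { check, getRestarts: () => restarts };
}
```

Usage would be something like: const watchdog = createIdleWatchdog(bot); setInterval(() => watchdog.check().catch(console.error), 60 * 1000);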
I still see the problem every now and then:
7 Apr 04:42:58 - [warn] [telegram bot:be37e9ae.15b438] EFATAL: Error: read ECONNRESET
7 Apr 04:42:58 - [warn] [telegram bot:be37e9ae.15b438] Network connection may be down. Trying again.
But it doesn't seem to cause any problems anymore; at least I do not see any failed message sends.
Mostly the problem went away when I switched my podman containers to host-network. I had to do that for other reasons and I do not like it, but I did not find the time to fix it.
If you want to, you can try automatic resending with this flow:
[
{
"id": "df053379.7f5398",
"type": "telegram sender",
"z": "43c781d9.ddbfa",
"name": "",
"bot": "be37e9ae.15b438",
"haserroroutput": true,
"outputs": 2,
"x": 590,
"y": 80,
"wires": [
[],
[
"82a4edf0.6199e8"
]
]
},
{
"id": "4d842a0e.6df4c4",
"type": "delay",
"z": "43c781d9.ddbfa",
"name": "",
"pauseType": "rate",
"timeout": "5",
"timeoutUnits": "seconds",
"rate": "1",
"nbRateUnits": "5",
"rateUnits": "second",
"randomFirst": "1",
"randomLast": "5",
"randomUnits": "seconds",
"drop": false,
"outputs": 1,
"x": 540,
"y": 140,
"wires": [
[
"df053379.7f5398"
]
]
},
{
"id": "82a4edf0.6199e8",
"type": "function",
"z": "43c781d9.ddbfa",
"name": "Retry",
"func": "if( msg.error ) {\n if( msg.error.includes( \"ECONNRESET\" ) ) {\n if( msg.retries ) {\n if( msg.retries > 3 ) {\n msg.my_error = \"ECONNRESET, but retries > 3, so not retrying.\";\n return [ null, msg ];\n }\n else {\n msg.retries++;\n msg.my_error = \"ECONNRESET, retrying...\";\n msg_retry = RED.util.cloneMessage( msg );\n delete msg_retry.error;\n return [ msg_retry, msg ];\n }\n }\n else {\n msg.retries = 1;\n msg.my_error = \"ECONNRESET, retrying...\";\n msg_retry = RED.util.cloneMessage( msg );\n delete msg_retry.error;\n return [ msg_retry, msg ];\n }\n }\n else {\n msg.my_error = \"Other error.\";\n return [ null, msg ];\n }\n}\nelse {\n msg.my_error = \"Telegram sent failed, but no error message.\";\n return [ null, msg ];\n}\n",
"outputs": 2,
"noerr": 0,
"initialize": "",
"finalize": "",
"x": 770,
"y": 200,
"wires": [
[
"4d842a0e.6df4c4"
],
[
"edef2391.a6805"
]
]
},
{
"id": "be37e9ae.15b438",
"type": "telegram bot",
"botname": "XXX",
"usernames": "",
"chatids": "",
"baseapiurl": "",
"updatemode": "polling",
"pollinterval": "300",
"usesocks": false,
"sockshost": "",
"socksport": "6667",
"socksusername": "anonymous",
"sockspassword": "",
"bothost": "",
"botpath": "",
"localbotport": "8443",
"publicbotport": "8443",
"privatekey": "",
"certificate": "",
"useselfsignedcertificate": false,
"sslterminated": false,
"verboselogging": true
}
]
I'm getting the same error as above. Telegram is banned in our country, and I am also connecting through a VPN, but I still get this error. How can I resolve it? If anyone can solve this, please let me know. Here is the code:
const TelegramBot = require("node-telegram-bot-api");
const token = "Enter_Your_Token";
const bot = new TelegramBot(token, { polling: true });
bot.on("message", (msg) => {
  const chatId = msg.chat.id;
  const messageText = msg.text;
  if (messageText === "/start") {
    bot.sendMessage(chatId, "Welcome to the bot!");
  }
});
Here is the error:
error: [polling_error] {"code":"EFATAL","message":"EFATAL: Error: read ECONNRESET"}
The issue actually occurred in https://github.com/windkh/node-red-contrib-telegrambot and has been reported in https://github.com/windkh/node-red-contrib-telegrambot/issues/172 but connection management is not actually done there, so I think the underlying problem is somewhere in here.
A while ago I noticed that messages were randomly not sent out to Telegram; the error message is ECONNRESET.
I traced the problem with tcpdump and found the following sequence of events:
So I guess the problem is that telegram-bot-api tries to reuse a connection that may be hours old and that the server has long forgotten. Either this needs to be changed to always make a new connection (or at least to do so if the old connection is too old), or, alternatively, sending must be retried at least once after ECONNRESET. Implementing keep-alive packets (if supported by Telegram) might also be a solution.
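To illustrate the "retry at least once after ECONNRESET" idea, here is a rough sketch of a wrapper in user code. It is not how telegram-bot-api does or should implement it. The name sendWithRetry and its options are hypothetical. It assumes the failure surfaces as a rejected promise whose code or message contains ECONNRESET, as in "EFATAL: Error: read ECONNRESET".

```javascript
// Hypothetical helper: calls `send` and retries it when the failure looks
// like a connection reset. Other errors are rethrown immediately.
async function sendWithRetry(send, { retries = 1, delayMs = 0 } = {}) {
  for (let attempt = 0; ; attempt += 1) {
    try {
      return await send();
    } catch (err) {
      // Look at both the error code and the message text for ECONNRESET.
      const code = String((err && err.code) || "");
      const message = String((err && err.message) || err);
      const isReset = `${code} ${message}`.includes("ECONNRESET");
      if (!isReset || attempt >= retries) {
        throw err;
      }
      if (delayMs > 0) {
        await new Promise((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
}
```

With a real bot it would wrap the send call, e.g. sendWithRetry(() => bot.sendMessage(chatId, "hello")). Whether the retry ends up on a fresh connection depends on the underlying HTTP client, which this sketch does not control.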