Closed the-jolley-boy closed 1 year ago
In the HTTP request code dpy uses, you can see that any 2xx response code is valid and allows the request to continue.
You can also see that we handle 5xx error codes with an exponential backoff retry (scaling up the time between retries). The 4xx error codes are raised properly.
If your hunch is correct, then it would be because of a 5xx error code, in which case we retry with a very small delay (1, 2, 4, 8, 16 seconds). If it's a 4xx (an error), we return that error.
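As a rough illustration of the status handling described above, here is a minimal sketch. It is not discord.py's actual implementation; `request_with_backoff`, `send`, and the injected `sleep` are hypothetical names chosen for this example, and the delays simply follow the 1, 2, 4, 8, 16 second pattern mentioned in the comment.

```python
import time
from typing import Callable


def request_with_backoff(
    send: Callable[[], int],
    max_tries: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call `send` until it returns a non-5xx status or the tries run out."""
    for attempt in range(max_tries):
        status = send()
        if 200 <= status < 300:
            return status  # any 2xx counts as success
        if 400 <= status < 500:
            # 4xx is reported to the caller instead of being retried
            raise RuntimeError(f"client error {status}")
        if status >= 500:
            sleep(2 ** attempt)  # waits 1, 2, 4, 8, 16 seconds
            continue
        raise RuntimeError(f"unexpected status {status}")
    raise RuntimeError("server kept failing after all retries")
```

Passing `sleep` as a parameter is only a convenience so the retry schedule can be inspected without actually waiting.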
Your try/except here catches a bare exception and then prints only the error, not the traceback, which can hamper debugging. Please fix that and send any traceback you do get here.
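One way to keep the traceback, sketched with the standard library only. The logger name `bot.dm` and the `risky` function are hypothetical stand-ins for the real code, which was not shown in full.

```python
import logging
import traceback

log = logging.getLogger("bot.dm")  # hypothetical logger name


def risky() -> None:
    # Stand-in for the real call, such as an awaited User.send().
    raise ValueError("could not deliver DM")


def run_and_report() -> str:
    try:
        risky()
    except Exception:
        # logging.exception records the message plus the full traceback.
        log.exception("Sending the DM failed")
        # format_exc returns the same traceback as a string, e.g. for pasting.
        return traceback.format_exc()
    return ""
```

Compared with `print(e)`, both calls keep the file names and line numbers that show where the exception came from.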
If there were a wider problem with the way DMing users with errors happens, we'd hear about it far more often than a 1-3 times per day incident. That makes me hesitant to believe this is the fault, and I suspect the issue exists in your code.
Without seeing more real code we can't advise much further. If you want to test more, you can try running with debug logging for a few hours until one of these issues hits, then upload the full log to a paste service and link it here (or upload it here directly, your choice).
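A minimal sketch of what debug logging to a file could look like, using only the standard `logging` module. The helper name `enable_debug_logging` and the default file name are assumptions for this example; `discord` is the logger name the library logs under.

```python
import logging


def enable_debug_logging(path: str = "discord_debug.log") -> logging.Logger:
    """Send DEBUG-level records from the `discord` logger to a file."""
    logger = logging.getLogger("discord")
    logger.setLevel(logging.DEBUG)
    handler = logging.FileHandler(path, encoding="utf-8", mode="w")
    handler.setFormatter(
        logging.Formatter("%(asctime)s [%(levelname)s] %(name)s: %(message)s")
    )
    logger.addHandler(handler)
    return logger
```

Calling this once before the bot starts should be enough; the resulting file is the one to upload.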
More than happy to help with sufficient evidence to go on.
To be honest the code here is a little wild.. why are you storing `user#discrim` in the database?
You should instead store their IDs as the postgresql `BIGINT` type.
This means when you fetch that data back you can use `Guild.get_member()` which is much faster than `utils.find()`.
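To illustrate why an ID lookup tends to be faster than a predicate search, here is a small sketch with plain Python objects. It assumes the member cache behaves like a dictionary keyed by ID; `Member`, `find_by_name`, and `get_by_id` are hypothetical stand-ins, not discord.py classes or functions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Member:  # hypothetical stand-in, not discord.Member
    id: int
    name: str


members = [Member(100 + i, f"user{i}") for i in range(5)]
by_id = {m.id: m for m in members}  # cache keyed by ID


def find_by_name(name: str) -> Optional[Member]:
    # Linear scan: every member may be checked before a match is found.
    return next((m for m in members if m.name == name), None)


def get_by_id(member_id: int) -> Optional[Member]:
    # Single dictionary lookup, independent of how many members exist.
    return by_id.get(member_id)
```

The scan also breaks as soon as a name changes, whereas the ID stays the same.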
Your `logging.info("Opened database successfully")` is misleading as this logs before any connection is even attempted. If you relied on this it would lead you on a wild goose chase.
You have a `check` function that appears unused...?
As for the `User.send`, I'll need a traceback to advise further!
Yeah, some of the code I just copied from an old function and haven't gotten rid of yet. I was also actually looking to change to guild.get_member() instead. I'll send the updated function. As for the ID in the db, it was originally like that so I could see the name of the member that did the survey, but of course I can easily get that. The db is already set up with the username so I don't think I will change that.
Usernames and discrims can change, whereas if you store the ID this data is always `get`able or `fetch`able on demand from the ID.
I am aware of that yes
So the bot failed again but there was no issue with regards to what was initially in question.
The issue is that my code has been working flawlessly for years, and this most recent iteration of the code was working without fail for over 3 months. How is it that it seemingly fails around the time my server introduces the new Discord feature that they are rolling out?
I highly doubt my code is causing this issue; I think it is in fact some issue with discord.py and the new Discord feature updates. There is no way that code working for months fails due to itself without some library failure outside my control.
To add, it is seemingly dying for no reason at all. There is nothing in the audit log between restarting it and it failing again, and no log of any of the functions in the program being called. It has to be some issue related to Discord and discord.py.
Which library version are you on currently?
`print(discord.__version__)` within Python or `python -m discord -v` from your shell will tell you.
There was a crash bug semi-recently that was fixed in v2.2.1; if you are on a version before that, then that could be the issue.
However, I will note that most crash bugs do have a traceback involved, so your logging/error solution seems to be suppressing things.
I'm using discord.py v2.2.2-final but I had the issue before and I was using some older version then updated to see if that would fix it. There is a traceback when I force stop the bot when I see that the bot is offline, but it's overall useless and says nothing about the issue. I doubt the log is really suppressing anything. Typically this is the error I get.
In the CMD:
`--- Logging error ---
Traceback (most recent call last):
File "C:\Users\Administrator\AppData\Local\Programs\Python\Python310\lib\site-packages\discord\gateway.py", line 175, in run
f.result(10)
File "C:\Users\Administrator\AppData\Local\Programs\Python\Python310\lib\concurrent\futures_base.py", line 447, in result
raise TimeoutError()
concurrent.futures._base.TimeoutError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Users\Administrator\AppData\Local\Programs\Python\Python310\lib\logging__init.py", line 1098, in emit
msg = self.format(record)
File "C:\Users\Administrator\AppData\Local\Programs\Python\Python310\lib\logging__init.py", line 942, in format
return fmt.format(record)
File "C:\Users\Administrator\AppData\Local\Programs\Python\Python310\lib\logging__init.py", line 678, in format
record.message = record.getMessage()
File "C:\Users\Administrator\AppData\Local\Programs\Python\Python310\lib\logging\init.py", line 368, in getMessage
msg = msg % self.args
TypeError: not enough arguments for format string
Call stack:
File "C:\Users\Administrator\AppData\Local\Programs\Python\Python310\lib\threading.py", line 966, in _bootstrap
self._bootstrap_inner()
File "C:\Users\Administrator\AppData\Local\Programs\Python\Python310\lib\threading.py", line 1009, in _bootstrap_inner
self.run()
File "C:\Users\Administrator\AppData\Local\Programs\Python\Python310\lib\site-packages\discord\gateway.py", line 186, in run
_log.warning(msg, self.shard_id, total)
Message: 'Shard ID %s heartbeat blocked for more than %s seconds.\nLoop thread traceback (most recent call last):\n File "C:\Users\Administrator\main.py", line 2336, in
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Users\Administrator\AppData\Local\Programs\Python\Python310\lib\logging__init.py", line 1098, in emit
msg = self.format(record)
File "C:\Users\Administrator\AppData\Local\Programs\Python\Python310\lib\logging__init.py", line 942, in format
return fmt.format(record)
File "C:\Users\Administrator\AppData\Local\Programs\Python\Python310\lib\logging__init.py", line 678, in format
record.message = record.getMessage()
File "C:\Users\Administrator\AppData\Local\Programs\Python\Python310\lib\logging\init.py", line 368, in getMessage
msg = msg % self.args
TypeError: not enough arguments for format string
Call stack:
File "C:\Users\Administrator\AppData\Local\Programs\Python\Python310\lib\threading.py", line 966, in _bootstrap
self._bootstrap_inner()
File "C:\Users\Administrator\AppData\Local\Programs\Python\Python310\lib\threading.py", line 1009, in _bootstrap_inner
self.run()
File "C:\Users\Administrator\AppData\Local\Programs\Python\Python310\lib\site-packages\discord\gateway.py", line 186, in run
_log.warning(msg, self.shard_id, total)
Message: 'Shard ID %s heartbeat blocked for more than %s seconds.\nLoop thread traceback (most recent call last):\n File "C:\Users\Administrator\main.py", line 2336, in
In the Log:
[ERROR] - Attempting a reconnect in 0.94s
Traceback (most recent call last):
File "C:\Users\Administrator\AppData\Local\Programs\Python\Python310\lib\site-packages\discord\client.py", line 659, in connect
await self.ws.poll_event()
File "C:\Users\Administrator\AppData\Local\Programs\Python\Python310\lib\site-packages\discord\gateway.py", line 646, in poll_event
raise ConnectionClosed(self.socket, shard_id=self.shard_id, code=code) from None
discord.errors.ConnectionClosed: Shard ID None WebSocket closed with 1000
The error log there is interesting to me, this definitely is our heartbeat logging but it's... broken? You're blocking the bot's heartbeat, likely with blocking code, and that's generating useless error logs here. You should try and resolve this heartbeat issue - but it's not the issue here.
You're getting disconnected and reconnected from the websocket, which can indicate more blocking code, which actually may be the cause here.
You're free to set up debug logging for a time and send the output of that here.
Added note: that logging traceback shouldn't exist, it's expecting two arguments and getting two arguments. Weird.
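For reference, the `TypeError` in that traceback comes from Python's `%` formatting. The sketch below shows one general way the error can appear even when two arguments are supplied: the template ends up containing more `%` directives than expected. Whether that is what happened in the log above is not established here; the appended source line is purely hypothetical.

```python
template = "Shard ID %s heartbeat blocked for more than %s seconds."
args = (None, 10)

# Two placeholders, two arguments: formatting succeeds.
ok = template % args

# Hypothetical: extra text containing a %-directive is appended to the template.
extended = template + "\n  label = 'progress: %d'"

try:
    extended % args
    error = None
except TypeError as exc:
    # A third directive is found after both arguments were consumed.
    error = str(exc)
```

Here `ok` is the expected message, while `error` holds the same "not enough arguments for format string" text seen in the log.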
I am just confused about how my code could be blocking. I looked at the server logs, and the only function that could have been used during one period where it went offline was on_message(), which would have only triggered but not completed anything since there were no messages in the monitored channels.
I've also made sure my code is 100% non-blocking; all the libraries I use and such are set up with async.
I will set up debug logging and see, but given that I use much of the same code in other bots and never have issues, it's just very odd that my code would have anything blocking. What debugger do you suggest? And will my current log get in the way?
This issue is just very very odd.
There are a couple of things here that don't match expectations verifiable in the code from discord.py. I'm trying to look for any potential source of interference, and
some additional info may help diagnose this:
python -m discord -v
python -m pip freeze
@mikeshardmind
`class Client(discord.Client):
    def __init__(self):
        super().__init__(intents = intents)
        self.synced = False
        asyncio.set_event_loop_policy( asyncio.WindowsSelectorEventLoopPolicy() )

    async def on_ready(self):
        await self.wait_until_ready()
        if not self.synced:
            await client.sync()
            self.synced = True
        print(f'{self.user} has connected to Discord!')`
The issue with the bot happened before I set the event loop policy, but I needed that to make psycopg asynchronous.
2.
Not to be a stickler, but can you paste your entire bot file, or upload it (both without the token, and only the token removed).
Aside from issues in which `psycopg` is not great, and your environment is polluted with a lot of packages, there isn't enough to go on here.
Also: code blocks are in the following format:- ```py # code here ```
First and foremost I must say the overall design of this code is... questionable at best. The amount of nested functions is a bit baffling and poor design.
Secondly, `psycopg2` on Windows is notably bad. It's known for numerous event loop issues.
Thirdly, I do see quite a lot of blocking code, things like `csv.writer()` and the like. This could be contributing to the blocking going on.
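One common way to keep blocking work such as CSV writing off the event loop is to run it in a worker thread. This is a minimal sketch, not taken from the bot's code; `rows_to_csv` and `export_rows` are hypothetical names, and `asyncio.to_thread` requires Python 3.9 or newer.

```python
import asyncio
import csv
import io


def rows_to_csv(rows: list[list[str]]) -> str:
    # Ordinary blocking work: build CSV text with the csv module.
    buffer = io.StringIO()
    csv.writer(buffer).writerows(rows)
    return buffer.getvalue()


async def export_rows(rows: list[list[str]]) -> str:
    # Run the blocking function in a worker thread so the event loop
    # stays free to handle other tasks, such as the gateway heartbeat.
    return await asyncio.to_thread(rows_to_csv, rows)


result = asyncio.run(export_rows([["id", "answer"], ["1", "yes"]]))
```

Inside a bot, the `await export_rows(...)` call would sit in the command or event handler; `asyncio.run` is only used here to make the sketch runnable on its own.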
I do not however see anything that would cause the library to fall over, so I expect there are other issues going on here. I've been running 2.2.2 for quite some time now and outside of user-code faults I've had nothing like this, and I'm in servers with all of the latest features. No reproduction possible.
Please change your logging level to debug until it dies again, this should show us what is going on.
Thanks for the follow-up with your environment info. That ruled out the ProactorEventLoop + psycopg issue, as well as a few known dependencies that have caused issues by modifying another package's namespace before, but it introduces a few other questions that don't have clear answers. As it stands, there's still what appears to be an impossible log line: the complaint is about the number of arguments, while the log shows that it in fact has the correct number of arguments. Other people on 2.2.2 have had this log emitted correctly.
@mikeshardmind yeah I am setting my logger to debug and will reply here when it fails again. Thanks.
It just went offline and it logged absolutely nothing, last time anything logged was when the bot started. Well I guess besides the message that was generated once I interrupted it in the CMD, but we already saw this error before.
[ERROR] - Attempting a reconnect in 0.43s
Traceback (most recent call last):
File "C:\Users\Administrator\AppData\Local\Programs\Python\Python310\lib\site-packages\discord\client.py", line 659, in connect
await self.ws.poll_event()
File "C:\Users\Administrator\AppData\Local\Programs\Python\Python310\lib\site-packages\discord\gateway.py", line 646, in poll_event
raise ConnectionClosed(self.socket, shard_id=self.shard_id, code=code) from None
discord.errors.ConnectionClosed: Shard ID None WebSocket closed with 1000
08-Apr-23 01:07:35 - [Line:550 - Function:received_message In:gateway.py From:discord.gateway]
If anyone is curious, it seems to have just been some issue with the CLI running the script, which was causing the bot to fail. That was very odd since I have run scripts for months without an issue like this. A simple server restart seemed to fix the issue.
Summary
I think there may be a bug either on Discord's end or in discord.py. I recently had the server my bot is in upgraded, and since then I get heartbeat issues that I believe originate there. I had the code running with 0 issues for 3+ months until the change.
Reproduction Steps
I don't really have any reproducible steps sadly, it is very niche. I have tried to reproduce the issue myself but cannot do it. I have tried multiple accounts, privacy settings for DM sending and all, and cannot reproduce it, but the heartbeat failing has led me to believe this is the cause.
I will attach the code in question.
I have added way too many prints to try to figure out the issue. I only just added them, so I need to wait for the error to flare up again in order to pinpoint it exactly, but typically it fails 1-3 times per day atm.
I will make sure to update with any more info.
Minimal Reproducible Code
Expected Results
It is supposed to either send a dm or fail and just print to console. It also will print if the ticket was not registered in my database but that doesn't really have any chance of being the reason of the failures.
Actual Results
Sometimes it works and sometimes it doesn't. My hunch is that it has something to do with a permission set by members with regard to sending a DM, and for some reason that attempt hangs when it should raise an exception and fail. My thought is that some change on Discord's end may not be compatible with how discord.py does it at the moment. I have tried to reproduce with different privacy settings and none have given me anything reproducible.
Intents
intents = discord.Intents().all()
System Information
discord.py 2.2.2
Checklist
Additional Context
No response