nexy7574 / nio-bot

Matrix bot library based on matrix-nio, similar in design to discord.py
https://docs.nio-bot.dev/
GNU Lesser General Public License v3.0
23 stars 5 forks source link

[BUG] sync loop silently swallowing errors in event callbacks #14

Closed nexy7574 closed 1 month ago

nexy7574 commented 6 months ago

Describe the bug According to logs, the finally clause of NioBot.run is being executed, despite my bot not being told to exit.

To Reproduce The bug appears to be related to this code - https://github.com/nexy7574/philip/blob/6c9a20396ceb80f2230c3e8f6864a3a20e347bd0/modules/discord_bridge.py#L655-L667

                if not avatar:
                    self.log.debug("Fetching %s avatar from matrix.", message.sender)
                    profile = await self.bot.get_profile(message.sender)
                    if isinstance(profile, nio.ProfileGetResponse):
                        if profile.avatar_url:
                            self.log.debug("Fetching avatar from %s", profile.avatar_url)
                            avatar = await self.bot.mxc_to_http(profile.avatar_url)
                        else:
                            self.log.debug("No avatar found.")
                    else:
                        self.log.warning("Failed to fetch profile for %s", message.sender)
                else:
                    self.log.debug("Already have an avatar")

This is what I get in my logs when this code executes:

2024-02-19 11:34:29 INFO httpx HTTP Request: GET http://localhost:3762/bridge/bind/eek4015:matrix.org "HTTP/1.1 404 Not Found"
2024-02-19 11:34:29 DEBUG philip.discord_bridge No bound account for eek4015:matrix.org
2024-02-19 11:34:29 DEBUG philip.discord_bridge No bound discord account for @eek4015:matrix.org
2024-02-19 11:34:29 DEBUG httpx load_ssl_context verify=True cert=None trust_env=True http2=False
2024-02-19 11:34:29 DEBUG httpx load_verify_locations cafile='/home/nex/Bots/philip/venv/lib/python3.11/site-packages/certifi/cacert.pem'
2024-02-19 11:34:29 DEBUG philip.discord_bridge Have a registered webhook URL. Using it.
2024-02-19 11:34:29 INFO niobot.client Closing http session and logging out.

This is strange, because there should be at least one more DEBUG statement from the avatar resolution code, yet there isn't?

At first, I thought somehow python was confusing httpx.AsyncClient closing after exiting a context manager. But after further debugging, the discord account binds should not be able to be printed if this were the case. I suspect that somehow, the try/finally is swallowing errors. But wait, try/finally should just propagate the error? More accurately, I believe the error is being swallowed elsewhere, with the try/finally being nothing more than a symptom. However, I am unsure where.

Expected behavior Any exceptions that are raised in callbacks (even if not owned by us) should at the very least log the exception, ideally be captured an not bubble up far enough to crash the process.

Tracebacks (if applicable) Contextually, unavailable. However, these are my logs after commenting out the avatar code blocks:

2024-02-19 11:37:28 DEBUG philip.discord_bridge Have a registered webhook URL. Using it.
2024-02-19 11:37:28 DEBUG philip.discord_bridge Preparing body
2024-02-19 11:37:28 DEBUG philip.discord_bridge Body: {'content': 'OMG', 'username': '@eek4015:matrix.org', 'allowed_mentions': {'parse': ['users'], 'replied_user': True}}
2024-02-19 11:37:28 DEBUG philip.discord_bridge Sending message to discord.

and the rest of the code executes successfully.

Additional context

nexy7574 commented 6 months ago

Upon further investigation, it does appear to be errors being raised in Nio event callbacks. After wrapping the entire callback in a try/except, I can catch the error and log it.

However, this does imply the library is doing something to capture and swallow the error.

nexy7574 commented 6 months ago

It does appear to be the try/finally - throwing in a try/except/finally properly re-raises the captured error. May be worth implementing.

nexy7574 commented 1 month ago

Cannot reproduce this anymore.