codalab / codabench

Codabench is a flexible, easy-to-use and reproducible benchmarking platform. Check our paper at Patterns Cell Press https://hubs.li/Q01fwRWB0
Apache License 2.0
62 stars 25 forks source link

Django error in production server #1322

Open Didayolo opened 7 months ago

Didayolo commented 7 months ago

In the production server, we regularly get the following error. It may slow down the performance of the platform.

codabench-django-1 | [2024-02-10 15:25:31 +0000] [43] [ERROR] Exception in ASGI application
codabench-django-1 | Traceback (most recent call last):
codabench-django-1 |   File "/usr/local/lib/python3.8/site-packages/uvicorn/protocols/websockets/websockets_impl.py", line 157, in run_asgi
codabench-django-1 |     result = await self.app(self.scope, self.asgi_receive, self.asgi_send)
codabench-django-1 |   File "/usr/local/lib/python3.8/site-packages/uvicorn/middleware/proxy_headers.py", line 45, in __call__
codabench-django-1 |     return await self.app(scope, receive, send)
codabench-django-1 |   File "/usr/local/lib/python3.8/site-packages/uvicorn/middleware/asgi2.py", line 7, in __call__
codabench-django-1 |     await instance(receive, send)
codabench-django-1 |   File "/usr/local/lib/python3.8/site-packages/channels/sessions.py", line 183, in __call__
codabench-django-1 |     return await self.inner(receive, self.send)
codabench-django-1 |   File "/usr/local/lib/python3.8/site-packages/channels/middleware.py", line 41, in coroutine_call
codabench-django-1 |     await inner_instance(receive, send)
codabench-django-1 |   File "/usr/local/lib/python3.8/site-packages/channels/consumer.py", line 58, in __call__
codabench-django-1 |     await await_many_dispatch(
codabench-django-1 |   File "/usr/local/lib/python3.8/site-packages/channels/utils.py", line 51, in await_many_dispatch
codabench-django-1 |     await dispatch(result)
codabench-django-1 |   File "/usr/local/lib/python3.8/site-packages/channels/consumer.py", line 73, in dispatch
codabench-django-1 |     await handler(message)
codabench-django-1 |   File "/usr/local/lib/python3.8/site-packages/channels/generic/websocket.py", line 240, in websocket_disconnect
codabench-django-1 |     await self.disconnect(message["code"])
codabench-django-1 |   File "/app/src/apps/competitions/consumers.py", line 65, in disconnect
codabench-django-1 |     await self.close()
codabench-django-1 |   File "/usr/local/lib/python3.8/site-packages/channels/generic/websocket.py", line 226, in close
codabench-django-1 |     await super().send({"type": "websocket.close"})
codabench-django-1 |   File "/usr/local/lib/python3.8/site-packages/channels/consumer.py", line 81, in send
codabench-django-1 |     await self.base_send(message)
codabench-django-1 |   File "/usr/local/lib/python3.8/site-packages/channels/sessions.py", line 236, in send
codabench-django-1 |     return await self.real_send(message)
codabench-django-1 |   File "/usr/local/lib/python3.8/site-packages/uvicorn/protocols/websockets/websockets_impl.py", line 234, in asgi_send
codabench-django-1 |     raise RuntimeError(msg % message_type)
codabench-django-1 | RuntimeError: Unexpected ASGI message 'websocket.close', after sending 'websocket.close'.
Didayolo commented 7 months ago

@bbearce

Regarding the loading of the public benchmarks list, it still seems quite slow. The "icon logos" did improve it, but I am thinking it may be related to something else (the way the competition are retrieved)?

Sorry, I am a bit merging several topics here, but it would be interesting to discuss generally the speed of loading of different pages of the platform.

ihsaan-ullah commented 7 months ago

@Didayolo codabench overall has become slower. You can compare the speed by opening codalab and codabench competition pages together and codalab loads faster

Didayolo commented 7 months ago

We absolutely need to solve the performance problem. Competition pages are too slow to load (loading everything at once).

New issue about this:

Didayolo commented 4 months ago

Still the case:

codabench-django-1         | Traceback (most recent call last):
codabench-django-1         |   File "/usr/local/lib/python3.8/site-packages/uvicorn/protocols/websockets/websockets_impl.py", line 157, in run_asgi
codabench-django-1         |     result = await self.app(self.scope, self.asgi_receive, self.asgi_send)
codabench-django-1         |   File "/usr/local/lib/python3.8/site-packages/uvicorn/middleware/proxy_headers.py", line 45, in __call__
codabench-django-1         |     return await self.app(scope, receive, send)
codabench-django-1         |   File "/usr/local/lib/python3.8/site-packages/uvicorn/middleware/asgi2.py", line 7, in __call__
codabench-django-1         |     await instance(receive, send)
codabench-django-1         |   File "/usr/local/lib/python3.8/site-packages/channels/sessions.py", line 183, in __call__
codabench-django-1         |     return await self.inner(receive, self.send)
codabench-django-1         |   File "/usr/local/lib/python3.8/site-packages/channels/middleware.py", line 41, in coroutine_call
codabench-django-1         |     await inner_instance(receive, send)
codabench-django-1         |   File "/usr/local/lib/python3.8/site-packages/channels/consumer.py", line 58, in __call__
codabench-django-1         |     await await_many_dispatch(
codabench-django-1         |   File "/usr/local/lib/python3.8/site-packages/channels/utils.py", line 51, in await_many_dispatch
codabench-django-1         |     await dispatch(result)
codabench-django-1         |   File "/usr/local/lib/python3.8/site-packages/channels/consumer.py", line 73, in dispatch
codabench-django-1         |     await handler(message)
codabench-django-1         |   File "/usr/local/lib/python3.8/site-packages/channels/generic/websocket.py", line 240, in websocket_disconnect
codabench-django-1         |     await self.disconnect(message["code"])
codabench-django-1         |   File "/app/src/apps/competitions/consumers.py", line 65, in disconnect
codabench-django-1         |     await self.close()
codabench-django-1         |   File "/usr/local/lib/python3.8/site-packages/channels/generic/websocket.py", line 226, in close
codabench-django-1         |     await super().send({"type": "websocket.close"})
codabench-django-1         |   File "/usr/local/lib/python3.8/site-packages/channels/consumer.py", line 81, in send
codabench-django-1         |     await self.base_send(message)
codabench-django-1         |   File "/usr/local/lib/python3.8/site-packages/channels/sessions.py", line 236, in send
codabench-django-1         |     return await self.real_send(message)
codabench-django-1         |   File "/usr/local/lib/python3.8/site-packages/uvicorn/protocols/websockets/websockets_impl.py", line 234, in asgi_send
codabench-django-1         |     raise RuntimeError(msg % message_type)
codabench-django-1         | RuntimeError: Unexpected ASGI message 'websocket.close', after sending 'websocket.close'
bbearce commented 1 month ago

Ran into this on my own and found this works, but not sure it is "best":

"codabench/src/apps/competitions/consumers.py"


class SubmissionOutputConsumer(AsyncWebsocketConsumer):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self._closed = False  # Flag to track if the connection is already closed

    async def connect(self):
        if not self.scope["user"].is_authenticated:
            return await self.close()

        await self.accept()
        await self.channel_layer.group_add(f"submission_listening_{self.scope['user'].pk}", self.channel_name)

    async def disconnect(self, close_code):
        if not self._closed:  # Check if the connection is already closed
            self._closed = True
            await self.channel_layer.group_discard(f"submission_listening_{self.scope['user'].pk}", self.channel_name)
            print(f"\n\n\nclose_code: {close_code}\n\n")
            await self.close()

    def group_send(self, text, submission_id, full_text=False):
        return self.channel_layer.group_send(f"submission_listening_{self.scope['user'].pk}", {
            'type': 'submission.message',
            'text': text,
            'submission_id': submission_id,
            'full_text': full_text,
        })

    async def receive(self, text_data=None, bytes_data=None):
        """We expect to receive a message at this endpoint containing the ID(s) of submissions to get
        details about; typically on page load, looking up the previous submission details"""
        data = json.loads(text_data)

        submission_ids = data.get("submission_ids", [])

        if submission_ids:
            # Filter out submissions not by this user
            submissions = Submission.objects.filter(id__in=submission_ids, owner=self.scope["user"])
            con = get_redis_connection("default")

            for sub in submissions:
                text = (con.get(f':1:submission-{sub.id}-log'))
                if text:
                    await self.group_send(text.decode('utf-8'), sub.id, full_text=True)

    async def submission_message(self, event):
        data = {
            "type": "catchup" if event.get('full_text') else "message",
            "submission_id": event['submission_id'],
            "data": event['text']
        }
        await self.send(json.dumps(data))
bbearce commented 1 month ago

ChatGPT on the above method:

You're right; using a flag like _closed can feel like a workaround rather than a clean solution. Ideally, you want to structure your code so that await self.close() is only called once per connection lifecycle, without relying on such flags. However, sometimes in asynchronous programming, particularly with complex state management in WebSockets, you might need a guard like this to prevent race conditions or multiple calls to close.

Why the Issue Occurs: The RuntimeError you're seeing happens because the ASGI protocol expects a certain sequence of events, and sending multiple websocket.close messages violates that sequence. Channels handles these events internally, but if your code logic inadvertently triggers multiple close calls, you end up with this error.

Is it the "Right" Way? While the _closed flag solution does work, it's more about ensuring robustness against multiple invocations of await self.close(). In certain cases, such as complex consumer logic or when multiple coroutines might trigger the disconnect sequence, this pattern is practical and sometimes necessary.

Best Practices: Keep disconnect Simple: Ensure that your disconnect method is straightforward and only handles cleanup tasks directly related to the disconnection. Avoid complex logic or multiple calls to await self.close() within disconnect.

Refactor Logic: If possible, refactor your code to ensure await self.close() is only triggered under well-defined conditions. For instance, consider restructuring the flow in your consumer to make it clear when and why the connection should be closed.

Error Handling: Use error handling (try...except) to capture unexpected issues and ensure that the disconnect sequence is handled gracefully.

ASGI Specifications: Familiarize yourself with the ASGI lifecycle to understand the expected order of events. This understanding can help you avoid scenarios where the ASGI application state is inconsistent, leading to errors like the one you're experiencing.

Conclusion: While using a _closed flag is not inherently wrong, it's a sign that your code may need additional checks or restructuring to ensure it's following the expected WebSocket protocol. If you can achieve the desired behavior without the flag, that’s preferable. However, in cases where race conditions or complex asynchronous behavior are involved, such guards can be practical.

bbearce commented 1 month ago

I think it is still happening so this may be a lost cause..