pallets / quart

An async Python micro framework for building web applications.
https://quart.palletsprojects.com
MIT License
2.97k stars 160 forks source link

Websockets not removed when using iphone/ipad #64

Closed kllr closed 5 years ago

kllr commented 5 years ago

Hi,

I'm using your great code to see what clients (websockets) are online. This works great for desktop pc's. But when a user uses an iphone/ipad, the websockets arent removed from the 'connected' list when they close the safari browser.

Any suggestions how i can solve this? Can i iterate over the connected set and check the connectionStatus of the websockets?

I looked into my logfile and see that when a desktop exits: websocket._get_current_object() returns : <quart.wrappers.request.Websocket object at 0x7f2514216898>

and when an iphone/ipad exits, it looks like the wrapper function ( async def wrapper(*args, **kwargs) ) isn't called at all...

The used code:

from functools import wraps

connected = set()

def collect_websocket(func):
    @wraps(func)
    async def wrapper(*args, **kwargs):
        global connected
        connected.add(websocket._get_current_object())
        try:
            return await func(*args, **kwargs)
        finally:
            connected.remove(websocket._get_current_object())
    return wrapper                                                                                                                                                                                                            

@app.websocket('/ws')                                                                                                                                                                                       
@collect_websocket
async def ws():

kind regards

pgjones commented 5 years ago

Interesting, do you know if it happens with Safari on a desktop? Or with different browsers on the iphone/ipad?

kllr commented 5 years ago

I checked: a android tab with chrome (just to be shure): works ok.

ipad with: safari: does not work Firefox: does not work Chrome: does not work frameless browser: does not work

PC: Safari (5.1.7): Couldnt get safari to open my site (https://www.makemesmile.nl/) Firefox: works ok chrome: works ok

kllr commented 5 years ago

Hi Phil,

Do you know if there are things I can try to solve this? I've got dozens of ipad's live right now and we don't know if there really still online or not.

I was thinking about:

Maybe you have a suggestion for me?

Thanks!

pgjones commented 5 years ago

I've done a lot of work on the WebSocket code in Hypercorn recently, you could try the Hypercorn master branch and see if that solves this problem. This could be an issue with how the close handshake is carried out, which the master branch is less sensitive too.

kllr commented 5 years ago

Hi Phil,

Sounds very good. I’m on holiday this week but i’ll let you know next week. Is it an official release version already? (Is simple upgrade enough?)

Kind regards, Steven

Verstuurd vanaf mijn iPhone

Op 7 jul. 2019 om 00:52 heeft Phil Jones notifications@github.com het volgende geschreven:

I've done a lot of work on the WebSocket code in Hypercorn recently, you could try the Hypercorn master branch and see if that solves this problem. This could be an issue with how the close handshake is carried out, which the master branch is less sensitive too.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub, or mute the thread.

pgjones commented 5 years ago

I hope to release tomorrow, if things go well.

kllr commented 5 years ago

Hi,

I tried it today. This is what happened:

The websocket connects well like before. But there is an error when i close the page on a iPad.

This is the part of the log file

12:55:53,988 urbanGUI INFO websocket: {'msg_type': 'client_connect', 'survey_id': 'XXXXXXXXX-497f-45b1-b58b-94e56d34e45e'}
12:57:25,798 asyncio ERROR Task exception was never retrieved
future: <Task finished coro=<ASGIHTTPConnection.handle_request() done, defined at /home/ubuntu/venv/makemesmileenv/lib/python3.7/site-packages/quart/asgi.py:58> exception=AssertionError()>
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/asyncio/tasks.py", line 219, in __step
    result = coro.send(None)
  File "/home/ubuntu/venv/makemesmileenv/lib/python3.7/site-packages/quart/asgi.py", line 61, in handle_request
    await asyncio.wait_for(self._send_response(send, response), timeout=response.timeout)
  File "/usr/local/lib/python3.7/asyncio/tasks.py", line 384, in wait_for
    return await fut
  File "/home/ubuntu/venv/makemesmileenv/lib/python3.7/site-packages/quart/asgi.py", line 77, in _send_response
    'more_body': True,
  File "/home/ubuntu/venv/makemesmileenv/lib/python3.7/site-packages/hypercorn/protocol/http_stream.py", line 129, in app_send
    status_code=int(self.response["status"]),
  File "/home/ubuntu/venv/makemesmileenv/lib/python3.7/site-packages/hypercorn/protocol/h2.py", line 104, in stream_send
    await self._flush()
  File "/home/ubuntu/venv/makemesmileenv/lib/python3.7/site-packages/hypercorn/protocol/h2.py", line 150, in _flush
    await self.send(RawData(data=data))
  File "/home/ubuntu/venv/makemesmileenv/lib/python3.7/site-packages/hypercorn/asyncio/server.py", line 103, in protocol_send
    await self.writer.drain()
  File "/usr/local/lib/python3.7/asyncio/streams.py", line 348, in drain
    await self._protocol._drain_helper()
  File "/usr/local/lib/python3.7/asyncio/streams.py", line 206, in _drain_helper
    assert waiter is None or waiter.cancelled()
AssertionError

This is the part of my main.py file with all the websocket things:

import quart.flask_patch
from flask_login import LoginManager, UserMixin, login_required, login_user, logout_user, current_user
from quart import Quart, websocket, session, redirect, request, render_template, Response, url_for, flash, send_file, safe_join
from quart.static import send_from_directory
from hypercorn.middleware import HTTPToHTTPSRedirectMiddleware
from functools import wraps
import asyncio
import traceback
import pandas as pd
from datetime import datetime
from helpers import SurveyTuple, OrderStatus, ProductType
from passlib.hash import pbkdf2_sha256
import hashlib

import datetime, io, os, json
import time
import constants
from payments import Payment
from vote import Vote, VoteDbOperations
from order import Order, OrderDbOperations
from order_item import OrderItem, OrderItemDbOperations
from app_user import User, UserDbOperations
from database import Database
from logger import Logger
from statistics import Statistics
from survey import Survey, SurveyDbOperations
from product import Product, ProductDbOperations
from PIL import Image
from password_reset import PasswordResetLink, PasswordResetLinkMethods
from email_confirmation import EmailConfirmationLink, EmailConfirmationMethods
import psycopg2
# from PIL import ImageFont
# from PIL import ImageDraw
import raven
import mollie.api.client
import mollie.api.error
from emailer import Emailer, Emailer2
import jsonpickle
import uuid

import pandas as pd

from vote import Vote, VoteDbOperations

app = Quart(__name__)
login_manager = LoginManager()

connected = set()  # list of all connected websocket clients (Terminals)

# config
app.config.update(
    DEBUG=constants.DEBUG_APPLICATION,
    SECRET_KEY='keyXXXXXXX'
)
login_manager.init_app(app)
login_manager.login_view = "login"

# redirect all http to https
redirected_app = HTTPToHTTPSRedirectMiddleware(app, host="www.makemesmile.nl")

#
#           Needed for the websockets
#

async def receiving():
    while True:
        data = await websocket.receive()
        # logger.error(str(data))

def collect_websocket(func):
    @wraps(func)
    async def wrapper(*args, **kwargs):
        """This decorator when used to wrap a websocket handler, will add and remove websockets from the connected set.
        The _get_current_object method of the websocket is required to get the websocket in the current context,
        and the try-finally is required to ensure the websocket is removed regardless of any errors that are raised.
        Note the app.websocket must wrap (be before) the collect_websocket usage."""

        global connected
        data = await websocket.receive()
        data_json = json.loads(data)

        logger = Logger().config()
        logger.error('websocket: {}'.format(str(data_json)))

        # await check_for_dead_websockets()

        if data_json['msg_type'] == 'client_connect':
            if "survey_id" in data_json:
                new_client = Client(data_json['survey_id'], False, websocket._get_current_object())
                connected.add(new_client)
                # ConnectionState.CLOSED
        try:
            return await func(*args, **kwargs)
        except Exception as e:
            # for connect in connected.copy():
            #     if connect.ws == websocket._get_current_object():
            #         connected.remove(connect)
            #         break
            raise
            # logger.error('EXCEPT EXCEPT---- Going to remove Websocket!---- : {}'.format(str(websocket._get_current_object())))
        finally:
            logger.error('--- Going to remove Websocket!---- : {}'.format(str(websocket._get_current_object())))
            for connect in connected.copy():
                if connect.ws == websocket._get_current_object():
                    connected.remove(connect)
                    break

    return wrapper

@app.websocket('/ws')
@collect_websocket
async def ws():
    consumer = asyncio.create_task(receiving())
    while True:
        await asyncio.gather(consumer)

class Client:
    def __init__(self, survey_id, dashboard, my_websocket):
        self.survey_id = survey_id
        self.am_i_dashboard = dashboard  
        self.ws = my_websocket
        self.time = time.time()
        # self.alive = True
        # self.future = GraceFuture()

@login_manager.user_loader
def user_loader(user_name):
    # 1. Fetch against the database a user by `id`
    # 2. Create a new object of `User` class and return it.
    my_app_user = UserDbOperations()
    found_id = my_app_user.get_identifier(user_name)
    user = None
    if found_id is not None:
        user = User()
        user.id = found_id
        user.user_name = user_name
    return user

@login_manager.needs_refresh_handler
def refresh():
    # do stuff
    return redirect(url_for('login'))

# somewhere to logout
@app.route("/logout")
@login_required
async def logout():
    logout_user()
    return await render_template('landing-page.html')

# handle login failed
@app.errorhandler(404)
async def page_not_found(e):
    # await flash(str(e))
    return await render_template('404.html')

@login_manager.unauthorized_handler
async def unauthorized_handler():
    return redirect(url_for('login'))

@app.route('/static/images/backgrounds/<path:filename>', methods=['GET','POST'])
async def send_image(filename):
    if request.method == 'GET':
        return await send_from_directory('./static/images/backgrounds/', filename)

    request_file = request.files['data_file']
    if not request_file:
        return None

@app.route('/monitor_all', methods=['GET', 'POST'])
# @login_required
async def monitor_all():
    all_clients = []
    if request.method == 'GET':
        survey_db_ops = SurveyDbOperations()
        for client in connected.copy():
            found_survey = await survey_db_ops.get(client.survey_id,None, False)
            all_clients.append(found_survey)
        return await render_template('monitoring.html', surveys=all_clients)

if __name__ == "__main__":
    if constants.LOCAL:
        app.run(port=443)

Below are all the package versions:

Package           Version
----------------- ---------
aiofiles          0.4.0
asyncio           3.4.3
blinker           1.4
certifi           2019.6.16
chardet           3.0.4
Click             7.0
Flask             1.0.2
Flask-Login       0.4.1
h11               0.8.1
h2                3.1.0
hpack             3.0.0
Hypercorn         0.7.0
hyperframe        5.2.0
idna              2.8
itsdangerous      1.1.0
Jinja2            2.10
jsonpickle        1.2
MarkupSafe        1.1.0
mollie-api-python 2.1.3
multidict         4.5.2
numpy             1.15.4
pandas            0.23.4
passlib           1.7.1
Pillow            6.0.0
pip               19.1.1
psycopg2          2.8.3
python-dateutil   2.7.5
pytoml            0.1.20
pytz              2018.7
Quart             0.9.1
raven             6.10.0
requests          2.22.0
setuptools        39.0.1
six               1.12.0
sortedcontainers  2.1.0
toml              0.10.0
typing-extensions 3.6.6
urllib3           1.25.3
Werkzeug          0.14.1
wsproto           0.14.1
XlsxWriter        1.1.8

If you need more info to see what's going wrong, i'll send you the info.

kind regards,

kllr commented 5 years ago

There maybe another issue as well: (sorry that i put it in the other issue, i can delete it if you want)

On Hypercorn 0.7.0 the site doesn't load. When i stop the loading, i see some resources aren't loaded: jquery, the material css and some images.

Degrading back to 0.6.0 the site is loading like it should.

I added two screenshots

Naamloos-2 Naamloos-1

pgjones commented 5 years ago

The latter problem is proving hard to track down, it looks to be HTTP/2 related from your screenshot. Are there any other clues?

kllr commented 5 years ago

The latter problem is proving hard to track down, it looks to be HTTP/2 related from your screenshot. Are there any other clues?

I gave it multiple runs, on pc, mac and ipad with Hypercorn 0.7.0 and 0.6.0. All the 0.7.0 runs give this behaviour. With 0.6.0 the problem occured on the ipad once.

Would it help you when i send you the html/js files etc?

pgjones commented 5 years ago

I think the latest master fixes the WebSocket problem - if you could confirm that would be great :).

The HTTP/2 problem will take some more investigation. Thanks for the offer for the html/js but I don't think it would help.

pgjones commented 5 years ago

@kllr I think this is an aside to this issue, but makemesmile.nl is accessible over http. Are you serving the app rather than the redirected_app?

kllr commented 5 years ago

Hi,

You're right. I was serving the app instead of the redirected app. I changed it.

page load I also changed the version of hypercorn to the most recent master branch version. It is running this right now so you can see how it is responding at the moment.

When you open a new private browser window and load makemesmile.nl the site does not load. Interrupting the process by pressing cancel button (X in top left of firefox), and after that reload the page in the same private window, the page loads correctly.

websockets Websockets still work but the disconnecting of an iPad isn't recognized.

pgjones commented 5 years ago

Hmm, I've released 0.7.1 now with some further fixes. Thanks for leaving the branch up, I also see the failure to load - I can't figure out why though.

I'm still unsure about the HTTP/2 (page load) issue and the iPad issue - I might be able to borrow an iPad.

kllr commented 5 years ago

Hi, i'll give the 0.7.1 a go. If it works i'll leave it up, if not than i'll revert to 0.6.0 (tomorrow the users are using the system)

If you're not able to borrow an iPad, then i can mail you one that you can borrow, no problem. I have a old one that i'm not using.

pgjones commented 5 years ago

Hello, could you try 0.7.1 live again tomorrow (Sun)? I've got a new HTTP/2 diagnostic tool to try.

kllr commented 5 years ago

Hello, could you try 0.7.1 live again tomorrow (Sun)? I've got a new HTTP/2 diagnostic tool to try.

hi, it is live. I hope you can find something.

pgjones commented 5 years ago

I thinking now both of these issues are related to the usage of the flask_patch system (it doesn't seem to propagate cancellations). Could you confirm you are using it in production? Also is it possible to run your site without it?

kllr commented 5 years ago

hi, Yes i'm using the quart.flask_patch. This was necessary because i'm using flask_login. Is there an alternative for flask_login that i could try?

I'll try to make a bare python main file to run it without flask patch and let you know if it runs the html and the websocket part.

kllr commented 5 years ago

i removed flask_patch and flask_login temporarily from the main python file.

The landing page still does not run, the html loading remains :(

As for the websockets, the behaviour is also still the same. Closing the browser window in win10 is recognized, from iphone it isn't

kllr commented 5 years ago

For testing purposes i changed the main page by another html/js template. That loads without any problems.

After that, I tried adding the js files of the production template to the new template and the file that hangs it all is (adding this also hangs the new template): <script src="../static/js/bootstrap-material-design.min.js"></script>

I don't know why it hangs hypercorn but it does.

This file below needs the bootstrap-material-design js, so i can't exclude it from hanging hypercorn. <script src="../static/js/material-kit.min40a0.js?v=2.0.2"></script>

pgjones commented 5 years ago

I've got hold of an iPad and I think I can explain what is happening. In Firefox when the tab is closed the browser sends the WebSocket close handshake, which triggers the closure in Quart. In Safari on the iPad this handshake is not sent, instead the socket is just closed. When a socket is closed Hypercorn only realises when it tries to write to the socket, once it does though the standard closure is triggered in Quart.

I think in your case the Safari websockets will be removed from the connected list the next time you try and write to them. I would imagine then that things are working well in production i.e. no leak of resources only it is harder to debug.

kllr commented 5 years ago

hi, great!

So the solution for me would be sending a 'ping' message for example to all the sockets once every x-hours? This so an error would occur and the socket is removed from the connected list.

Thanks for the great help and support! I'll try to implement it the next few days.

Thanks, i'll close this error report.

pgjones commented 5 years ago

That would work, it wouldn't error on sending though, rather the task would be Cancelled (which is an error of sorts).

Does HTTP/2 work as well now?

kllr commented 5 years ago

HTTP/2 doesn't work. The one file is still hanging it all. For now i'll be stuck at version 0.6.0.

pgjones commented 5 years ago

Regarding the HTTP/2 issue, could you give the priority branch a try and see if it persists?

kllr commented 5 years ago

i tried to install it on the aws server but it says that module priority is missing. I can't find it in the package. Am i doing something wrong? (I copied the priority files over the hypercorn 0.7.2 files using scp)

On my local machine 0.7.2 works btw. I didn't try that before.

pgjones commented 5 years ago

You would also need to install priority using pip (or similar). I'll merge into the master branch today, if that helps. Thanks for testing.

On my local machine 0.7.2 works

This is interesting, also helps explain why I've not managed to reproduce this. I'm not sure where to go next, I'm thinking I should add lots of debug logging to Hypercorn and then see if we can catch the failure in the logs.

kllr commented 5 years ago

Hi,

Very good news! It works 👍! Local and on AWS EC2. The priority version is online on makemesmile.nl and working.

Thanks for all the help and great support :-D

pgjones commented 5 years ago

Woo hoo 🎉 . I've a little more work to do then I'll release Hypercorn 0.8. Even better news is that your users should experience quicker page loads with the priority version.

kllr commented 5 years ago

Indeed woo hoo :). Faster page loads sounds great. I upgraded my AWS to a faster server lately because of the page loads. I'll check the 0.8 when it's there and maybe even we can downgrade the server a bit.

pgjones commented 5 years ago

Are you using the uvloop worker (pip install uvloop or pip install hypercorn[uvloop] && hypercorn ... -k uvloop)? It is usually quicker and more CPU efficient.