pusher / pusher-http-python

Pusher Channels HTTP API library for Python
https://pusher.com/docs/server_api_guide
MIT License
375 stars, 112 forks

Too much data before UTF-8 encoding #204

Closed igor-wl closed 1 year ago

igor-wl commented 1 year ago

I'm trying to push an 8 KB message to Pusher. It works for ASCII characters, but it breaks if the dictionary contains even a single "🦄" character, because sys.getsizeof then reports the size of the string object as 34 KB.

Since the final dictionary is encoded to UTF-8 before it is sent to Pusher, I think the length check should also be performed on the UTF-8 encoded string and not on the Python object.

The issue is with this code:

if sys.getsizeof(data) > 30720:
    raise ValueError("Too much data")
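
For illustration, the check proposed above could look roughly like the sketch below. It is not the library's code: `validate_payload_size` and `MAX_PAYLOAD_BYTES` are hypothetical names, the 30720 limit is simply copied from the quoted check, and the sketch assumes the payload is serialized with `json.dumps` before being measured.

```python
import json

# Limit copied from the check quoted in this issue; the name is hypothetical.
MAX_PAYLOAD_BYTES = 30720


def validate_payload_size(data, limit=MAX_PAYLOAD_BYTES):
    """Hypothetical helper: measure the UTF-8 byte length, not the object size."""
    # Assumption: non-string payloads are serialized to JSON first.
    if not isinstance(data, str):
        data = json.dumps(data, ensure_ascii=False)
    encoded = data.encode("utf-8")
    if len(encoded) > limit:
        raise ValueError("Too much data")
    return encoded


# An ~8 KB payload with a single emoji stays well under the limit.
encoded = validate_payload_size({"msg": "a" * 8000 + "🦄"})
print(len(encoded), "bytes on the wire")
```

Measured this way, the result depends only on the bytes that would actually be sent, not on how the interpreter stores the string in memory.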

Here is a small demo of how many more bytes the Python object consumes than the UTF-8 string. (image)
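
The comparison can be reproduced with a few lines such as the following. This is a reconstruction and not the original screenshot: the string lengths and the `sizes` helper are chosen for illustration. The exact `sys.getsizeof` values depend on the interpreter; on CPython, a string that contains a character outside the Basic Multilingual Plane is generally stored with four bytes per character, which is why one emoji inflates the whole object.

```python
import sys

# Two strings of equal length; only the last character differs.
ascii_text = "a" * 8192
emoji_text = "a" * 8191 + "🦄"


def sizes(text):
    """Return (in-memory object size, UTF-8 encoded size) in bytes."""
    return sys.getsizeof(text), len(text.encode("utf-8"))


for label, text in (("ascii only", ascii_text), ("with emoji", emoji_text)):
    object_size, utf8_size = sizes(text)
    print(f"{label}: getsizeof={object_size} bytes, utf-8={utf8_size} bytes")
```

The UTF-8 size grows by only three bytes when the emoji is added, while the in-memory size reported by `sys.getsizeof` grows by a large factor.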

benjamin-tang-pusher commented 1 year ago

Hi, I was able to send that character from my Python backend.

Screenshot 2023-07-10 at 18 20 23

Are you using the latest version of our library? We made a change that sets ensure_ascii=False at https://github.com/pusher/pusher-http-python/blob/239d67b7a047a18ee181922c0a1461ceaf7c565f/pusher/util.py#L113

UTF-8 characters passed to our library now remain as-is. (Previously we accidentally escaped Unicode characters, so the size ballooned, but this shouldn't be the case anymore.)
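
To illustrate the effect of that flag, here is a minimal sketch using only the standard library's `json.dumps`. The payload is made up for the example, and this is not the library's own serialization code.

```python
import json

payload = {"emoji": "🦄"}  # illustrative payload

# With ensure_ascii=True (the json module default), non-ASCII characters
# are written as \uXXXX escapes, so one emoji becomes a surrogate pair.
escaped = json.dumps(payload, ensure_ascii=True)

# With ensure_ascii=False, the character is kept as-is.
raw = json.dumps(payload, ensure_ascii=False)

escaped_bytes = len(escaped.encode("utf-8"))
raw_bytes = len(raw.encode("utf-8"))

print(escaped, escaped_bytes)  # {"emoji": "\ud83e\udd84"} 25
print(raw, raw_bytes)          # {"emoji": "🦄"} 17
```

The emoji costs 12 bytes when escaped and 4 bytes when left as UTF-8, so escaping inflates payloads that contain many non-ASCII characters.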

benjamin-tang-pusher commented 1 year ago

Closing due to no response.