bear / python-twitter

A Python wrapper around the Twitter API.
Apache License 2.0

calc_expected_status_length incorrect when URL is terminated by disallowed character #672

Open colegleason opened 4 years ago

colegleason commented 4 years ago

I'm trying to use calc_expected_status_length to truncate long text, but the count seems to be off by one character and I can't figure out why. Twitter says the text below is one character over the limit, while calc_expected_status_length returns 280.

>>> from twitter.twitter_utils import calc_expected_status_length as calc
>>> tweet = """I embedded a description: "From text recognition: $1000/month from your Github Repo! Inbox x Sam gitads.io> 5:27 PM (46 minutes ago) to me - Hey there, I'm getting in touch ..."
Full text at accessible.lol/tweet/1275462624109625345
What are image descriptions? https://help.twitter.com/en/using-twitter/picture-descriptions"""
>>> calc(tweet)
280

Any idea what part might throw off the count?

EDIT: As noted below, this happens when a URL is immediately followed by a character that Twitter doesn't allow in URLs. calc_expected_status_length assumes that URLs can only be terminated by whitespace, but unsafe characters such as ">" should also terminate them.

colegleason commented 4 years ago

The issue is the ">" after the URL "gitads.io". It seems that Twitter counts the ">" as a separate character, while this library treats it as part of the URL and folds it into the URL's count. Maybe the regex could be modified to catch this?
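
To illustrate the off-by-one, here is a rough sketch of the two counting strategies. It is not the library's actual code: URL_RE, TRAILING_DISALLOWED, naive_length, and trimmed_length are hypothetical names, the URL test is deliberately crude, the set of disallowed trailing characters is a guess, and 23 is assumed as the shortened-URL length.

```python
import re

SHORT_URL_LENGTH = 23  # assumed length Twitter charges for any URL

# Hypothetical, deliberately crude URL test (prefix match only).
URL_RE = re.compile(r'(https?://)?[A-Za-z0-9-]+(\.[A-Za-z]{2,})+')

# Assumed set of characters that cannot end a URL; illustrative only.
TRAILING_DISALLOWED = '<>"\'),.;:!?'


def naive_length(text):
    """Treat every whitespace-delimited token that looks like a URL as a URL."""
    total = len(re.findall(r'\s', text))  # each whitespace character counts as 1
    for token in re.split(r'\s', text):
        if URL_RE.match(token):
            total += SHORT_URL_LENGTH  # a trailing ">" is swallowed here
        else:
            total += len(token)
    return total


def trimmed_length(text):
    """Strip disallowed trailing characters before the URL test and count them separately."""
    total = len(re.findall(r'\s', text))
    for token in re.split(r'\s', text):
        core = token.rstrip(TRAILING_DISALLOWED)
        trailing = len(token) - len(core)
        if core and URL_RE.match(core):
            total += SHORT_URL_LENGTH + trailing
        else:
            total += len(token)
    return total


sample = "Sam gitads.io> hello"
print(naive_length(sample))    # 33
print(trimmed_length(sample))  # 34
```

The one-character difference on the sample matches the discrepancy reported above: the naive count absorbs the ">" into the URL, while the trimmed count charges for it separately.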