Closed zebra-ok closed 6 years ago
I believe tweets have an `indices` property somewhere inside the media objects that you can use to strip these out. I actually wrote this functionality in Python for another project:
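    def replace_str_indeces(text, indeces):
        # entity indices are [start, end): end is the first char after the entity
        return '{}{}'.format(text[:indeces[0]], text[indeces[1]:])

    def strip_txt_urls(tweet):
        """
        remove urls + media links from the main txt body
        eg: Lol!!!! https://twimg.com/thing_23131.jpg => Lol!!!!
        """
        media = tweet.get('entities', {}).get('media', [])
        urls = tweet.get('entities', {}).get('urls', [])
        # cut from the end of the text backwards so earlier indices stay valid
        for media_str in sorted(media + urls, key=lambda e: e['indices'][0], reverse=True):
            tweet['full_text'] = replace_str_indeces(tweet['full_text'], media_str['indices'])
        # drop whitespace left behind at the edges
        tweet['full_text'] = tweet['full_text'].strip()
        return tweet['full_text']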
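To see how the `indices` pairs line up with the text, here is a small self-contained sketch on a made-up tweet dict. The text, the URLs and the helper name `strip_entity_spans` are invented for illustration, not real API output. It assumes the end index is exclusive, and it removes spans back to front so the earlier offsets don't shift.

```python
def strip_entity_spans(text, spans):
    """Remove every [start, end) span from text, last span first."""
    for start, end in sorted(spans, reverse=True):
        text = text[:start] + text[end:]
    # Collapse the whitespace left behind by the removed spans.
    return ' '.join(text.split())


# Made-up tweet, shaped like the dicts used above (hypothetical data).
sample_tweet = {
    'full_text': 'Lol!!!! https://t.co/aaa see also https://t.co/bbb',
    'entities': {
        'urls': [{'indices': [8, 24]}],
        'media': [{'indices': [34, 50]}],
    },
}

spans = [
    tuple(entity['indices'])
    for key in ('media', 'urls')
    for entity in sample_tweet['entities'].get(key, [])
]
cleaned = strip_entity_spans(sample_tweet['full_text'], spans)
print(cleaned)  # Lol!!!! see also
```

Removing front to back instead would cut the second span at stale offsets, since the text has already gotten shorter by then.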
This should alleviate the media picture links: https://github.com/mannynotfound/react-tweet/commit/a6141ddc6ff130b3804a3f0b1e5c8f7ac2c9e353
I guess media urls and quote tweet urls should be hidden in tweet text...somehow
twitter-text includes those in the AutoLinkWithJSON method....
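A rough sketch of the "hide media urls and quote tweet urls" idea, in Python for illustration only. It hides media links plus the one link that points at the quoted tweet, and leaves ordinary links in place. Assumptions: the tweet dict has `full_text` and `entities` with end-exclusive `indices`, and quote tweets carry a `quoted_status_permalink` dict whose `url` also shows up in `entities['urls']`. The function names `hidden_url_spans` and `visible_text` are made up here.

```python
def hidden_url_spans(tweet):
    """Collect [start, end) spans for media links and the quoted-tweet link."""
    entities = tweet.get('entities', {})
    spans = [tuple(m['indices']) for m in entities.get('media', [])]

    # Assumption: 'quoted_status_permalink' holds the t.co link of the
    # quoted tweet under 'url', matching an entry in entities['urls'].
    permalink = tweet.get('quoted_status_permalink') or {}
    quote_url = permalink.get('url')
    for u in entities.get('urls', []):
        if quote_url and u.get('url') == quote_url:
            spans.append(tuple(u['indices']))
    return spans


def visible_text(tweet):
    """Return the tweet text with the hidden spans cut out (tweet untouched)."""
    text = tweet.get('full_text', '')
    # Cut back to front so the remaining indices stay valid.
    for start, end in sorted(hidden_url_spans(tweet), reverse=True):
        text = text[:start] + text[end:]
    return text.rstrip()
```

If the quoted-tweet link turns out not to be listed in `entities['urls']` for a given payload, the function simply leaves the text alone rather than guessing.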