bellingcat / auto-archiver

Automatically archive links to videos, images, and social media content from Google Sheets (and more).
https://pypi.org/project/auto-archiver/
MIT License
578 stars 60 forks

x.com URLs should work like twitter.com #104

Closed djhmateer closed 1 year ago

djhmateer commented 1 year ago

Placeholder issue; I will open a PR. I've tested the following code change through the API, and it allows x.com URLs:

# twitter_archiver.py

# the dot is escaped so that it only matches a literal "." in "x.com"
link_pattern2 = re.compile(r"x\.com\/(?:\#!\/)?(\w+)\/status(?:es)?\/(\d+)")
...

def get_username_tweet_id(self, url):
    # try the twitter.com pattern first
    matches = self.link_pattern.findall(url)
    if not matches:
        # maybe it is an x.com url?
        matches = self.link_pattern2.findall(url)

        # neither pattern matched: this URL cannot be handled
        if not matches:
            return False, False

    username, tweet_id = matches[0]  # only one URL supported
    logger.debug(f"Found {username=} and {tweet_id=} in {url=}")

    return username, tweet_id

GalenReich commented 1 year ago

Hi Dave, thanks for opening this issue. I think that this should have been solved in ddb9dc8 (corresponds to release v0.6.12) which tweaked the link_pattern regex. Would you be able to check?
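
For illustration, a single pattern that accepts both hosts could look roughly like the sketch below. This is not the regex from ddb9dc8; the name `LINK_PATTERN`, the lookbehind guard, and the standalone function are assumptions made for the example.

```python
import re

# Hypothetical combined pattern (NOT the actual regex from ddb9dc8).
# It accepts twitter.com or x.com as the host, escapes the dot, and uses a
# lookbehind so that hosts merely ending in "x.com" (e.g. netflix.com) are rejected.
LINK_PATTERN = re.compile(
    r"(?<![\w-])(?:twitter|x)\.com/(?:#!/)?(\w+)/status(?:es)?/(\d+)"
)


def get_username_tweet_id(url):
    """Return (username, tweet_id), or (False, False) if the URL does not match."""
    match = LINK_PATTERN.search(url)
    if match is None:
        return False, False
    return match.group(1), match.group(2)


print(get_username_tweet_id("https://x.com/bellingcat/status/1234567890"))
# ('bellingcat', '1234567890')
print(get_username_tweet_id("https://netflix.com/bellingcat/status/1234567890"))
# (False, False)
```

With one pattern there is no need for a second fallback lookup, and the escaped dot plus the lookbehind keep unrelated domains from matching.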

msramalho commented 1 year ago

Indeed this has been fixed and released, closing the issue.