jonathanmayer closed this 3 years ago
From today's discussion: we should also think about whether it's possible to extract social media usernames from news websites. For example, a news website might include links to social media accounts, or might include social media metadata (e.g., Twitter card tags or Facebook Open Graph markup).
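To make the idea concrete, here is a minimal sketch of pulling candidate accounts out of a page's HTML, using only the Python standard library. It is an illustration, not our pipeline: the names `SocialAccountParser`, `extract_social_accounts`, `PLATFORM_HOSTS`, and `NON_PROFILE_PATHS` are hypothetical, the platform list is deliberately tiny, and it assumes a profile URL has exactly one path segment (which will not hold for every platform). It reads the `twitter:site` and `twitter:creator` card tags, the `article:publisher` property that many sites set to a Facebook page URL, and plain `<a href>` links.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# Hypothetical, incomplete mapping of hostnames to platform labels.
PLATFORM_HOSTS = {"twitter.com": "twitter", "facebook.com": "facebook"}
# Assumed list of single-segment paths that are not profiles (share widgets, etc.).
NON_PROFILE_PATHS = {"share", "intent", "sharer", "sharer.php", "home"}


class SocialAccountParser(HTMLParser):
    """Collects (platform, username) pairs from meta tags and links."""

    def __init__(self):
        super().__init__()
        self.accounts = set()

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta":
            name = attrs.get("name") or attrs.get("property") or ""
            content = (attrs.get("content") or "").strip()
            # Twitter card tags carry the handle directly, e.g. "@handle".
            if name in ("twitter:site", "twitter:creator") and content.startswith("@"):
                self.accounts.add(("twitter", content[1:].lower()))
            # article:publisher is commonly set to a Facebook page URL.
            elif name == "article:publisher":
                self._add_from_url(content)
        elif tag == "a":
            self._add_from_url(attrs.get("href") or "")

    def _add_from_url(self, url):
        parsed = urlparse(url)
        host = (parsed.hostname or "").lower()
        if host.startswith("www."):
            host = host[4:]
        platform = PLATFORM_HOSTS.get(host)
        segments = [s for s in parsed.path.split("/") if s]
        # Assumption: a profile URL looks like https://platform.com/<username>.
        if platform and len(segments) == 1 and segments[0].lower() not in NON_PROFILE_PATHS:
            self.accounts.add((platform, segments[0].lower()))


def extract_social_accounts(html):
    """Return the set of (platform, username) pairs found in an HTML string."""
    parser = SocialAccountParser()
    parser.feed(html)
    return parser.accounts
```

Usernames are lowercased so the same account found via a meta tag and via a link collapses to one entry. Links in a page body can point at accounts that do not belong to the publisher (embedded tweets, author profiles), so the output is a candidate list rather than a confirmed mapping.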
Closing this out. @benjaminhkaiser has been on top of maintaining our social media account datasets.
We need this as input for the SocialMediaAccountExposure study module.
A possible direction: for each domain in our dataset where we don't have a matching URL on a social media platform, run a Google search with a query like "site:socialmediaplatform.com news.com", and parse the profile URL from each of the first N results. If there's a match, assume that's the profile for that news website on that social media platform.
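A rough sketch of the query construction and result parsing for this heuristic, in Python. How the search itself gets run is left out on purpose: the sketch assumes the caller already has an ordered list of result URLs from whatever search client is used. The function names `build_query` and `first_profile`, the default `n=5`, and the rule that a profile URL has exactly one path segment are all assumptions for illustration.

```python
from urllib.parse import urlparse


def build_query(platform_host, news_domain):
    """Build a query like 'site:socialmediaplatform.com news.com'."""
    return f"site:{platform_host} {news_domain}"


def first_profile(result_urls, platform_host, n=5):
    """Return the username from the first profile-like URL in the top n results.

    result_urls is an ordered list of URLs obtained elsewhere (the search
    step is not implemented here). Returns None when nothing matches.
    """
    for url in result_urls[:n]:
        parsed = urlparse(url)
        host = (parsed.hostname or "").lower()
        if host.startswith("www."):
            host = host[4:]
        segments = [s for s in parsed.path.split("/") if s]
        # Assumption: profile pages look like https://platform.com/<username>;
        # deeper paths (individual posts, photos) are skipped.
        if host == platform_host and len(segments) == 1:
            return segments[0]
    return None
```

For example, `build_query("twitter.com", "news.com")` gives the query string from the comment above, and `first_profile` would then be applied to the URLs that search returns. Taking the first profile-like hit can pick up a fan or parody account, so results from this path probably deserve a manual spot check.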
Possibly helpful: https://github.com/sherlock-project/sherlock