citp / news-disinformation-study

A research project on how web users consume, are exposed to, and share news online.

Compile a dataset of news social media accounts #29

Closed by jonathanmayer 3 years ago

jonathanmayer commented 4 years ago

We need this as input for the SocialMediaAccountExposure study module.

One possible direction: for each domain in our dataset that has no matching profile URL on a given social media platform, run a Google search with a query like "site:socialmediaplatform.com news.com" and parse the profile URL from each of the first N results. If there's a match, assume that it is the profile for that news website on that social media platform.
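A rough sketch of the query-building and URL-parsing steps, in case it helps. It assumes the search results are already available as a list of URLs (how they are fetched is left out), and the function names and the list of non-profile path segments are hypothetical and incomplete.

```python
from urllib.parse import urlparse

# Path segments that are not account names. Illustrative and incomplete;
# each platform would need its own list.
NON_PROFILE_SEGMENTS = {"share", "sharer", "intent", "hashtag", "search", "home", "i"}


def build_query(platform_domain, news_domain):
    """Build a query like 'site:socialmediaplatform.com news.com'."""
    return f"site:{platform_domain} {news_domain}"


def profile_from_url(url, platform_domain):
    """Return the first path segment of a platform URL, or None if unusable."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    # Accept the platform domain itself and its subdomains (e.g. www., mobile.).
    if host != platform_domain and not host.endswith("." + platform_domain):
        return None
    segments = [s for s in parsed.path.split("/") if s]
    if not segments or segments[0].lower() in NON_PROFILE_SEGMENTS:
        return None
    return segments[0]


def first_profile(result_urls, platform_domain, n=5):
    """Scan the first n result URLs and return the first candidate profile name."""
    for url in result_urls[:n]:
        handle = profile_from_url(url, platform_domain)
        if handle is not None:
            return handle
    return None
```

Taking the first path segment only works for platforms whose profile URLs look like `platform.com/<name>`, so any candidate found this way would still need a check before being recorded as the site's account.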

Possibly helpful: https://github.com/sherlock-project/sherlock

jonathanmayer commented 4 years ago

From today's discussion: we should also think about whether it's possible to extract social media usernames from news websites. For example, a news website might include links to social media accounts, or might include social media metadata (e.g., Twitter card tags or Facebook Open Graph markup).
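A minimal sketch of what that extraction might look like, using only the Python standard library. It assumes the page HTML has already been downloaded. It reads the `twitter:site` card tag and collects links that point at a small, illustrative set of platform hosts. The class name, function name, and host list are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# Hosts treated as social media platforms. Illustrative only.
PLATFORM_HOSTS = {"twitter.com", "facebook.com"}


class SocialAccountParser(HTMLParser):
    """Collect the twitter:site value and links to known platform hosts."""

    def __init__(self):
        super().__init__()
        self.twitter_site = None
        self.profile_links = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "meta":
            # Twitter card tags normally use name=; accept property= as well,
            # since some sites mark them up like Open Graph tags.
            key = (attr.get("name") or attr.get("property") or "").lower()
            content = attr.get("content")
            if key == "twitter:site" and content:
                self.twitter_site = content.strip().lstrip("@")
        elif tag == "a" and attr.get("href"):
            host = (urlparse(attr["href"]).hostname or "").lower()
            if host.startswith("www."):
                host = host[4:]
            if host in PLATFORM_HOSTS:
                self.profile_links.append(attr["href"])


def extract_social_accounts(html):
    """Return (twitter_site, profile_links) found in an HTML document."""
    parser = SocialAccountParser()
    parser.feed(html)
    return parser.twitter_site, parser.profile_links
```

Links collected this way can include share buttons and embedded posts as well as the site's own profiles, so they would need filtering before being added to the dataset.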

jonathanmayer commented 3 years ago

Closing this out. @benjaminhkaiser has been on top of maintaining our social media account datasets.