obdurodon / dh_course

Digital Humanities course site
GNU General Public License v3.0

McDowell Research Project #389

Closed nmcdowell00 closed 4 years ago

nmcdowell00 commented 4 years ago

Over the years, Twitter has become increasingly important for the spread of news and information. The platform, which was once home solely to memes and banter, has become a common place for some of the world’s most powerful figures to voice their opinions. In the past few years, one of the most prolific tweeters has been Donald Trump. Trump’s Twitter account has been a source of great controversy since his election in 2016, and his presence on Twitter over the past 4 years is probably one of the more notable aspects of his presidency. For my research project, I would like to go back through Trump’s tweets over the past 4 years and create a system of indicators and markup that can be used to illustrate a change in the tone or content of his tweets. There are simply too many tweets from his presidency to go through all of them, so I would like to pick a few tweets from specific days in each month over the past few years.
An example of markup used to analyze the tweet: `

Mini Mike Bloomberg is playing poker with his foolhardy and unsuspecting Democrat rivals. He says that if he loses (he really means when!) in the primaries, he will spend money helping whoever the Democrat nominee is. By doing this, he figures, they won’t hit him as hard....

`

The aim of this research project would be to highlight the insulting and immature ways in which a man in power acts on a social media platform. My markup will hopefully be able to show a meaningful change in the tweets over time, revealing trends that correspond to events taking place when the tweets were posted.

djbpitt commented 4 years ago

@nmcdowell00 Twitter supports an API (application programming interface) that can be used to retrieve tweets from their database. The terms change frequently, so I don’t know offhand whether you can get a free account that would let you single out Trump’s tweets, but you can start exploring the options and resources at https://developer.twitter.com/en/docs. The point of using an API is that you don't have to copy and paste the text manually from an interface designed for human readers; the API returns data in a form designed for machine processing. If you can check out what sorts of access Twitter provides, we can advise you about the retrieval process.

nmcdowell00 commented 4 years ago

I have actually found a site that has archived Trump's tweets already. The site is http://www.trumptwitterarchive.com/archive.

djbpitt commented 4 years ago

@nmcdowell00 Cool! Citizen science!

The export formats seem to be CSV or JSON, and we’ll be happy to work with you to convert those to XML if you're not already familiar with the process.
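
As a rough illustration of what a JSON-to-XML conversion could look like, here is a minimal Python sketch using only the standard library. The sample data and the field names (`id_str`, `created_at`, `text`) are assumptions for the sake of the example and may not match the archive's actual export; the element and attribute names (`tweets`, `tweet`, `id`, `date`) are likewise placeholders you would replace with your own schema.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical sample standing in for a JSON export; the field names
# ("id_str", "created_at", "text") are assumptions, not the archive's schema.
SAMPLE_JSON = """
[
  {"id_str": "1", "created_at": "2020-02-13 14:05:00", "text": "First example tweet"},
  {"id_str": "2", "created_at": "2020-02-14 09:30:00", "text": "Second example & more"}
]
"""


def tweets_to_xml(json_text):
    """Convert a JSON array of tweet objects into an XML string."""
    root = ET.Element("tweets")
    for record in json.loads(json_text):
        # Metadata goes into attributes; the tweet text becomes element content.
        tweet = ET.SubElement(
            root, "tweet", id=record["id_str"], date=record["created_at"]
        )
        # ElementTree escapes characters such as "&" when serializing.
        tweet.text = record["text"]
    return ET.tostring(root, encoding="unicode")


xml_output = tweets_to_xml(SAMPLE_JSON)
print(xml_output)
```

The result is only a flat wrapper around each tweet; any analytical markup inside the tweet text would still have to be added afterward.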

sjw82 commented 4 years ago

@nmcdowell00 This is such an interesting proposal! I just have a few things that come to mind when I read it.

  1. How exactly are you going to narrow your corpus? You say it will be limited to specific days in each month, but how will those days be determined? If you just say the third Thursday of every month, what will it mean for your project to leave out major events? For example, what would it mean for your data if your corpus didn't include the day of the impeachment? I don't have an answer for that, nor do I have the perfect idea of how you should select your corpus, but you might consider using all of the tweets from the days when he tweeted most or least frequently; all of the tweets from a period (a week, a month, three months, something) at the very beginning, exact middle, and current point in the presidency to see change over time; only tweets that @ other people; only tweets that deal with specific themes or topics as indicated by keywords; tweets from regularly occurring dates (the 3rd-9th of every month or the last week of every month), as you originally said; or something entirely different. How you select your corpus has a massive impact on the data you collect and, perhaps more importantly, on how that data can be used by others. Consider the purpose of your project when deciding on a corpus.
  2. You might consider the ethics of a research project that stems from an opinion. What does it affect, and does it matter at all, if a researcher begins a project with a goal in place of a hypothesis? You say that the project is meant to highlight certain characteristics of the president; who is the implied audience in that case?
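
To make a few of the corpus-selection options above concrete, here is a small Python sketch, offered only as a starting point. The sample records, the field names (`created_at`, `text`), and the date format are assumptions and would need to be adjusted to whatever the real export contains; the three helper functions are hypothetical names introduced for illustration.

```python
import re
from datetime import datetime

# Hypothetical records; field names and date format are assumptions.
TWEETS = [
    {"created_at": "2019-12-05 08:00:00", "text": "Example about the economy"},
    {"created_at": "2019-12-18 21:15:00", "text": "Example mentioning @someone on impeachment"},
    {"created_at": "2020-01-07 10:45:00", "text": "Another example about trade"},
]


def parse_date(record):
    """Turn the assumed timestamp string into a datetime object."""
    return datetime.strptime(record["created_at"], "%Y-%m-%d %H:%M:%S")


def by_day_of_month(tweets, first_day, last_day):
    """Keep tweets posted between two days of the month, inclusive."""
    return [t for t in tweets if first_day <= parse_date(t).day <= last_day]


def by_keyword(tweets, keywords):
    """Keep tweets whose text contains any keyword, ignoring case."""
    lowered = [k.lower() for k in keywords]
    return [t for t in tweets if any(k in t["text"].lower() for k in lowered)]


def with_mentions(tweets):
    """Keep tweets that @ another account (a simple pattern match)."""
    return [t for t in tweets if re.search(r"@\w+", t["text"])]


regular_dates = by_day_of_month(TWEETS, 3, 9)
impeachment = by_keyword(TWEETS, ["impeachment"])
mentions = with_mentions(TWEETS)
print(len(regular_dates), len(impeachment), len(mentions))
```

Comparing how many tweets each filter keeps, and which major events fall outside a fixed date window, is one way to see what a given selection rule leaves out.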

Overall, great work here!