slalom-ggp / covid-response-dataops

Template project for COVID-19 response data
MIT License

Create Glue job to transform Twitter JSON to CSV #6

Open aaronsteers opened 4 years ago

aaronsteers commented 4 years ago

Will want to trim to essential columns.

aaronsteers commented 4 years ago

@SamMSamuelson - If you have a list of columns needed/wanted, can you post here?

SamMSamuelson commented 4 years ago

For Twitter I'm thinking we want:

  1. Geolocation - we only want relevant Colorado tweets. By any miracle, is this the FIPS code?
  2. Timestamp - a sense of the timing of the concern/comment
  3. Twitter handle - to understand duplicates and such
  4. Tweet text - must mention 'COVID', 'corona', or 'virus' (any other triggers?)
  5. Retweet count - a sense of how popular this sentiment is
  6. Like count - a sense of how popular this sentiment is
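
For reference, the column list above could be sketched in plain Python roughly as follows. This is a minimal sketch, not the Glue job itself: the JSON field names (`place.full_name`, `created_at`, `user.screen_name`, `text`/`full_text`, `retweet_count`, `favorite_count`) assume the raw files follow the Twitter v1.1 tweet object with one tweet per line, which this thread does not confirm. The helper names `flatten_tweet` and `tweets_to_csv` are hypothetical. Note that the v1.1 place object carries a place name rather than a FIPS code, so a FIPS lookup would be a separate step.

```python
import csv
import io
import json

# Output columns follow the list above. The JSON field names read below are
# assumptions based on the Twitter v1.1 tweet object, not confirmed by this project.
COLUMNS = ["place", "created_at", "screen_name", "text", "retweet_count", "favorite_count"]


def flatten_tweet(tweet):
    """Reduce one tweet dict to the essential columns (hypothetical helper)."""
    place = tweet.get("place") or {}  # "place" is often null in raw tweets
    user = tweet.get("user") or {}
    return {
        "place": place.get("full_name", ""),
        "created_at": tweet.get("created_at", ""),
        "screen_name": user.get("screen_name", ""),
        # Extended tweets carry "full_text"; fall back to "text" otherwise.
        "text": tweet.get("full_text") or tweet.get("text", ""),
        "retweet_count": tweet.get("retweet_count", 0),
        "favorite_count": tweet.get("favorite_count", 0),
    }


def tweets_to_csv(json_lines):
    """Convert an iterable of JSON strings (one tweet each) to CSV text."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    for line in json_lines:
        if line.strip():  # skip blank lines between records
            writer.writerow(flatten_tweet(json.loads(line)))
    return out.getvalue()
```

Using the standard `csv` module keeps quoting correct when tweet text contains commas, quotes, or newlines.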

nelson-wolf commented 4 years ago

@SamMSamuelson - This list sounds great to me, and I can work on this. Another trigger in the tweet text could be something related to "government" or "governor", so they can track how the population thinks they're doing in responding.
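
The trigger words discussed so far could be checked with something like the sketch below. It assumes case-insensitive substring matching (so "coronavirus" and "governors" also match), which is a choice made here for illustration rather than anything agreed in this thread. `TRIGGERS` and `mentions_trigger` are hypothetical names.

```python
import re

# Trigger words proposed in this thread. Matching is case-insensitive and on
# substrings; that behaviour is an assumption, not a confirmed requirement.
TRIGGERS = ("covid", "corona", "virus", "government", "governor")
TRIGGER_RE = re.compile("|".join(re.escape(t) for t in TRIGGERS), re.IGNORECASE)


def mentions_trigger(text):
    """Return True if the tweet text contains any trigger word."""
    return bool(TRIGGER_RE.search(text or ""))
```

Substring matching is deliberately loose and may let through unrelated uses of "virus" (for example, computer viruses), so the list may need tightening once real data is reviewed.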