sumitmukhija / Sentiment-analysis-study


MoMs #4

Open sumitmukhija opened 4 years ago

sumitmukhija commented 4 years ago
  1. Agenda - Compare Civilian and Veteran Twitter Data
    1. Comparison of adjectives/tweets amongst civilians and soldiers
    2. Attribute comparison - length of tweet, frequency of tweets, word count
    3. Followers & following
    4. Retweets and replies
    5. Personality types - 5 major personalities - sub-research

  2. Learnings -
    1. Literature research should be refined.
    2. Pre-processing should be improved.

  3. Data collection
    1. Veteran data - Iawa
      1. Profiles (usernames)
      2. Bios - filter for bios which have words like soldier, army, veteran ***
      3. Conditionally: If I can get gender - have a representative dataset
      4. 200 of them
      5. Get all the tweets of these 200
    2. Civilian data -
      1. Profiles (username)
      2. Bios - filter for bios which do not have words like soldier, army, veteran
      3. Conditionally: If I can get gender - have a representative dataset
      4. 200 of them
      5. Get all the tweets of these 200
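
The bio filter in the data collection plan, and the per-tweet attributes from the agenda, could be sketched roughly as below. This is only an illustration under assumptions: the function names, the `(username, bio)` input shape, and the whole-word matching rule are hypothetical and not something decided in the meeting. Only the three keywords and the cap of 200 come from the notes.

```python
import re

# Keywords taken from the notes; extending this list is an assumption.
VETERAN_KEYWORDS = ("soldier", "army", "veteran")

# Whole-word, case-insensitive match; the optional "s" also catches plurals.
_KEYWORD_RE = re.compile(
    r"\b(?:" + "|".join(VETERAN_KEYWORDS) + r")s?\b", re.IGNORECASE
)


def is_veteran_bio(bio):
    """Return True if the bio mentions any of the veteran keywords."""
    return bool(_KEYWORD_RE.search(bio or ""))


def split_profiles(profiles, limit=200):
    """Split (username, bio) pairs into veteran and civilian username lists.

    Each list is capped at `limit` entries (200 in the notes).
    """
    veterans, civilians = [], []
    for username, bio in profiles:
        target = veterans if is_veteran_bio(bio) else civilians
        if len(target) < limit:
            target.append(username)
    return veterans, civilians


def tweet_attributes(text):
    """Per-tweet attributes from the agenda: character length and word count."""
    return {"length": len(text), "word_count": len(text.split())}
```

A plain keyword match will produce false positives (for example a bio mentioning "Salvation Army"), so the resulting veteran list would probably still need a manual check.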
sumitmukhija commented 4 years ago

Run cat <your current file.csv> | tr -d '\015' > <new file name>.csv to get rid of the special sequence (the carriage-return characters, \r, found in Windows-style line endings).
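
The tr command above deletes every carriage return (octal 015) from the file. If the same cleanup needs to happen inside a script, a minimal Python sketch could look like the following. The function name and the choice of Python are assumptions; nothing in the thread says which language the project uses.

```python
def strip_carriage_returns(src_path, dst_path):
    """Copy src_path to dst_path with every carriage return (byte 0x0D) removed.

    Mirrors `tr -d '\\015'`: all CR bytes are dropped, not only those
    that are part of a CRLF line ending.
    """
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        # Read in 1 MiB chunks so large CSV exports do not need to fit in memory.
        for chunk in iter(lambda: src.read(1 << 20), b""):
            dst.write(chunk.replace(b"\r", b""))
```

Working in binary mode avoids any guess about the file's text encoding, since only the single CR byte is touched.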