Science-for-Nature-and-People / soc-twitter

SNAPP - Soil Organic Carbon Twitter data

Fix automate.R Dropping Duplicates with Less RT/Like Counts (Complete) #33

Closed remyknox closed 5 years ago

remyknox commented 5 years ago

Started

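automate.R uses the distinct() function, which keeps the first occurrence of each duplicated tweet and drops every duplicate that follows it. Because the rows are ordered from oldest to newest tweets, the copy that survives is the oldest one, so its RT/like counts are the stale ones.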
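A minimal sketch of the keep-first behaviour described above. The data frame and the column names `status_id` and `retweet_count` are hypothetical, and base R's `duplicated()` is used as a stand-in so the snippet runs without dplyr; `distinct()` with `.keep_all = TRUE` keeps the first row of each duplicate group in the same way.

```r
# Hypothetical tweet table with the oldest pull first (column names are assumptions).
tweets <- data.frame(
  status_id     = c("a1", "b2", "a1"),
  retweet_count = c(3, 10, 25),  # "a1" was pulled again later with a higher count
  stringsAsFactors = FALSE
)

# Keep-first de-duplication: duplicated() flags every repeat after the first row.
kept <- tweets[!duplicated(tweets$status_id), ]

# The surviving "a1" row carries the older count, not the newer one.
print(kept)
```

With the rows in this order, the later pull of "a1" is the one that gets dropped, which is the problem this issue tracks.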

remyknox commented 5 years ago

Updating automate.R on my own branch.

remyknox commented 5 years ago

Issue fixed by checking the new data frame (queried from Twitter) against the old df and removing the tweets that appear in both from the old df before merging in the new df. Verified that the code works on test dfs. Not using distinct(); using base R instead. For DUPLICATES, the RT/like counts now come from the most up-to-date copy of the tweet.
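A rough base R sketch of that approach, not the actual code in automate.R. The names `old_df`, `new_df`, `status_id`, `retweet_count`, and `favorite_count` are assumptions made for illustration.

```r
# Hypothetical stored data (old) and freshly queried data (new).
old_df <- data.frame(
  status_id      = c("a1", "b2", "c3"),
  retweet_count  = c(3, 10, 7),
  favorite_count = c(1, 4, 2),
  stringsAsFactors = FALSE
)
new_df <- data.frame(
  status_id      = c("a1", "d4"),
  retweet_count  = c(25, 0),
  favorite_count = c(9, 0),
  stringsAsFactors = FALSE
)

# Remove from the old data every tweet that also appears in the new query,
# so the stale copy is the one that gets discarded.
old_kept <- old_df[!(old_df$status_id %in% new_df$status_id), ]

# Merge: each tweet now appears once, with the newest counts where it was re-pulled.
merged <- rbind(old_kept, new_df)
print(merged)
```

Filtering the old data before the merge means no de-duplication step is needed afterwards, so row order no longer decides which copy of a tweet is kept.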

Complete