fivethirtyeight / russian-troll-tweets


Version 2 (without rebase) #29

Closed EvanCarroll closed 6 years ago

EvanCarroll commented 6 years ago

Look at this one, which is the vanilla version. If it is acceptable, I will rebase it (merging it down to one commit). Going forward, that will make the repo gigabytes lighter than this version without the rebase. We will lose the commit history.

If rewriting the repo is acceptable, perhaps we should just publish a version 2.0, mark this whole repo as deprecated, and move people over to the new one. In a version 2.0 repo, changes would be text-only, and we wouldn't be carrying around gigabytes of compressed files and bad commits.

Remember, removing a file from the head of a tree does not delete it; all of the old commits are still here.
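A minimal sketch of that point, assuming `git` is on the PATH: it builds a throwaway repo, commits a file, removes it in a second commit, and then reads the file back from the older commit. The file name `archive.zip` and the helper `file_survives_in_history` are made up for illustration and are not part of this repo.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


def file_survives_in_history():
    """Return True if a file removed at HEAD is still readable from an older commit.

    Returns None when git is not installed.
    """
    if shutil.which("git") is None:
        return None

    content = "pretend this is a large compressed archive"

    with tempfile.TemporaryDirectory() as tmp:
        def git(*args):
            # Identity and signing are set inline so the sketch does not
            # depend on the caller's global git configuration.
            cmd = [
                "git",
                "-c", "user.name=demo",
                "-c", "user.email=demo@example.com",
                "-c", "commit.gpgsign=false",
                *args,
            ]
            return subprocess.run(
                cmd, cwd=tmp, check=True, capture_output=True, text=True
            ).stdout

        git("init", "-q")

        # Hypothetical file name, standing in for one of the big archives.
        Path(tmp, "archive.zip").write_text(content)
        git("add", "archive.zip")
        git("commit", "-q", "-m", "add archive")

        # Remove the file at the head of the tree.
        git("rm", "-q", "archive.zip")
        git("commit", "-q", "-m", "remove archive")

        # The previous commit still holds the full contents of the file.
        old = git("show", "HEAD~1:archive.zip")
        return old.strip() == content


result = file_survives_in_history()
```

The removed file keeps adding to every clone because the old commit still references it, which is why only a history rewrite (or a fresh repo) actually sheds the weight.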

This version has the PostgreSQL schema, loader, and dumper, and improves on the prior version 2.0 by removing duplicates.
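For illustration only, here is one way exact duplicate rows could be dropped from a CSV. This is a sketch and not the loader in this PR, which works through PostgreSQL; the helper `drop_duplicate_rows` and the sample columns `author` and `content` are invented for the example.

```python
import csv
import io


def drop_duplicate_rows(lines):
    """Yield each CSV row once, keeping the first occurrence and the original order."""
    seen = set()
    for row in csv.reader(lines):
        key = tuple(row)  # lists are not hashable, so compare rows as tuples
        if key not in seen:
            seen.add(key)
            yield row


# Made-up sample data: the last row repeats the second one exactly.
sample = io.StringIO("author,content\na,hello\nb,hi\na,hello\n")
unique = list(drop_duplicate_rows(sample))
```

This only catches rows that match field for field; deciding that two differently formatted rows are the same tweet would need a key column, which this sketch does not assume.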

dmil commented 6 years ago

Thanks. I've just posted the latest version of the data (#28) that @patrick-lee-warren sent us.