ipfs-inactive / archives

[ARCHIVED] Repo to coordinate archival efforts with IPFS
https://awesome.ipfs.io/datasets

Twitter Datasets on IPFS #192

Open algarecu opened 5 years ago

algarecu commented 5 years ago

Hi there, is there any interest in adding large datasets to IPFS to encourage research and innovation with the support of your platform? We are looking at this here at GeoDB and are using IPFS for a PoC.

These two datasets presumably contain ground truth on state-sponsored information operations (as already known from the JTRIG). Collecting data of this type has been a long-standing issue: in my own PhD thesis I had to crawl, collect, and annotate a Twitter dataset, with the consequent limitations for research outputs. Having a copy of these datasets in IPFS can be of great value and could even encourage future research using IPFS as a permanent data source (e.g., increased dataset availability through replication, resilience to takedowns, etc.).

https://about.twitter.com/en_us/values/elections-integrity.html#data

Internet Research Agency

Dataset readme; Account information; Tweet information (1.24GB); Media (296GB, 302 archives)

Iran

Dataset readme; Account information; Tweet information (168MB); Media (65.7GB, 52 archives)
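
For anyone sizing the storage needed to replicate both datasets, here is a minimal sketch that adds up the figures listed above. It assumes decimal units (168MB taken as 0.168GB) and leaves out the readme and account-information files, since their sizes are not given. The names `DATASETS`, `total_size_gb`, and `total_archives` are only illustrative.

```python
# Sizes as listed in the issue. Readme and account-information sizes
# are not stated there, so they are not counted (assumption: they are small).
DATASETS = {
    "Internet Research Agency": {
        "tweets_gb": 1.24,
        "media_gb": 296.0,
        "media_archives": 302,
    },
    "Iran": {
        "tweets_gb": 0.168,  # 168MB, assuming 1GB = 1000MB
        "media_gb": 65.7,
        "media_archives": 52,
    },
}


def total_size_gb(datasets):
    """Sum tweet and media sizes (in GB) over all datasets."""
    return sum(d["tweets_gb"] + d["media_gb"] for d in datasets.values())


def total_archives(datasets):
    """Count the media archives over all datasets."""
    return sum(d["media_archives"] for d in datasets.values())


if __name__ == "__main__":
    size = total_size_gb(DATASETS)
    count = total_archives(DATASETS)
    print(f"{size:.1f} GB across {count} media archives")
```

Under these assumptions the total comes to roughly 363GB spread over 354 media archives, which is the amount each node replicating the full set would need to hold.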