upintheairsheep closed 1 year ago
Can you integrate the ConvoKit datasets, especially the giant Reddit dataset, into the Pile or a future version of it? I would really like to advance AI for all of humanity, not for the purpose of feeding the pigs (corporations). The datasets are hosted at https://zissou.infosci.cornell.edu/convokit/datasets/ and documented at https://convokit.cornell.edu/documentation/datasets.html
http://cairo.lti.cs.cmu.edu/~hector/ - A similar dataset host with ~0.5 GB of tweets, ~0.3 GB of DBpedia data, and an unknown amount of wikiHow XML files
pile v2