EleutherAI / the-pile

Enron Emails #18

Closed by StellaAthena 4 years ago

StellaAthena commented 4 years ago

Official description:

This dataset was collected and prepared by the CALO Project (A Cognitive Assistant that Learns and Organizes). It contains data from about 150 users, mostly senior management of Enron, organized into folders. The corpus contains a total of about 0.5M messages. This data was originally made public, and posted to the web, by the Federal Energy Regulatory Commission during its investigation.

Project URL: https://www.cs.cmu.edu/~./enron/
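
For anyone picking this up, here is a minimal sketch of walking the corpus once it is downloaded and extracted. It assumes the archive unpacks into one directory per user, with mail folders underneath and one RFC 822-style message per file; that layout is an assumption based on the "organized into folders" description above, and `iter_messages` is a hypothetical helper, not part of this repository.

```python
import email
from email import policy
from pathlib import Path


def iter_messages(root):
    """Yield (user, folder, message) for each file under root/<user>/<folder>/.

    Assumed layout (not confirmed by the description above):
    root/<user>/<folder>[/<subfolder>...]/<message file>
    """
    root = Path(root)
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        parts = path.relative_to(root).parts
        if len(parts) < 3:
            continue  # not inside a <user>/<folder> pair, skip it
        user = parts[0]
        folder = "/".join(parts[1:-1])
        # Read bytes and let the email parser handle header decoding,
        # since the encoding of individual messages is not guaranteed.
        with path.open("rb") as fh:
            msg = email.message_from_binary_file(fh, policy=policy.default)
        yield user, folder, msg
```

Usage would look something like `for user, folder, msg in iter_messages("maildir"): print(user, folder, msg["Subject"])`, where `"maildir"` stands in for wherever the archive was extracted.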