DeveloperLiberationFront / Spreadsheet-Corpus-Paper


Where to Host Datasets? #19

Open CaptainEmerson opened 10 years ago

CaptainEmerson commented 10 years ago

We'll need a web server to host the datasets. We might be able to get something at NCSU, which I'll pursue as a backup, but a longer-term solution would be better. It looks like we'll need something on the order of a terabyte.

Amazon AWS Public Data Sets (http://aws.amazon.com/public-data-sets/) would probably be a good fit. I can submit the form and make the case to Amazon, but I have no idea how the mechanics of uploading work (S3? EBS?). @barik, if they approve, do you know how to do this?

Other ideas? @Felienne @kjlubick @slankas ?

barik commented 10 years ago

Yes, I know how to do this. Actually, the open-source data set is already on S3, since we used Amazon EC2 clusters to download it all in the first place: s3://barik-xls/g.

CaptainEmerson commented 10 years ago

Made request to Amazon.