Open ianmilligan1 opened 6 years ago
Trial #1
c5.large
downloading data: began Dec 18 17:17, finished Dec 19 08:57. 909GB.
Data handed off to i3.2xlarge, begins derivative processing Dec 19 10:20. 909GB.
Uploading to S3 buckets from the EBS volume is quick (~5 minutes to upload all derivatives).
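For comparing trials, it may help to turn the first trial's download timestamps into an elapsed time and an average transfer rate. The snippet below is only a rough sketch: the year is not stated in these notes, so `ASSUMED_YEAR` is a placeholder (it does not change the elapsed time), the helper names are made up for illustration, and sizes are treated as decimal units (1GB = 1000MB).

```python
from datetime import datetime

# Assumption: the notes give no year. Any year works here because both
# timestamps fall in the same December; 2018 is just a placeholder.
ASSUMED_YEAR = 2018


def elapsed_hours(start: str, end: str) -> float:
    """Hours between two 'Mon DD HH:MM' timestamps in the assumed year."""
    fmt = "%Y %b %d %H:%M"
    t0 = datetime.strptime(f"{ASSUMED_YEAR} {start}", fmt)
    t1 = datetime.strptime(f"{ASSUMED_YEAR} {end}", fmt)
    return (t1 - t0).total_seconds() / 3600


def throughput_mb_per_s(size_gb: float, hours: float) -> float:
    """Average rate in MB/s, assuming decimal units (1GB = 1000MB)."""
    return size_gb * 1000 / (hours * 3600)


# Download window and size reported for the first trial.
hours = elapsed_hours("Dec 18 17:17", "Dec 19 08:57")
rate = throughput_mb_per_s(909, hours)
print(f"{hours:.2f} h, {rate:.1f} MB/s")
```

The same two helpers could be reused for later trials once their start and end times are confirmed.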
Trial #2
c5.large
downloading data: began Dec 21 16:28, finished Dec 23 12:49am. 3.5TB.
Data handed off to r3.2xlarge, begins derivative processing Dec 21 12:58am.
I then killed this.
Note: this collection failed before on Compute Canada, so we'll see how it shakes out.
I'm happy with the resource utilization on this machine:
Trial #3
I ran this on an Azure machine (16 cores, 55 GB RAM). 206GB. Canadian Federal Political Candidates collection. It previously failed on the WALK machine.
Trial #4
I ran this on an Azure machine (16 cores, 55 GB RAM). 293GB. Toronto Mayoral Election 2015 collection. It previously failed on the WALK machine.
has a 7GB WARC. Going to try increasing partition numbers.
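One rough way to pick a starting point when increasing partition numbers is to divide the collection size by a target partition size. This is only a sketch under stated assumptions: the 128 MiB target is a common default block size rather than anything these notes specify, the function name is hypothetical, and it assumes the processing runs on Spark (where an RDD can be redistributed with `repartition(n)`), which the notes do not confirm. It also cannot say whether more partitions actually help with a single 7GB WARC.

```python
# Assumption: 128 MiB per partition is an illustrative target, not a value
# taken from these notes.
DEFAULT_TARGET_BYTES = 128 * 1024**2


def suggested_partitions(total_bytes: int,
                         target_bytes: int = DEFAULT_TARGET_BYTES) -> int:
    """Ceiling of total size over target partition size, at least 1."""
    return max(1, -(-total_bytes // target_bytes))


# 293GB collection from this trial, treated as decimal gigabytes.
collection_bytes = 293 * 10**9
n = suggested_partitions(collection_bytes)
print(n)

# If the job is a Spark RDD pipeline, the count could then be applied with
# something like `records.repartition(n)`, where `records` is a hypothetical
# RDD of loaded archive records.
```

A smaller target size gives more, smaller partitions, which is the direction "increasing partition numbers" points in.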
Issue to hold data on time trials.