nchammas / flintrock

A command-line tool for launching Apache Spark clusters.

Configuring cluster to increase the max number of files that can be open (needed by Spark during large shuffles) #148

Open · jbherman opened this issue 8 years ago

jbherman commented 8 years ago

I saw you encountered this problem during big shuffles. Here is how I fixed it; I hope it helps. Note: this is on Amazon Linux.

1. Copy the OS's `sysctl.conf`.
2. Append `fs.file-max = 100000` to `sysctl.conf`.
3. `copy-file bigCluster /Users/jason/projects-misc/sysctl.conf /home/ec2-user/sysctl.conf`
4. `run-command bigCluster "sudo cp /home/ec2-user/sysctl.conf /etc/sysctl.conf"`
5. `run-command bigCluster "sudo shutdown -r now"`
6. `cat < limits.conf`
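Pulled together, the workflow might look like the sketch below. This is a minimal sketch, not part of the original report: the cluster name `bigCluster` comes from the steps above, the `limits.conf` lines are my guess at what the truncated step 6 was setting (the per-user `nofile` limit, which large shuffles also run into, since `fs.file-max` only raises the system-wide cap), and the reboot is moved to the end so both changes take effect at once.

```bash
#!/usr/bin/env bash
set -euo pipefail

# 1-2) Prepare a sysctl.conf with a higher system-wide open-file cap.
#      (Starting from the local /etc/sysctl.conf here; the original steps
#      start from a copy of the cluster OS's file.)
cp /etc/sysctl.conf ./sysctl.conf
echo "fs.file-max = 100000" >> ./sysctl.conf

# 3) Push the file to every node in the cluster.
flintrock copy-file bigCluster ./sysctl.conf /home/ec2-user/sysctl.conf

# 4) Install it system-wide on every node.
flintrock run-command bigCluster "sudo cp /home/ec2-user/sysctl.conf /etc/sysctl.conf"

# 6) Hypothetical completion of the truncated step: also raise the per-user
#    nofile limit, since fs.file-max alone does not lift the ulimit -n cap.
flintrock run-command bigCluster \
  "echo '* soft nofile 100000' | sudo tee -a /etc/security/limits.conf && \
   echo '* hard nofile 100000' | sudo tee -a /etc/security/limits.conf"

# 5) Reboot the nodes so both settings take effect.
flintrock run-command bigCluster "sudo shutdown -r now"
```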

Thanks for your great work. -Jason

nchammas commented 8 years ago

Hi Jason, and thank you for the kind words!

Is this in reference to this comment?

jbherman commented 8 years ago

Yes, it's in reference to that comment.