nchammas / flintrock

A command-line tool for launching Apache Spark clusters.
Apache License 2.0

Flintrock should try to raise ulimit for open files/connections #194

Closed: douglaz closed this issue 7 years ago.

douglaz commented 7 years ago

We got "Too many open files" errors from HDFS when using a big machine (an x1.32xlarge, I think). Flintrock should try to raise the ulimits to avoid such problems (a sketch of one possible approach follows the log below).

```
2017-02-25 06:05:24,008 INFO [DataXceiver for client DFSClient_attempt_201702250604_0001_m_000777_0_590913881_319 at /172.24.34.247:44830 [Receiving block BP-1011058414-172.24.37.193-1487995665469:blk_1073744657_3833]] datanode.DataNode (BlockReceiver.java:receiveBlock(934)) - Exception for BP-1011058414-172.24.37.193-1487995665469:blk_1073744657_3833
java.io.IOException: Too many open files
    at sun.nio.ch.IOUtil.makePipe(Native Method)
    at sun.nio.ch.EPollSelectorImpl.<init>(EPollSelectorImpl.java:65)
    at sun.nio.ch.EPollSelectorProvider.openSelector(EPollSelectorProvider.java:36)
    at org.apache.hadoop.net.SocketIOWithTimeout$SelectorPool.get(SocketIOWithTimeout.java:409)
    at org.apache.hadoop.net.SocketIOWithTimeout$SelectorPool.select(SocketIOWithTimeout.java:325)
    at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:157)
    at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:161)
    at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:131)
    at java.io.BufferedInputStream.read1(BufferedInputStream.java:284)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
    at java.io.DataInputStream.read(DataInputStream.java:149)
    at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:199)
    at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doReadFully(PacketReceiver.java:213)
    at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doRead(PacketReceiver.java:171)
    at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.receiveNextPacket(PacketReceiver.java:109)
    at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:501)
    at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:895)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:801)
    at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:137)
    at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:74)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:253)
    at java.lang.Thread.run(Thread.java:745)
2017-02-25 06:05:24,008 INFO [DataXceiver for client DFSClient_attempt_201702250604_0001_m_000618_0_396527967_321 at /172.24.32.100:39380 [Receiving block BP-1011058414-172.24.37.193-1487995665469:blk_1073744854_4030]] datanode.DataNode (BlockReceiver.java:receiveBlock(934)) - Exception for BP-1011058414-172.24.37.193-1487995665469:blk_1073744854_4030
java.io.IOException: Too many open files
```
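For reference, here is a minimal sketch of the kind of per-node setup step this issue asks for: raising the open-file (`nofile`) limit through the standard pam_limits mechanism (`/etc/security/limits.conf`). The `raise_open_file_limit` helper and the limit value are hypothetical illustrations, not Flintrock's actual code.

```python
import subprocess


def raise_open_file_limit(host: str, limit: int = 1048576) -> None:
    """Hypothetical helper: raise the open-file ulimit on one cluster node.

    Appends soft/hard nofile entries to /etc/security/limits.conf, the
    standard pam_limits mechanism. Sessions started after the change
    (and any services restarted afterwards) pick up the higher limit.
    """
    commands = " && ".join([
        f"echo '* soft nofile {limit}' | sudo tee -a /etc/security/limits.conf",
        f"echo '* hard nofile {limit}' | sudo tee -a /etc/security/limits.conf",
    ])
    # Run the commands over SSH, as a launcher like Flintrock would
    # during cluster setup.
    subprocess.run(["ssh", host, commands], check=True)
```

Since limits.conf only affects sessions started after the change, a step like this would need to run before the HDFS daemons start, or be paired with a daemon restart.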
nchammas commented 7 years ago

Dup of #148?

douglaz commented 7 years ago

@nchammas it seems so