DataSoft / Honeyd

virtual honeypots
GNU General Public License v2.0

"Too many open files"-error #70

Closed nullprobe closed 11 years ago

nullprobe commented 11 years ago

I've been stress-testing my haystack to see how many HTTP connections a honeyd instance can manage. For this I use the httperf tool with the following example command:

httperf --server XXX.XXX.XXX.XXX --port 80 --rate 130 --num-conn 10000 --num-call 1 --timeout 5

For now I can get to 130 connections per second (--rate 130) at ~45% CPU load, but at a higher rate I start getting the following errors in the Honeyd.log:

Mar 12 14:36:22 xxxx honeyd[30648]: malloc (xxx.xxx.xxx.xxx:54173 - xxx.xxx.xxx.xxx:80): Too many open files
Mar 12 14:36:22 xxxx honeyd[30648]: malloc (xxx.xxx.xxx.xxx:54174 - xxx.xxx.xxx.xxx:80): Too many open files
Mar 12 14:36:22 xxxx honeyd[30648]: malloc (xxx.xxx.xxx.xxx:54175 - xxx.xxx.xxx.xxx:80): Too many open files

Is this a configurable value somewhere in the Honeyd code? I'm thinking the machine can handle many more connections, but there is a value somewhere limiting this.

PherricOxide commented 11 years ago

Looks like it's a Linux thing; scroll down to the "Increase Open FD Limit at Linux OS Level" section in this link: http://www.cyberciti.biz/faq/linux-unix-nginx-too-many-open-files/

My Ubuntu system has,

pherricoxide@midigaurd:~$ ulimit -Hn
4096
pherricoxide@midigaurd:~$ ulimit -Sn
1024
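
For reference, bumping those limits usually looks something like this (the values here are just examples, and the exact steps depend on your distro and on how honeyd gets started; nothing here is Honeyd-specific):

# system-wide ceiling on open file descriptors
sudo sysctl -w fs.file-max=100000

# per-user limits, e.g. add to /etc/security/limits.conf (applies at next login):
#   *  soft  nofile  65535
#   *  hard  nofile  65535

# raise the soft limit in the shell that launches honeyd
ulimit -n 65535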
PherricOxide commented 11 years ago

@nullprobe did changing that fix the problem? Or is it still an issue?

nullprobe commented 11 years ago

The ulimit command doesn't seem to change anything, and I've tried everything it says in the link you provided. I also tried playing with AppArmor settings but no luck yet... Any ideas?

nullprobe commented 11 years ago

Ulimit is now set correctly and I'm not getting the error any more, but honeyd is easily getting DoSed by the hping3 command. I'm getting these kinds of messages:

Libpcap has dropped 1128538 packets. Try increasing the capture buffer.

Any way I can configure this? Or is that a bad idea?
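
For reference, the kind of hping3 flood I mean is roughly this (example flags only, not the exact command I ran):

# SYN flood aimed at the honeypot's web port, sent as fast as possible
hping3 -S --flood -p 80 XXX.XXX.XXX.XXX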

PherricOxide commented 11 years ago

@nullprobe you can configure that in the advanced settings for Quasar. In the current version: Settings -> Configure Advanced Settings -> Packet Capture Buffer Size (in bytes).

It's not a bad idea to make it bigger, but what's probably happening is that the hardware is too slow to process packets at that rate, and then it doesn't really matter how big the buffer is: you'll fill it up and drop packets anyway. What the buffer does do, under normal conditions, is absorb spikes in traffic so they can be processed once the traffic rate drops enough for Nova to catch up. For instance, an nmap scan causes a big traffic spike, but afterwards traffic to the honeypots is relatively minimal, so there is time to process it all. Running nmap scans repeatedly, however, might not leave enough time to catch up, and packets will start getting dropped. That's not the end of the world; hostile suspects still usually appear hostile even if a few of their packets are dropped.

Out of curiosity, what sort of hardware are you running it on (CPU and memory)?

nullprobe commented 11 years ago

Right now the buffer is set to 1048576; I guess this is the default value. Any suggestions as to how high I can set this with the following hardware:

CPU: Intel® Xeon® Processor LV 5148
Memory: 4GB
Disk:

       description: SCSI Disk
       product: MegaRAID SAS RMB
       vendor: LSI
       physical id: 2.0.0
       bus info: scsi@4:2.0.0
       logical name: /dev/sda
       version: 1.03
       serial: 00dc0e1e641967c718f0ffffff5f0003
       size: 339GiB (364GB)
       capabilities: partitioned partitioned:dos
       configuration: ansiversion=5 signature=000089c5
PherricOxide commented 11 years ago

The default is 10MB. You could try changing it to something like 128MB (128 * 2^20 = 134217728 bytes) and see if that helps. You might see a small throughput hit because of the extra memory allocation and paging, but it shouldn't be too bad unless you use up all of your 4GB of memory and start going to swap (I'd probably keep the buffer size under 1GB).

nullprobe commented 11 years ago

OK, I'm going to test this setup. Any recommendations on the number of worker threads?

PherricOxide commented 11 years ago

That setting isn't used very much anymore. The default should be fine; it won't really change performance.

nullprobe commented 11 years ago

So yesterday I flooded one of the haystack IPs with hping3 connects and congested the server. I did this for about 5 minutes, stopped the flooding and watched how Nova would deal with the aftermath. Now, a day later, Nova hasn't released any of the memory allocated during the flood and even Quasar is really slow.

PherricOxide commented 11 years ago

Which process is using the memory (honeyd, novad or node (Quasar))? You can use top and sort by memory for a rough estimate.
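
For example, pressing Shift+M inside top sorts by resident memory, or you can grab a one-off snapshot:

# top memory consumers, sorted by resident set size
ps aux --sort=-rss | head -n 10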

If it's node, you might not have the fix from https://github.com/DataSoft/Nova/issues/796. If it's novad... huh. Might be a memory leak worth opening an issue about.

nullprobe commented 11 years ago

It's the novad process. Want me to run some memory leak checks?

PherricOxide commented 11 years ago

Sure. If you're familiar with Valgrind or similar tools, feel free to take a look and let us know of your findings.
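
Something along these lines is what I'd expect (this assumes novad can be launched in the foreground from a shell; adjust the path and arguments to match how you normally start it):

# full leak check written to a log file; assumes novad runs in the foreground
valgrind --leak-check=full --log-file=novad-valgrind.log novad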