Open micoq opened 7 years ago
I've observed the same thing with Logstash newer than 1.x. With Logstash 6.0, switching from "append+read" to just "append" fixes the high I/O wait we were seeing and the ceiling of about 700 events/s when writing to a NetApp NFS share. In our testing with append-only mode, I've gotten over 2000 events/s.
Why was the mode changed after Logstash 1.x?
I've found another similar problem, with the `stat()` syscall. For each event, a test is made to check whether the destination file still exists (compared to the cache):
```ruby
if !deleted?(path) && cached?(path)
  return @files[path]
end
```
However, the method `deleted?()` calls `File.exist?()`, which calls `stat()`. On an NFS mount point, each `stat()` call is translated to a `GETATTR` request.
If I remove this statement, the throughput increases from 4000 events/s to 20000 events/s (10000 events/s each with 2 Logstash instances on 2 servers writing to the same NFS share).
Operating System: RHEL 7
Logstash: 6.3.0
Hi,
Currently, the files are opened in "append+read" mode ("a+"). However, this mode generates many `lseek()` calls (one after each write/flush). This can greatly reduce the writing speed when the files you are writing are on an NFS (v4) share (in my case, on a NetApp storage controller with "slow" SATA disks). Is the "append+read" mode necessary, as opposed to "append only" ("a")? In "append only" mode, the `lseek()` calls disappear and I get a much better writing speed (700 events/s -> 1600 events/s). I tested gzip compression as well without any problem; as far as I know, the gzip format never needs to re-read the file being compressed.
However, this problem doesn't appear on all NFS shares/storage controllers (probably because of faster disks, more cache memory, etc.).
https://github.com/logstash-plugins/logstash-output-file/blob/master/lib/logstash/outputs/file.rb#L278
Operating system : RHEL 7