ahoward / lockfile

a ruby library for creating NFS safe lockfiles

(deleted) open files under /proc/<pid>/fd lead to 'too many open files' error #11

Open mikisvaz opened 10 years ago

mikisvaz commented 10 years ago

A pool of processes competing for locks accumulates open file descriptors for deleted files. They show up under /proc/<pid>/fd like this, for instance:

19 -> /data/mvazquezg.home/.rbbt/tmp/tsv_open_locks/share>databases>interactome3d>interactions_tsv.lock (deleted)

Long-running processes acquiring thousands of locks will eventually consume all available file descriptors.
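For anyone who wants to watch the count from inside Ruby rather than with ls, here is a minimal sketch. It assumes Linux procfs, and `deleted_fds` is a hypothetical helper name, not part of lockfile:

```ruby
# Sketch: list a process's descriptors that point at deleted files.
# Assumes Linux procfs; deleted_fds is a hypothetical helper, not lockfile API.
def deleted_fds(pid = Process.pid)
  dir = "/proc/#{pid}/fd"
  return [] unless File.directory?(dir)

  Dir.children(dir).each_with_object([]) do |fd, found|
    target = begin
      File.readlink(File.join(dir, fd))
    rescue SystemCallError
      nil # the descriptor was closed while we were scanning
    end
    # The kernel appends " (deleted)" to the link target of unlinked files.
    found << [fd.to_i, target] if target && target.end_with?(" (deleted)")
  end
end

deleted_fds.each { |fd, target| puts "#{fd} -> #{target}" }
```

Calling `deleted_fds.length` periodically in a long-running worker should show whether the number keeps growing.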

Does someone have any clue?

I'm trying to make a simple example that reproduces this issue

mikisvaz commented 10 years ago

I closed it accidentally. Sorry.

What I meant to say is that:

Lockfile.refresh = false

seems to greatly alleviate the problem, though a few tmp lockfiles (.hostname*.lck) still pop up.

I suspect that when the owner removes the lock while another process is reading it, the file may end up unclosed. That would explain why disabling refresh avoids the problem, though not completely, since tmp lockfiles are still being read. Just a thought.
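To illustrate the general mechanism behind that suspicion (this is not lockfile's own code), here is a small sketch assuming Linux procfs. A single process plays both roles, and the path is made up for the demo:

```ruby
# Sketch of the suspected mechanism, assuming Linux procfs.
# One process stands in for both the lock owner and the reader;
# the path below is made up for illustration.
path = "/tmp/fd_leak_demo_#{Process.pid}.lock"
File.write(path, "owner")

reader = File.open(path)   # stands in for a competing process reading the lock
File.unlink(path)          # stands in for the owner deleting the lock

link = "/proc/self/fd/#{reader.fileno}"
leaked = File.symlink?(link) && File.readlink(link).end_with?("(deleted)")
puts "still open after unlink: #{leaked}"

reader.close               # the descriptor is only freed by an explicit close
released = !File.symlink?(link)
puts "released after close: #{released}"
```

Deleting a file does not close handles that are already open on it, so any code path that skips the close leaves a "(deleted)" entry behind. Reading through the block form, `File.open(path) { |f| f.read }`, closes the descriptor when the block exits even if the file was removed in the meantime.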

mikisvaz commented 10 years ago

BTW, this could very well be a problem on my end; my infrastructure is quite complex.

mikisvaz commented 10 years ago

UPDATE:

This code reproduces the error; after a little while you should see the number of deleted files grow

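require 'lockfile'

cpus = 2
file = "/tmp/test.lock"

# Fork one worker per CPU; each acquires nested locks in a tight loop.
pids = []
cpus.times do
  pid = Process.fork do
    loop do
      Lockfile.new file do
        Lockfile.new file + '.1' do
        end
      end

      # Count this worker's descriptors that point at deleted files.
      txt = `ls -la /proc/#{Process.pid}/fd | grep deleted`
      puts([Process.pid, txt.split("\n").length] * ": ")
    end
  end
  pids << pid
end

# The workers loop forever; interrupt with Ctrl-C to stop them.
Process.waitpid
pids.each{|p| Process.kill :INT, p}

mikisvaz commented 10 years ago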

UPDATE:

It seems that this removes the problem entirely:

Lockfile.dont_use_lock_id = true

dukebd711 commented 9 years ago

Just an FYI: I hit the "too many files open" error after upgrading to Ruby 2.2.2. I was previously on 1.8.7 and didn't get these errors. Setting dont_use_lock_id to true solved it for me. Thanks.

mikisvaz commented 9 years ago

Yeah, that solves it. However, you lose some features, so it's not ideal.