PagerDuty / blender

A modular orchestration engine
https://github.com/PagerDuty/blender
Other
183 stars 8 forks source link

using flock in tmp dir may yields unexpected results #20

Closed ohadlevy closed 9 years ago

ohadlevy commented 10 years ago

Hi,

while trying it out, I just create a simple /tmp/exampe.rb and executed: USER=root blend -f /tmp/example.rb

and got the following trace:

3 job(s) computed using 'Default' strategy
Run failed (0.000161113 s)

blender/lib/blender/lock/flock.rb:48:in `release': undefined method `flock' for nil:NilClass (NoMethodError)
from blender/lib/blender/lock/flock.rb:60:in `ensure in with_lock'
from blender/lib/blender/lock/flock.rb:60:in `with_lock'
from blender/lib/blender/scheduler/dsl.rb:150:in `lock'
from blender/lib/blender/scheduler.rb:58:in `run'
from blender/lib/blender.rb:44:in `blend'
from blender/lib/blender/cli.rb:54:in `from_file'

moving the file elsewhere solved the issue.

ohadlevy commented 10 years ago

I looked into a simple patch, but it seems like the lock infrastructure is also reporting infrastructure ? e.g.

@lock_fd.write({job: @job_name, pid: Process.pid }.inspect)

it should be pretty simple to fix the issue (pure locking), maybe status reporting should be extracted to another sub system?

ranjib commented 10 years ago

@ohadlevy blender uses /tmp/ and the job file name as lock file, to avoid concurrent run of the same file in same server. I am wondering if we should switch to using /var/run instead. Blender provides a DSL method named lock_options to control the locking driver, if you set it to nil, blender will not use flock, that should also fix your problem

lock_options(nil)
theckman commented 10 years ago

/var/lock may be an even better location as it is part of the filesystem hierarchy standard.

From man hier:

/var/lock
    Lock  files are placed in this directory.  The naming convention for device lock files is
    LCK..<device> where <device> is the device's name in the file system.  The format used is
    that of HDU UUCP lock files, that is, lock files contain a PID as a 10-byte ASCII decimal
    number, followed by a newline character.
ranjib commented 10 years ago

yup. also im thinking whether we should create sha1 checksum from the absolute path as default lock file name, and switch the default behavior to not use it, that way users can opt in. if we choose to disable it by default, we can by keep the current lock file path semantics, and assume users will override this for non-default scenarios.

ranjib commented 9 years ago

locks are disabled by default now, user have to opt in specifically for this, in which case, they'll be controlling the path