renlok / WeBid

The official WeBid github fork
http://www.webidsupport.com
114 stars 124 forks source link

Cron lock? #498

Open robberteggermont opened 6 years ago

robberteggermont commented 6 years ago

Multiple mails were sent out after an item sold. I suspect the cron code got executed more than once at the same time. This should be controlled by a locking mechanism (mutex; I could find one in the current code?).

renlok commented 6 years ago

could this be done with database flag in the settings or something

robberteggermont commented 6 years ago

On 01-12-17 21:22, Chris Dickenson wrote:

could this be done with database flag in the settings or something

I'm not a programmer, but I was thinking to put a timestamp in the database. Update the timestamp if the current timestamp was more than 60 seconds ago (UPDATE settings SET cron_timestamp=:now WHERE cron_timestamp<=:one_minute_ago), and if successful (1 row updated) run cron. If not successful, another instance has already run cron in the past minute, don't run it again.

This way, the check itself is non-blocking, there can be only one instance of cron (assuming cron finishes within a minute), cron is rate-limited to once per minute, and there is no risk that a failure to release the lock will permanently prevent cron from running.

Just my 2c. ;-)

-- Robbert Eggermont Intelligent Systems Support & Data Steward R.Eggermont@tudelft.nl Electr.Eng., Mathematics & Comp.Science +31 15 27 83234 Delft University of Technology

MESWEB commented 6 years ago

My script is working fine and I have one e-mail notification.

robberteggermont commented 6 years ago

My script is working fine and I have one e-mail notification.

Your comment is not very helpful...

We had a serious problem with this: 3-4 duplicate rows in the winners table for dozens of auctions (using non-batch cron, when many auctions closed at the same time). Clearly multiple cron instances were processing the same auctions at the same time. This is a problem, it should not be possible. When you provide the option to use non-batch cron, it should work safely.

renlok commented 6 years ago

9da4811c6f8b9f45ea1faf391077eaf1fdbe149e should work to stop multiple instances from running

robberteggermont commented 6 years ago

I'm not sure that will work.

In the time between loading $system->SETTINGS and actually checking $system->SETTINGS['cronRunning'] another instance could already have started cron too.

For this to work you will need to access the database directly, and make sure to use locking or atomic operations so no two instances can modify the value at the same time.

-- Robbert Eggermont Intelligent Systems Support & Data Steward R.Eggermont@tudelft.nl Electr.Eng., Mathematics & Comp.Science +31 15 27 83234 Delft University of Technology

renlok commented 6 years ago

OK I have replaced the database lock with a file lock that seems to be the most recommended way to deal with the problem

robberteggermont commented 6 years ago

I briefly looked at 1a73eb9 and the locking concept seems solid. (No rate-limiting, but that was just a nice extra.;-))

I'm wondering about the file path though, shouldn't that be something like MAINPATH . '/cache/cron.lock'? (I guess it should be under MAINPATH, and a more unique and descriptive name may come in handy if you every need another lock in the future?)

renlok commented 6 years ago

Yeah your right

ratfish commented 3 years ago

Hi

I also had duplicate emails being sent when auctions closed concurrently and multiple people were online (1.2.2).

I solved it prior to finding this conversation.

I looked at the php flock but found it lacking It has problems on nfs and vfat file systems It has a race condition between setting and checking the lock It has no way to detect or mitigate dead locks or looping scripts it has issues with file path, ownership and permission (especially when both batch and non batch tasks are active)

The solution that is working for me utilizes a purposed database table which has a unique constraint on a field for lock name so the script may attain a lock that is atomic relative to the script and save an associated process identifier that is used to catch dead locks. If a script can insert a record with the locks name, it has obtained the lock and upon completion can delete the record to release the lock. If the insert fails the script can skip the cron tasks. when an actual crond task runs to check for dead locks, if the record exists and the process is not running it can remove the lock, if the process is running it can note the process identifier and address a looping script on its next processing cycle

To implement required database table with unique constraint clone and modify direct query to return status rather then log error in its catch clause code to insert and delete the record in the script and a script that detects deadlocks, this script can use ps on all *nix or /proc on linux

example available if anyone is interested