hennevogel opened 7 years ago
Then we first need to check if there is already a job pending for this queue, and skip the creation of another job in that case.
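A minimal sketch of that check, using plain Ruby structs to stand in for rows in DelayedJob's `delayed_jobs` table (the `queue`/`handler`/`locked_at`/`failed_at` columns are the real schema; everything else here is illustrative):

```ruby
# Stand-in for a row in the delayed_jobs table.
Job = Struct.new(:queue, :handler, :locked_at, :failed_at)

# A job counts as "pending" when it is neither locked by a worker nor failed.
def pending?(jobs, queue, handler)
  jobs.any? do |job|
    job.queue == queue && job.handler.include?(handler) &&
      job.locked_at.nil? && job.failed_at.nil?
  end
end

jobs = [Job.new('mailers', '--- !ruby/object:SendEventEmails', nil, nil)]
pending?(jobs, 'mailers', 'SendEventEmails') # => true, so skip enqueueing another
```

Against the real table this would be a `Delayed::Job.where(...)` query with the same conditions instead of an in-memory scan.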
Look at #3342, we don't really use queues. And this would mean a solution to the ever growing queue. But it also would mean a completely different behavior for `SendEventEmails`. Right now there is something like a constant job for it: as soon as one `SendEventEmails` finishes, the next one runs (because there is always one in the queue).
I know we were not using it and now we are going to start using it... my point was more about whether we want to have just one pending job in the queue, not a ton of them, per job kind/queue. In this case it maybe makes sense to only have one job pending.
Or we can do something like this in the `SendEventEmails` job, so that a subsequent job will not try to send emails for an event that is currently being processed by the previous job:
```ruby
def perform
  # Load the batch before flipping the state; otherwise the lazy relation
  # would be re-evaluated with `state: :unprocessed` and find nothing.
  events = Event::Base.where(state: :unprocessed).order(:created_at).limit(1000).to_a
  Event::Base.where(id: events.map(&:id)).update_all(state: :processing)
  events.each do |event|
    # Send emails, create notifications, create log entries etc.
    event.update_attributes(state: :processed)
  end
end
```
That would be the best solution I guess. Would also make it possible to run more than one DJ worker for the mailers queue.
Sounds also like a strategy for all the other jobs.
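The claim-then-process pattern above can be sketched with plain Ruby objects (no ActiveRecord; `Event` here is an illustrative struct, not the real model) to show why a second worker picking up a batch never sees events another worker already claimed:

```ruby
Event = Struct.new(:id, :state)

# Mimics: where(state: :unprocessed).limit(limit) followed by
# update_all(state: :processing) in the job above.
def claim_batch(events, limit)
  batch = events.select { |e| e.state == :unprocessed }.first(limit)
  batch.each { |e| e.state = :processing }
  batch
end

events = (1..5).map { |i| Event.new(i, :unprocessed) }

first  = claim_batch(events, 3)  # claims events 1, 2, 3
second = claim_batch(events, 3)  # a "second worker" only sees 4 and 5

first.each { |e| e.state = :processed }
```

In a real database the select and the state flip need to happen close together (which is what the single `update_all` gives us); this in-memory version only illustrates the visibility effect, not the atomicity.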
If we are going to be kind of "locking" records for each job, we also need to take care of failures, at least to not leave them in the 'processing' state forever. I see two possible cases:
Can it be that we use a timeout for that 'processing' state? That way we will retry those in 'processing' state after a while.
Also we could store the id of the worker/job in a field, to know who is responsible for processing that event... I don't know if this could help in any way.
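The timeout idea could look something like this sketch (plain Ruby; `processing_started_at` is a hypothetical column that would have to be set when the batch is claimed, and the 15-minute value is just an assumption to be tuned):

```ruby
EventRecord = Struct.new(:id, :state, :processing_started_at)

STALE_AFTER = 15 * 60 # seconds; assumed timeout

# Put events that have been stuck in :processing for too long back to
# :unprocessed, so the next job picks them up again.
def reclaim_stale(events, now = Time.now)
  events.each do |event|
    next unless event.state == :processing
    next unless now - event.processing_started_at > STALE_AFTER
    event.state = :unprocessed
  end
end

events = [
  EventRecord.new(1, :processing, Time.now - 3600), # stuck for an hour -> reclaimed
  EventRecord.new(2, :processing, Time.now - 60)    # just claimed -> left alone
]
reclaim_stale(events)
```

In the real app this would be a scope plus `update_all`, run periodically (e.g. from clock.rb), rather than an in-memory loop.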
> Can it be that we use a timeout for that 'processing' state? That way we will retry those in 'processing' state after a while.
DelayedJob is built to handle timeouts for jobs which take too long, retries etc. I think we should use their functionality rather than re-creating it in the Events.
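Concretely, delayed_job already exposes this through worker settings (`max_attempts` and `max_run_time` are real `Delayed::Worker` options; the values below are just examples):

```ruby
# config/initializers/delayed_job.rb
Delayed::Worker.max_attempts = 3          # retries before a job is marked as failed
Delayed::Worker.max_run_time = 15.minutes # jobs running longer than this count as stuck
```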
We could assign event ids to each job for processing, i.e. in send_event_emails.rb:
```ruby
def perform(event_ids)
  events = Event::Base.find(event_ids)
  events.each do |event|
    # Send emails, create notifications, create log entries etc.
    event.update_attributes(state: :processed)
  end
end
```
And then in clock.rb:
```ruby
every(30.seconds, 'send notifications') do
  event_ids = Event::Base.where(state: :unprocessed).order(:created_at).limit(1000).pluck(:id)
  # Claim exactly the plucked ids, so the UPDATE cannot drift from the SELECT.
  Event::Base.where(id: event_ids).update_all(state: :processing)
  SendEventEmails.new.delay(queue: 'mailers').perform(event_ids)
end
```
That way it's DelayedJob's responsibility to take care of stuck or failed jobs and stuff like that.
Mmmm, but then we will need a way of recovering those failed jobs. I mean, a job is retried maybe 3 times (depending on the config), but after that those ids are never retried because the job is marked as failed.
I'd rather have jobs that just process events, the pending ones... no matter what ids they have. That way we will have to take care about failed events, not about failed jobs... jobs will never fail.
We will end up having single events that, maybe because of a wrong email address, are impossible to process for a subscriber. But at least not a batch of 1000 events that remains unprocessed, because the first event in that batch is impossible to process, until an admin goes there to see what happened in that job and fixes it.
In terms of fixing things, if we have single events failing it is easier to fix what is causing the problem, instead of having to deal with all the possible problems in 1000 events before anything gets processed.
> I'd rather have jobs that just process events, the pending ones... no matter what ids they have. That way we will have to take care about failed events, not about failed jobs... jobs will never fail.
LGTM
There is also https://github.com/collectiveidea/delayed_job#hooks, but failing jobs should not be about batches, I agree.
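For completeness, if the batch approach were kept, the `failure` hook from that link could release a claimed batch instead of losing it with the job. A sketch, where `EventRow`/`EventStore` are stand-ins for `Event::Base` so it runs without Rails, but the hook method name itself is delayed_job's:

```ruby
EventRow = Struct.new(:id, :state)

# Minimal stand-in for Event::Base.where(id: ids).update_all(state: ...).
class EventStore
  @rows = []
  class << self
    attr_accessor :rows

    def update_state(ids, state)
      rows.each { |row| row.state = state if ids.include?(row.id) }
    end
  end
end

class SendEventEmails
  def initialize(event_ids)
    @event_ids = event_ids
  end

  # delayed_job calls this hook once a job has exhausted all its retries,
  # so the batch goes back to :unprocessed instead of being stuck forever.
  def failure(job)
    EventStore.update_state(@event_ids, :unprocessed)
  end
end

EventStore.rows = [EventRow.new(1, :processing), EventRow.new(2, :processing)]
SendEventEmails.new([1, 2]).failure(nil)
```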
The clockwork setup assumes that the time between scheduling a delayed job and the execution of the job is smaller than the interval. For instance, `every(30.seconds, 'send notifications')` doesn't mean that mails are sent every 30 seconds. It means that a new `SendEventEmails` job is added to the queue every 30 seconds. Which means that if either the queue has other jobs that take more than 30 seconds, or the execution of `SendEventEmails` takes longer than 30 seconds, you will have 2 `SendEventEmails` jobs. This is naturally often the case, which means that for instance on build.o.o we have an ever growing queue of `SendEventEmails`.