Closed KushalP closed 5 years ago
@lpil would appreciate your eyes on this, we need to get it perfect first time since we are requiring clients to run a migration (AKA this is a breaking change).
@sadir Your eyes would be useful too ;)
@samphilipd is the preference for a pull request to be:
How is the user expected to deploy this change and run the migration without causing the new code to crash repeatedly due to the missing column?
I am not sure, honestly. I would expect any migrations to be run before the worker application begins again.
I would have thought we don't want the application to query using the column until after the database has the new schema either through manual configuration or some cleverness at boot.
Would we not expect a user to run any remaining migrations before continuing?
Many (possibly the majority) of our users will want to perform zero downtime deployments, so they will not be turning off the application before deployment and migrations. Depending on exactly how the user is doing deployment there may be old versions running post migration, new versions running pre migration, or both. They should all run without crashing.
@samphilipd , do we have a preferred approach here? I might suggest we try post-migration query first, parse any error, and if the error is about the missing column we emit a warning and then use the pre-migration query. We would also need to do this for job enqueuing.
Another option is just to store the priority in rihanna_internal_meta
, then we don't need to do a migration at all. I imagine this could be detrimental to performance though.
Now that I read how the dispatcher is working it seemed we're hard coded to explode if there's a migration change as check_upgrade_not_required
throws an exception in init/1
. Is this what we want? If so we are forcing users to go through a more elaborate multi-deploy migration process and we should produce some documentation for them so that they understand how to do this.
@lpil I've reviewed this again and not sure if there are any downtime concerns.
The standard deployment practice is:
Old code running Run migration New code running
I don't know of anybody who deploys like:
Old code running New code running Run migration
Can you outline the scenario where deploying the migration as included would cause downtime?
I don't know of anybody who deploys like:
It is common. For example, it is the recommended (and easiest to implement) approach on Heroku and some other PaaS. I have also worked at multiple companies where deployments are run this way.
If we have explicit requirements for deployment of this version I think we should perform a major version bump and include documentation on how to safely perform the upgrade.
So our two options:
As for (2) I believe we could use the rihanna_internal_meta
field and add a key called "priority". Then the index would look like:
CREATE INDEX #{table_name}_priority_enqueued_at_id ON #{table_name} ((rihanna_internal_meta->'priority')::int DESC, enqueued_at ASC, id ASC);
It would be well worth testing the query with EXPLAIN ANALYZE to make sure it uses the index properly.
If we use the JSON approach, it will not cause downtime no matter how the application is deployed (I originally added it for this specific purpose).
✅ Old code on new schema ✅ New code on old schema
The only caveat is that there is likely to be a performance hit since we are using a JSONB lookup right in the critical hot code path. Given that, my vote is for using a column and making a breaking change.
Do we think this is acceptable @sadir @lpil ?
I'm in favour of the breaking change. :)
Yeah breaking change for me too please - I'd rather not have something critical to the performance of the application in a JSONB column.
@KushalP would you like to make these changes? Otherwise I can.
@KushalP what's the situation with this? Do you still need this feature?
If not I will close the PR.
I'm looking forward to this feature. If @KushalP doesn't want to finish it, I can try to carry it to the finish line.
Anyone? Bueller?
Closing in favour of #65
This PR implements the feature defined in #56. It adds a
priority
method and serialised field which can be overridden by the user to define how high (or low) a job priority is. This allows higher priority jobs to be scheduled ahead of lower priority jobs.