oban-bg / oban

💎 Robust job processing in Elixir, backed by modern PostgreSQL and SQLite3
https://oban.pro
Apache License 2.0

Oban.insert_all does not propagate all Job attributes #103

Closed: bgentry closed this issue 4 years ago

bgentry commented 4 years ago

Precheck

Environment

Current Behavior

I have a worker defined like this:

defmodule MyApp.Workers.UpdateThing do
  use Oban.Worker, queue: "my_queue", max_attempts: 3
  require Logger

  @impl Oban.Worker
  def perform(args, job) do
    # ...
  end
end

Note the max_attempts: 3, which AFAICT follows the instructions in the docs.

In the only place in our code where I actually enqueue this job, I run:

some_list
|> Enum.map(fn data ->
  # args_map_from_data is just a plain map, i.e. there is only one arg to this function call:
  MyApp.Workers.UpdateThing.new(args_map_from_data)
end)
|> Oban.insert_all()

Note that I am not explicitly setting max_attempts while enqueueing this job.

Expected Behavior

Despite my only usage of max_attempts declaring that it should be 3 for this job, we have a bunch of instances of this job that have failed 15+ times and have max_attempts: 20 set in the database (this query was run a bit after I deleted all instances with more than 6 attempts):

myapp::DATABASE=> SELECT worker, state, count(*) FROM oban_jobs WHERE attempt > 5 AND max_attempts > 5 AND worker = 'MyApp.Workers.UpdateThing' GROUP BY worker, state ORDER BY count DESC;
          worker           |   state   | count 
---------------------------+-----------+-------
 MyApp.Workers.UpdateThing | retryable |    49

myapp::DATABASE=> SELECT worker, state, attempt, max_attempts FROM oban_jobs WHERE attempt > 5 LIMIT 10;
           worker           |   state   | attempt | max_attempts 
----------------------------+-----------+---------+--------------
 MyApp.Workers.UpdateThing  | retryable |      10 |           20

I'm not sure how this could have happened 🤔 Even when testing job creation in iex, I can see that it appears to be setting max_attempts correctly:

MyApp.Workers.UpdateThing.new(%{foo: "bar"})
#Ecto.Changeset<
  action: nil,
  changes: %{
    args: %{foo: "bar"},
    max_attempts: 3,
    queue: "metrc",
    worker: "MyApp.Workers.UpdateThing"
  },
  errors: [],
  data: #Oban.Job<>,
  valid?: true
>

And yet the max_attempts value sure seems to be getting reset. My best guess is that it is somehow being thrown away when the job is rescued after raising an exception (rather than cleanly returning {:error, err}).

bgentry commented 4 years ago

Ah, I think I've discovered the cause of this! 💡 It's Oban.insert_all. Before calling Repo.insert_all, it first calls Job.to_map/1, which only picks certain fields from the Job structs it is inserting.
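
To illustrate the failure mode (a hypothetical sketch, not Oban's actual source; the picked field list below is invented): if to_map/1 builds the row map from a narrow field list, anything outside that list is simply absent from the insert and falls back to its column default, which would explain the max_attempts of 20 seen in the query above.

# Hypothetical sketch of the failure mode; the field list is made up for illustration.
picked = [:args, :queue, :worker]

changeset = MyApp.Workers.UpdateThing.new(%{foo: "bar"})

row =
  changeset
  |> Ecto.Changeset.apply_changes()
  |> Map.take(picked)

# row no longer contains :max_attempts, so Repo.insert_all falls back to the
# column default even though the changeset set it to 3.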

I can work on a PR though I'm not entirely certain which fields should/shouldn't be included in Job.to_map/1 or why it shouldn't be all of them.

sorentwo commented 4 years ago

I'm a bit surprised this hasn't been reported before. Then again, I don't know how often people use insert_all compared to plain insert.
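
For reference, a minimal contrast of the two insertion paths (a usage sketch with placeholder args, assuming the public insert/1 and insert_all/1 entry points):

# Single insert: the whole changeset goes through Ecto, so max_attempts survives.
%{foo: "bar"} |> MyApp.Workers.UpdateThing.new() |> Oban.insert()

# Bulk insert: each changeset is first reduced via Job.to_map/1 before
# Repo.insert_all, which is where fields like max_attempts currently get dropped.
[%{foo: "bar"}, %{foo: "baz"}]
|> Enum.map(&MyApp.Workers.UpdateThing.new/1)
|> Oban.insert_all()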

> Ah, I think I've discovered the cause of this! 💡 It's Oban.insert_all. Before calling Repo.insert_all, it first calls Job.to_map/1, which only picks certain fields from the Job structs it is inserting.

You're absolutely right. Well spotted.

> I can work on a PR though I'm not entirely certain which fields should/shouldn't be included in Job.to_map/1 or why it shouldn't be all of them.

The only field that can't be inserted is unique, because it is virtual. Otherwise, all of the fields should be acceptable. In fact, the @permitted module attribute already lists exactly which fields are allowed.
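
Something along these lines might be all the PR needs, assuming to_map/1 receives the job changeset and @permitted holds the castable field names (a sketch only; the stand-in @permitted list here is abbreviated, not the real one):

defmodule SketchJob do
  # Hypothetical stand-in for Oban.Job's @permitted list, abbreviated for illustration.
  @permitted ~w(args max_attempts queue scheduled_at worker unique)a

  # Build the insert map from every permitted field except the virtual :unique
  # option, so values such as :max_attempts survive insert_all.
  def to_map(%Ecto.Changeset{} = changeset) do
    keys = @permitted -- [:unique]

    changeset
    |> Ecto.Changeset.apply_changes()
    |> Map.take(keys)
  end
end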