SpartnerNL / Laravel-Excel

🚀 Supercharged Excel exports and imports in Laravel
https://laravel-excel.com
MIT License

[Bug]: QueueExport (and similar) causing Integrity constraint violation on failure #3379

Closed · adamgoucher closed this issue 2 years ago

adamgoucher commented 3 years ago

Is the bug applicable and reproducible on the latest version of the package, and hasn't it been reported before?

What version of Laravel Excel are you using?

3.1.33

What version of Laravel are you using?

8.61.0

What version of PHP are you using?

7.4

Describe your issue

We have converted a bunch of exports to be queued (they are large and were previously running synchronously in Console/Kernel.php, blocking the scheduler) and noticed that a number of export-related jobs are failing in new and interesting ways.

Our hypothesis is that the multiple workers responding to queued jobs are the culprit (we use an S3-backed remote_disk to coordinate between them). When one worker fails a job, it creates a failed_jobs entry; when another worker then fails the same job, it too tries to insert a failed_jobs entry, and we end up with errors like this in the logs:

[2021-09-24 19:35:42] production.ERROR: SQLSTATE[23000]: Integrity constraint violation: 1062 Duplicate entry '0568f8ca-fa86-4bfc-9c98-6c99bc32b707' for key 'failed_jobs.failed_jobs_uuid_unique' (SQL: insert into `failed_jobs` (`uuid`, `connection`, `queue`, `payload`, `exception`, `failed_at`) values (0568f8ca-fa86-4bfc-9c98-6c99bc32b707, redis, reporting, {

followed, unfortunately, by every model in the failed report chunk. Our logs have jumped from 300 MB a day to 3 GB as a result.
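
One plausible mechanism for the duplicate uuid, offered here as an assumption rather than something confirmed in the thread: if the queue connection's retry_after is shorter than the runtime of the slowest export chunk, Laravel treats the first worker as dead and releases the job to a second worker, and both attempts can then record a failure for the same uuid. The relevant keys in config/queue.php (values are illustrative):

    'connections' => [
        'redis' => [
            'driver'      => 'redis',
            'connection'  => 'default',
            'queue'       => env('REDIS_QUEUE', 'default'),
            // Must exceed the longest-running chunk, or a second worker
            // will pick the job up while the first is still working on it.
            'retry_after' => 90,
            'block_for'   => null,
        ],
    ],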

How can the issue be reproduced?

Here is how we are calling a particularly cranky export:

        $schedule->call(function() {
            // a whole bunch of setup stuff

            // Queues the export as a series of chunked jobs on the
            // 'reporting' queue; the notification job is chained so it
            // only runs once the final chunk has completed.
            (new EmployeeRequirementStatusExport($org, $data))
                ->queue($path, $disk)
                ->allOnQueue('reporting')
                ->chain([
                    new SendReportGenericNotificationJob(['some@email.com'], $name, $disk, $path, $notificationSubject)
                ]);
        })
        ->name('a_unique_name')
        ->daily()
        ->onOneServer()
        ->environments(['production']);

As mentioned above, both remote_disk and remote_prefix are set and point at S3.

There are two workers processing this queue.
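
For reference, both settings live under temporary_files in config/excel.php (Laravel Excel 3.1); the values below are illustrative:

    'temporary_files' => [
        'local_path' => storage_path('framework/cache/laravel-excel'),

        // With queued exports processed by multiple workers/servers, each
        // chunk syncs its output through this disk so every worker appends
        // to the same temporary file.
        'remote_disk'   => 's3',
        'remote_prefix' => 'laravel-excel-tmp/', // hypothetical prefix
    ],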

What should be the expected behaviour?

A failed export should not cause an integrity constraint violation, and when an export does fail, it should not render the failed_jobs table unusable due to its size.

patrickbrouwers commented 3 years ago

We don't really do anything with failed jobs directly; that's Laravel territory. The size of the failed job, however, is related to whatever you pass through the export's constructor: Laravel serializes those arguments, so your $org and $data are probably quite large. When something does fail, Laravel puts that entire payload into your failed_jobs table.
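
A minimal sketch of keeping that payload small by passing a scalar id instead of loaded models (PHP 7.4 syntax to match the reporter's setup; the Employee model and org_id column are hypothetical stand-ins):

    use App\Models\Employee; // hypothetical model
    use Maatwebsite\Excel\Concerns\Exportable;
    use Maatwebsite\Excel\Concerns\FromQuery;

    class EmployeeRequirementStatusExport implements FromQuery
    {
        use Exportable;

        /** @var int */
        private $orgId;

        // Only the organisation id is serialized into the queued job, so
        // a failed_jobs row stays a few hundred bytes rather than the
        // full $org and $data graphs.
        public function __construct(int $orgId)
        {
            $this->orgId = $orgId;
        }

        public function query()
        {
            // Rows are re-queried inside each chunk job instead of
            // travelling through the job payload.
            return Employee::query()->where('org_id', $this->orgId);
        }
    }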

adamgoucher commented 3 years ago

I think the root of this ticket is that I was eager loading within query(), so the data array passed to AppendDataToSheet and friends was enormous.

What I'm doing now, depending on context, is one of the following (a sketch of the second approach is shown after the list):

  1. heavily limit the fields returned from the eager load (I very rarely need the whole model);
  2. when the data set to be eager loaded is very large (as was the case here), have query() select only the id field, then build the full object and eager load its relations inside the map() function.
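
A minimal sketch of that second approach, assuming a FromQuery export with WithMapping; the Employee model, its requirements relation, and the mapped columns are hypothetical:

    use App\Models\Employee; // hypothetical model
    use Maatwebsite\Excel\Concerns\Exportable;
    use Maatwebsite\Excel\Concerns\FromQuery;
    use Maatwebsite\Excel\Concerns\WithMapping;

    class EmployeeRequirementStatusExport implements FromQuery, WithMapping
    {
        use Exportable;

        public function query()
        {
            // Select only the id: each queued chunk stays small, and a
            // failure cannot dump megabytes of models into failed_jobs.
            return Employee::query()->select('id');
        }

        public function map($employee): array
        {
            // Rebuild the full model, relations included, one row at a
            // time, so the heavy data never travels through the queue.
            $full = Employee::with('requirements')->findOrFail($employee->id);

            return [
                $full->name,
                $full->requirements->count(),
            ];
        }
    }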

I'll look at creating a doc PR for this.

Why it was trying to fail the same job multiple times (hence the integrity constraint violation), I am not sure. But once I changed eager-loading strategies, it went away.

patrickbrouwers commented 3 years ago

The eager loading sounds related to what I mentioned about payload size.

stale[bot] commented 2 years ago

This bug report has been automatically closed because it has not had recent activity. If this is still an active bug, please comment to reopen. Thank you for your contributions.