freescout-helpdesk / freescout

FreeScout — Free self-hosted help desk & shared mailbox (Zendesk / Help Scout alternative)
https://freescout.net
GNU Affero General Public License v3.0
2.67k stars 458 forks source link

Issue with fetch emails mutex and system page #3970

Closed borgarlie closed 1 month ago

borgarlie commented 1 month ago

Hi,

In the current version, there is a bug that can cause jobs to be killed if you go into the systems page at the wrong time (while fetch emails is running).

The reason seems to be that the mutex name that is cached is set differently depending on whether the system is running the kernel or it is run on behalf of a user (through going to the systems page).

If you log the name when setting it, it becomes like this for us:

When it is run by the scheduler, or through running a artisan console command:

Saved mutex name to cache. Mutex name: framework/schedule-277c03153c5eb992673e5ee6f2baaca791d1ff28

When someone enters the system page:

Saved mutex name to cache. Mutex name: framework/schedule-1bad10e7c5ffe65a8da1f01adfa89cce2d3410d7

The second one here does not actually exist, and thus the job is killed.

This is the code that sets the mutex name in cache:

\Cache::put('fetch_mutex_name', $fetch_command->mutexName(), self::FETCH_MAX_EXECUTION_TIME);

I have no idea why it would generate different name depending on whether the kernel is run by user (through visiting the system page) vs. when it is run through the scheduler / an artisan command in the cli.

Help appreciated!

Versions: PHP: 8.2.16 Freescout version: 1.8.135 Database: PostgreSQL Are you using CloudFlare: No

freescout-helpdesk commented 1 month ago

The System page simply checks if there are more than one fetch-emails command running and kills them if there are: https://github.com/freescout-helpdesk/freescout/blob/dist/app/Http/Controllers/SystemController.php#L125

It does not check the mutex.

borgarlie commented 1 month ago

On line 204, it calls:

$migrations_output = \Helper::runCommand('migrate:status');

Which runs the Kernel. Hence this happens.

borgarlie commented 1 month ago

The problem is that it sets a new name in the cache, which becomes different when it is run through the systems page for some reason. As explained in the original message here.

borgarlie commented 1 month ago

If you can think of a solution to it, feel free to send us a request to review it. We are happy to test it in our instance.

The whole Commands, Scheduler and Kernel in Laravel / artisan is a bit cryptic to us, since we are new to it. I assume it will be a lot quicker for someone that knows the internals to get to the bottom of this kind of issue.

freescout-helpdesk commented 1 month ago

Try to change return true to return false:

https://github.com/freescout-helpdesk/freescout/blob/dist/app/Console/Kernel.php#L257

borgarlie commented 1 month ago

Hi,

Going to test that. It does indeed look like that function is wrong currently. I would suggest removing the ! instead of returning false though.

However, to resolve the issue in this thread, I think we also need to do this:

if ($this->isScheduleRun()) {
    \Cache::put('fetch_mutex_name', $fetch_command->mutexName(), self::FETCH_MAX_EXECUTION_TIME);
}

since the cache will otherwise be populated during the kernel call from UI, and in the next run from schedule, it will have the wrong value, killing the fetch job.

I don't know if this can have other side-effects. E.g. is it possible for the actual fetch command to be run during the call to kernel from UI? Or does the scheduler already make sure that doesn't happen?

We will anyways go ahead and test this.

freescout-helpdesk commented 1 month ago

We've made some adjustments in the master branch.

borgarlie commented 1 month ago

Looks plausible. We will do some more testing tomorrow.

borgarlie commented 1 month ago

Haven't found any issues with this change, and it resolves the original issue in this thread.