openfoodfacts / openfoodfacts-infrastructure

Where we collaboratively plan and maintain the infrastructure of Open Food Facts
3 stars 6 forks source link

Matomo archive fails #287

Open alexgarel opened 10 months ago

alexgarel commented 10 months ago

We have a lot of message that core:archive is failing.

alexgarel commented 10 months ago

Yesterday I tried to better configure MySQL: https://github.com/openfoodfacts/openfoodfacts-infrastructure/commit/1a5ba6b65e6b963ae0b787a517408f0e4854c757

And also fixed logs in service: https://github.com/openfoodfacts/openfoodfacts-infrastructure/commit/c31fe14457c7a525e790123b66102b740c2ef89d

I found that there is a max_execution_time in php.ini that might lead to the task ending abruptely. I am trying to push it to 7200 (2h) and also temporarily changed the systemd timeout to 2h for this task. I will see tomorrow if it makes things better.

alexgarel commented 10 months ago

Still failing, and I have this message:

ERROR [2024-01-04 04:55:03] 21054  Got invalid response from API request: ?module=API&method=CoreAdminHome.archiveReports&idSite=5&period=day&date=2024-01-03&format=json&segment=countryName%3D%3DGermany&trigger=archivephp. Response was '{"result":"error","message":"SQLSTATE[HY000]: General error: 2006 MySQL server has gone away"}

I took a closer look at the code to see where this error occurs.

It seems that archive console command has two way to process archival: either by launching shell process to do the job, either by calling the server API (for each archival request).

My error seems to be triggered only in the case of calling the server API.

I verified that we should use process using php in interactive mode:

cd /var/www/html matomo/
php -a

php > include "console";
...
php > use Piwik\CliMulti\Process;
php > echo Piwik\CliMulti\Process::isSupported();
1
php > use Piwik\CliMulti;
php > $p = new Piwik\CliMulti();
php > echo $p->supportsAsync();
1

But the codes that launch requests reads like this:

   private function executeUrlCommand($cmdId, $url, $numUrls)
    {
        if ($this->supportsAsync) {
            if ($numUrls === 1) {
                $output = new StaticOutput($cmdId);
                $this->executeSyncCli($url, $output);
            } else {
                $output = new Output($cmdId);
                $this->executeAsyncCli($url, $output, $cmdId);
            }
        } else {
            $output = new StaticOutput($cmdId);
            $this->executeNotAsyncHttp($url, $output);
        }

        $this->outputs[] = $output;
    }

executeSyncCli is using API call. So it might be that we are in the case of only one url to process and matomo decided to call the API.

But then max_execution_time is far lower for php fpm.

To try to see if it fix the issue, I increment max_execution_time in /etc/php/7.3/fpm/php.ini (from 120s to 300s).

stephanegigandet commented 9 months ago

I managed to get exports of Matomo logs for the OFF web site for all days except 4 days that are still failing:

Downloading data for 2023-07-06
Command:
wget -t 1 -O matomo_app.log.scan.2023-07-06 'https://analytics.openfoodfacts.org/?module=API&method=Live.getLastVisitsDetails&idSite=2&period=day&date=2023-07-06&format=JSON&token_auth=[token]&filter_limit=-1'
--2024-01-16 10:36:45--  https://analytics.openfoodfacts.org/?module=API&method=Live.getLastVisitsDetails&idSite=2&period=day&date=2023-07-06&format=JSON&token_auth=[token]&filter_limit=-1
Resolving analytics.openfoodfacts.org (analytics.openfoodfacts.org)... 193.70.55.124
Connecting to analytics.openfoodfacts.org (analytics.openfoodfacts.org)|193.70.55.124|:443... connected.
HTTP request sent, awaiting response... 500 Internal Server Error
2024-01-16 10:41:56 ERROR 500: Internal Server Error.

Downloading data for 2023-07-07
Command:
wget -t 1 -O matomo_app.log.scan.2023-07-07 'https://analytics.openfoodfacts.org/?module=API&method=Live.getLastVisitsDetails&idSite=2&period=day&date=2023-07-07&format=JSON&token_auth=[token]&filter_limit=-1'
--2024-01-16 10:41:56--  https://analytics.openfoodfacts.org/?module=API&method=Live.getLastVisitsDetails&idSite=2&period=day&date=2023-07-07&format=JSON&token_auth=[token]&filter_limit=-1
Resolving analytics.openfoodfacts.org (analytics.openfoodfacts.org)... 193.70.55.124
Connecting to analytics.openfoodfacts.org (analytics.openfoodfacts.org)|193.70.55.124|:443... connected.
HTTP request sent, awaiting response... 500 Internal Server Error
2024-01-16 10:48:57 ERROR 500: Internal Server Error.

Downloading data for 2023-07-08
Command:
wget -t 1 -O matomo_app.log.scan.2023-07-08 'https://analytics.openfoodfacts.org/?module=API&method=Live.getLastVisitsDetails&idSite=2&period=day&date=2023-07-08&format=JSON&token_auth=[token]&filter_limit=-1'
--2024-01-16 10:48:57--  https://analytics.openfoodfacts.org/?module=API&method=Live.getLastVisitsDetails&idSite=2&period=day&date=2023-07-08&format=JSON&token_auth=[token]&filter_limit=-1
Resolving analytics.openfoodfacts.org (analytics.openfoodfacts.org)... 193.70.55.124
Connecting to analytics.openfoodfacts.org (analytics.openfoodfacts.org)|193.70.55.124|:443... connected.
HTTP request sent, awaiting response... 500 Internal Server Error
2024-01-16 10:52:58 ERROR 500: Internal Server Error.

Downloading data for 2023-09-06
Command:
wget -t 1 -O matomo_app.log.scan.2023-09-06 'https://analytics.openfoodfacts.org/?module=API&method=Live.getLastVisitsDetails&idSite=2&period=day&date=2023-09-06&format=JSON&token_auth=[token]&filter_limit=-1'
--2024-01-16 10:52:58--  https://analytics.openfoodfacts.org/?module=API&method=Live.getLastVisitsDetails&idSite=2&period=day&date=2023-09-06&format=JSON&token_auth=[token]&filter_limit=-1
Resolving analytics.openfoodfacts.org (analytics.openfoodfacts.org)... 193.70.55.124
Connecting to analytics.openfoodfacts.org (analytics.openfoodfacts.org)|193.70.55.124|:443... connected.
HTTP request sent, awaiting response... 500 Internal Server Error
2024-01-16 10:53:31 ERROR 500: Internal Server Error.