Pierre-Lannoy / wp-decalog

Capture and log events, metrics and traces on your site. Make WordPress observable - finally!
https://perfops.one/
GNU General Public License v3.0
64 stars 8 forks source link

Error [0] Metric not found: wordpress_php_php:error_nonfatal #39

Closed JanThiel closed 1 year ago

JanThiel commented 1 year ago

Describe the bug We do tests with the PHP8 upgrade at the moment. After we upgraded the installation to 8.1 our logs were flooded with the "Metric not found" Error above for each and every WP-CLI run. I hope and believe it is related. But the PHP upgrade is the only recent change which could have triggered this. We were able to reproduce this with Decalog 3.6.0 as well as 3.6.2 against Datadog. We have DD Events and DD Metrics configured and setup against Datadog. When we pause / disable DD Metrics the error disappears.

Here is the stacktrace which oddly ends with decalog_run :-/

decalog_run,
include_once(`/plugins/wp-decalog/decalog.php`),
require(`wp-settings.php`),
WP_CLI\Runner->load_wordpress,
WP_CLI\Runner->start,
WP_CLI\Bootstrap\LaunchRunner->process,
WP_CLI\bootstrap,
include(`phar:///usr/local/wp/wp/vendor/wp-cli/wp-cli/php/wp-cli.php`),
include(`phar:///usr/local/wp/wp/php/boot-phar.php`)

To Reproduce Steps to reproduce the behavior:

  1. Enable and configure DD Metrics
  2. Switch a running site from PHP7.4 to PHP8.1
  3. Run any command on WP-CLI

Expected behavior No error at all :-) Or an idea / hint how to do something against this.

Screenshots

Environment (please complete the following information):

Pierre-Lannoy commented 1 year ago

Hello @JanThiel ! Thank you for this report (perfect, as always). Let me find what's going on…

Pierre-Lannoy commented 1 year ago

Hello @JanThiel ! I'm currently working on this fu***ng issue and wondering if you're in development or production profile (I think it's development profile, but I need to be sure)?

Pierre-Lannoy commented 1 year ago

I've made a fix what I assume is the error. I will release the new version in some minutes ;)

JanThiel commented 1 year ago

Although it might be too late, the environment is setup with these:

define( 'WP_ENVIRONMENT_TYPE', 'staging' );
define( 'WP_DEBUG', true );
define( 'WP_DEBUG_DISPLAY', false );
define( 'WP_DEBUG_LOG', false );
define( 'SAVEQUERIES', true );

I believe that triggers DEV mode :-)

Pierre-Lannoy commented 1 year ago

Yes :) Normally, the 3.6.3 release solves this problem. Can you provide me with feedback once you have tested it? Nothing urgent, eh!

JanThiel commented 1 year ago

Just tested and it looks fixed to me :-) Thank you very much Pierre!

JanThiel commented 1 year ago

Hey @Pierre-Lannoy I do have to reopen this issue. 3.6.3 and more sites on PHP 8.1 are popping this error again.

Pierre-Lannoy commented 1 year ago

Hello @JanThiel ! Sorry to hear that :/ That's very strange, I haven't released new version since some months now. Did you have modified something to wp-cli (version, environment, etc.)?

JanThiel commented 1 year ago

Hmmm ... wait a sec ... It's not PHP8 related obviously. The page that's causing the issues is still on PHP 7.4. In this particular case the issue is triggered by ObjectCachePro (Redis Object Cache implementation)

image

JanThiel commented 1 year ago

I believe the source of the issue is within a try/catch block when an exception is thrown.

Pierre-Lannoy commented 1 year ago

Wait, too much speed for me ;) So, your website is executed with PHP 7.4. When an error is thrown in ObjectCachePro, you have a huge flood of "Metric not found" in your log. Am I right? Is it only when you use wp-cli with PHP 8.1?

JanThiel commented 1 year ago

So, your website is executed with PHP 7.4

Yes

When an error is thrown in ObjectCachePro, you have a huge flood of "Metric not found" in your log. Am I right?

Yes

Is it only when you use wp-cli with PHP 8.1?

This was no CLI context at all. In particular this was a call against the REST API

Pierre-Lannoy commented 1 year ago

Hello @JanThiel !

Can you give some details about your DecaLog configuration and how this error is produced (i.e. why and when the OCPro error is thrown) ? I almost sure it is not the same as the previous one…

JanThiel commented 1 year ago

@Pierre-Lannoy Sure as always!

This is the Decalog Config:

define('DECALOG_DEFAULT_LOGGERS', array(
  '113b90f4-3e4f-4909-b01e-11c4ed84db11' => array (
    'name' => 'Datadog Events',
    'handler' => 'DatadogHandler',
    'running' => true,
    'level' => '250',
    'privacy' => array (
      'obfuscation' => false,
      'pseudonymization' => false,
    ),
    'processors' => array (
      0 => 'WordpressProcessor',
      1 => 'WWWProcessor',
      2 => 'IntrospectionProcessor',
      3 => 'BacktraceProcessor',
    ),
    'configuration' => array (
      'host' => 'https://http-intake.logs.datadoghq.eu/v1/input',
      'token' => '<TOKEN>',
      'buffer' => true,
    ),
  )
));

And this is the code triggering the log entries:

::command
        ....
        try {
            $result = $node->command($name, $parameters);

            $this->lastCommand = $node->lastCommand;

            return $result;
        } catch (Throwable $th) {
            try {
                $this->connectToSentinels();
            } catch (ConnectionException $ex) {
                throw new ConnectionException($ex->getMessage(), $ex->getCode(), $th);
            }
        }
        ....
::connectToSentinels
           ....
            try {
                $this->sentinel = $url;
                $this->establishConnections($url);
                $this->setPool();

                return;
            } catch (Throwable $error) {
                $this->sentinels[$url] = false;

                if ($this->config->debug) {
                    error_log('objectcache.notice: ' . $error->getMessage());
                }
            }
            ...
Pierre-Lannoy commented 1 year ago

Thanks @JanThiel 😊 I've found where it comes from. It's a "snake biting its own tail" problem: this exception (thrown by OCPro) is thrown at a time when metrics have not been already defined in DecaLog (it is very early in the loading sequence, which makes sense regarding what OCPro is doing). And DecaLog try to increment a PHP error counter which is inexistent… The only way to get around this is to have an option to "bypass" this behavior. It will be implemented in 3.7.0 but note it will not be activated by default. So, you will have to UNCHECK "Warn about non existent metrics" in options to activate this new behavior.

JanThiel commented 1 year ago

Sounds reasonable :-) Great job undigging that ugly timing issue...

Pierre-Lannoy commented 1 year ago

Version 3.7.0 released. Could you confirm it fixes this issue?

JanThiel commented 1 year ago

Hi @Pierre-Lannoy,

took some time. Sorry for that. We sadly can still reproduce this issue with 3.7.1. Caused by the exact same Code as before.

Pierre-Lannoy commented 1 year ago

Hi @JanThiel Wow! I'm just asking as a matter of fact but I'm sure the answer is yes: is the corresponding option unchecked ("Warn about non-existent metrics")?

JanThiel commented 1 year ago

🤔 No, to be honest I did not change any settings - wasn't aware of that :-/. So that is most probable the reason why it wasn't "fixed" :-)

I changed that setting now and will report back within the next weeks which is hopefully free of this error then :)

JanThiel commented 1 year ago

Looks god now. Sorry for the confusion. I close this issue.

Best Regards!

Jan