magento / community-features

Magento Features Development is an Initiative to Allows Community Memebers Join to Development of Magento Features
46 stars 18 forks source link

Improve Query Logging for Magento #137

Open Jakhotiya opened 5 years ago

Jakhotiya commented 5 years ago

Abstract

At the moment, when you turn on query logging, you can only filter queries using the time parameter. We need to be able to log queries for

  1. specific request URL
  2. if the query contains specific keyword.
  3. Query belongs to a particular area.

Other wise we end up with a log size of several GB's Often on live environments, we encounter intermittent issues. We may not know how to replicate them. OR developer working with extension that deals with third party API, like M2epro. And it's painful to figure out how a certain thing is happening.
For the past year, I've been using Magento 2 query log along with stack traces to figure out "How" something happens or how to replicate an issue. For example If I know that a bug is causing changes in a table A, I need to know all queries and the stack traces causing change in table A. So I keep the query log on for a day, in order to catch the intermittent issue. I then analyse the log. But this is a painful experience because I might be looking at 3-4 GB of log. Timing filters are not very useful if you don't know "what the query is " OR "how much time does it take on average" Having advanced filters would make the debugging experience better.

We can implement three kinds of filters

  1. Area code (cron,frontend,backend,webapi)
  2. By request URI
  3. Keyword

Random thoughts on implementation.

For the time being I go into the core file vendor/magento/framework/DB/Logger/File.php I make changes to this file directly on the server when I want advanced filtering. I add

  1. RequestInterface
  2. Magento\Framework\App\State as dependencies

to the constructor.

Next I make changes to log function and logStats function

 /**
     * {@inheritdoc}
     */
    public function log($str)
    {
        $str = '## ' . date('Y-m-d H:i:s') . "\r\n" . $str;
       //Also add the request log the query log
        $request = '//'!==$this->request->getFullActionName('/') ? $this->request->getFullActionName('/') : 'unknown';
        $str = '## REQUEST: '.$request."\r\n". $str;

        $stream = $this->dir->openFile($this->debugFile, 'a');
        $stream->lock();
        $stream->write($str);
        $stream->unlock();
        $stream->close();
    }

    /**
     * {@inheritdoc}
     */
    public function logStats($type, $sql, $bind = [], $result = null)
    {
   $this->state = \Magento\Framework\App\ObjectManager::getInstance()->get(\Magento\Framework\App\State::class);
        $this->keyword = false;
        $this->areaCode = 'adminhtml';

        //if keyword is specified and does not exist in query, don't log it 
        // may be we could use something better that strpos in next iteration
        if($this->keyword && false===strpos($sql,$this->keyword)){
            return;
        }
        $currentAreaCode = 'global';
        try{
            $currentAreaCode = $this->state->getAreaCode();
        }catch(\Exception $ex){

        }
        /*
        * log everything only when area code is not specified or when current area code matches the 
         specified area code
        */
        if($this->areaCode!=='global' && $currentAreaCode!==$this->areaCode){
            return;
        }

        $stats = $this->getStats($type, $sql, $bind, $result);
        if ($stats) {
            $this->log($stats);
        }
    }

This is very raw implementation. I extracted this whole code into class "\Magento\Framework\DB\Logger\Advanced".

I would want all of those parameters to be configurable using "app/etc/env.php"

May be we could modify the parameters to look like

[
'db_logger' => [
    'output' => 'advanced',
   'keyword'=>'<my keyword>',
    'area_code'=>'cron'
    'log_everything' => 1,
    'query_time_threshold' => '0.001',
    'include_stacktrace' => 1
  ]
]

If this sounds useful to other developers too , can anybody guide me on how I can wire it existing Logging mechanism.

buskamuza commented 5 years ago

Thanks for the idea.

  1. Agree that config for the logger should be in a config file rather than in di.xml. env.php seems like a good place
  2. keyword should be ok to inject
  3. I would not inject request and state in the logger directly as it will make them dependent on global state (e.g., request will not work in CLI mode). Instead I'd consider having a factory that creates the logger for the current process based on area and request (either a Null logger or Real logger will be created). Then CLI and action controllers (and Web APIs) would create different loggers. This part needs more details to understand where this factory should be in the application. This can also be a proxy, which can be installed as an extension or a customization.
speedupmate commented 5 years ago
  1. Agree that config for the logger should be in a config file rather than in di.xml. env.php seems like a good place

3. I would not inject request and state in the logger directly as it will make them dependent on global state (e.g., request will not work in CLI mode).

This should be made possible cause its the most important information to understand the context of a issue under investigation.

Even if its coupling it can be optional or usable:

  1. I also suggest to log host specific data , since on cloud hosts come and go and multiple are deployed at once , adds more context to understand what host was affected, used for logging etc
joni-jones commented 5 years ago

This should be made possible cause its the most important information to understand the context of a issue under investigation.

I also don't like an idea to include the request details to DB logging. But as a possible approach to achieve the context - correlation ID might be used to combine all logging details (not only DB) in the same context.