matomo-org / matomo-log-analytics

Import any kind of server logs in Matomo for powerful log analytics. Universal log file parsing and reporting.
https://matomo.org/log-analytics/
GNU General Public License v3.0
224 stars 118 forks source link

Logfile import seems to fall into an infinite loop #202

Open pennyarcade opened 6 years ago

pennyarcade commented 6 years ago

Command line:

python /opt/piwig_log/matomo-log-analytics/import_logs.py --url=https://monitoring.xxx.de --login= xxx--password=xxx --recorders=4 --enable-http-errors --enable-http-redirects --enable-static --idsite=1 --enable-bots /var/log/apache2/*_access.log

On Ubuntu server, latest updates, cloned master branch of this project. There are no errors. I tried running the script over night but at some point no further log lines are processed and only the "records/sec" value goes down slowly. At first log lines seem to be processed normally. Plese tell me what further information you need to debug the problem. The goal is running the script automatically with logrotate wich is not possible with this problem.

143158 lines parsed, 140268 lines recorded, 2 records/sec (avg), 0 records/sec (current)
143158 lines parsed, 140268 lines recorded, 2 records/sec (avg), 0 records/sec (current)
143158 lines parsed, 140268 lines recorded, 2 records/sec (avg), 0 records/sec (current)
143158 lines parsed, 140268 lines recorded, 2 records/sec (avg), 0 records/sec (current)
143158 lines parsed, 140268 lines recorded, 2 records/sec (avg), 0 records/sec (current)
143158 lines parsed, 140268 lines recorded, 2 records/sec (avg), 0 records/sec (current)
143158 lines parsed, 140268 lines recorded, 2 records/sec (avg), 0 records/sec (current)
143158 lines parsed, 140268 lines recorded, 2 records/sec (avg), 0 records/sec (current)
143158 lines parsed, 140268 lines recorded, 2 records/sec (avg), 0 records/sec (current)
143158 lines parsed, 140268 lines recorded, 2 records/sec (avg), 0 records/sec (current)
143158 lines parsed, 140268 lines recorded, 2 records/sec (avg), 0 records/sec (current)
143158 lines parsed, 140268 lines recorded, 2 records/sec (avg), 0 records/sec (current)
143158 lines parsed, 140268 lines recorded, 2 records/sec (avg), 0 records/sec (current)
143158 lines parsed, 140268 lines recorded, 2 records/sec (avg), 0 records/sec (current)
143158 lines parsed, 140268 lines recorded, 2 records/sec (avg), 0 records/sec (current)
143158 lines parsed, 140268 lines recorded, 2 records/sec (avg), 0 records/sec (current)
143158 lines parsed, 140268 lines recorded, 2 records/sec (avg), 0 records/sec (current)
143158 lines parsed, 140268 lines recorded, 2 records/sec (avg), 0 records/sec (current)
143158 lines parsed, 140268 lines recorded, 2 records/sec (avg), 0 records/sec (current)
143158 lines parsed, 140268 lines recorded, 2 records/sec (avg), 0 records/sec (current)
143158 lines parsed, 140268 lines recorded, 2 records/sec (avg), 0 records/sec (current)
143158 lines parsed, 140268 lines recorded, 2 records/sec (avg), 0 records/sec (current)
143158 lines parsed, 140268 lines recorded, 2 records/sec (avg), 0 records/sec (current)
143158 lines parsed, 140268 lines recorded, 2 records/sec (avg), 0 records/sec (current)
143158 lines parsed, 140268 lines recorded, 2 records/sec (avg), 0 records/sec (current)
143158 lines parsed, 140268 lines recorded, 2 records/sec (avg), 0 records/sec (current)
143158 lines parsed, 140268 lines recorded, 2 records/sec (avg), 0 records/sec (current)
143158 lines parsed, 140268 lines recorded, 2 records/sec (avg), 0 records/sec (current)
143158 lines parsed, 140268 lines recorded, 2 records/sec (avg), 0 records/sec (current)
143158 lines parsed, 140268 lines recorded, 2 records/sec (avg), 0 records/sec (current)
143158 lines parsed, 140268 lines recorded, 2 records/sec (avg), 0 records/sec (current)
143158 lines parsed, 140268 lines recorded, 2 records/sec (avg), 0 records/sec (current)
143158 lines parsed, 140268 lines recorded, 2 records/sec (avg), 0 records/sec (current)

This is the output after running it for more than 12 hours straight.

pennyarcade commented 6 years ago

Bump... Any ideas on this? Any more information I can add to get some help on this?

kaplun commented 6 years ago

I captured on production this situation: at a given time Matomo was unreachable. import_logs.py somehow noticed the situation, however it didn't either recovered nor terminated. It kept running (accepting events on stdin) without being able to resume a valid session with Matomo.

2018-06-01 21:21:39,814: [INFO] Error when connecting to Matomo: HTTP Error 504: Gateway Time-out
2018-06-01 21:21:39,816: [INFO] Retrying request, attempt number 2
2018-06-01 21:21:40,267: [INFO] Error when connecting to Matomo: HTTP Error 504: Gateway Time-out
2018-06-01 21:21:40,269: [INFO] Retrying request, attempt number 2
Exception in thread Thread-4:
Traceback (most recent call last):
  File "/usr/lib64/python2.7/threading.py", line 812, in __bootstrap_inner
    self.run()
  File "/usr/lib64/python2.7/threading.py", line 765, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/var/www/html/piwik/misc/log-analytics/import_logs.py", line 1858, in _run_bulk
    self._record_hits(hits)
  File "/var/www/html/piwik/misc/log-analytics/import_logs.py", line 2016, in _record_hits
    on_failure=self._on_tracking_failure
  File "/var/www/html/piwik/misc/log-analytics/import_logs.py", line 1595, in call
    data=data)
  File "/var/www/html/piwik/misc/log-analytics/import_logs.py", line 1549, in _call_wrapper
    response = func(*args, **kwargs)
  File "/var/www/html/piwik/misc/log-analytics/import_logs.py", line 1485, in _call
    response = opener.open(request, timeout = timeout)
  File "/usr/lib64/python2.7/urllib2.py", line 431, in open
    response = self._open(req, data)
  File "/usr/lib64/python2.7/urllib2.py", line 449, in _open
    '_open', req)
  File "/usr/lib64/python2.7/urllib2.py", line 409, in _call_chain
    result = func(*args)
  File "/usr/lib64/python2.7/urllib2.py", line 1258, in https_open
    context=self._context, check_hostname=self._check_hostname)
  File "/usr/lib64/python2.7/urllib2.py", line 1217, in do_open
    r = h.getresponse(buffering=True)
  File "/usr/lib64/python2.7/httplib.py", line 1113, in getresponse
    response.begin()
  File "/usr/lib64/python2.7/httplib.py", line 444, in begin
    version, status, reason = self._read_status()
  File "/usr/lib64/python2.7/httplib.py", line 400, in _read_status
    line = self.fp.readline(_MAXLINE + 1)
  File "/usr/lib64/python2.7/socket.py", line 476, in readline
    data = self._sock.recv(self._rbufsize)
  File "/usr/lib64/python2.7/ssl.py", line 759, in recv
    return self.read(buflen)
  File "/usr/lib64/python2.7/ssl.py", line 653, in read
    v = self._sslobj.read(len or 1024)
SSLError: ('The read operation timed out',)

74713537 lines parsed, 70014081 lines recorded, 317 records/sec (avg), 0 records/sec (current)
74713537 lines parsed, 70014081 lines recorded, 316 records/sec (avg), 0 records/sec (current)
74713537 lines parsed, 70014081 lines recorded, 316 records/sec (avg), 0 records/sec (current)

(note line-numbers might not corresponds to current master because I have patched version of the script).

tkodjo commented 5 years ago

I am experiencing the same issue with Matomo 3.9.1 I wrote a Powershell script to loop through my daily log files but it is stalling when it calls import_logs.py My daily log files are up to 60MB The script import_logs.py seems to stall or fall into an infinite loop as mentionned by pennyarcade. Only the records/sec seems to go down while other progress indicators do not change matomo-stalls

viba1 commented 2 years ago

Same behavior using matomo 4.9.0

viba1 commented 2 years ago

Same issue on matomo docker: Matomo: 4.9.1 MySQL: 8.0.29 PHP: 8.0.19

5619 lines parsed, 4800 lines recorded, 2 records/sec (avg), 0 records/sec (current) 2022-05-13 08:53:57,140: [INFO] Error when connecting to Matomo: The read operation timed out 2022-05-13 08:53:57,142: [INFO] Retrying request, attempt number 2 ... 5619 lines parsed, 4800 lines recorded, 1 records/sec (avg), 0 records/sec (current) 2022-05-13 09:04:17,404: [INFO] Error when connecting to Matomo: The read operation timed out 2022-05-13 09:04:17,405: [INFO] Max number of attempts reached, server is unreachable! Fatal error: The read operation timed out You can restart the import of "/var/log/matomo-logs/access.log" from the point it failed by specifying --skip=4811 on the command line.

Calling with --skip=4811leads to the same behavior: log/matomo-logs/access.log 0 lines parsed, 0 lines recorded, 0 records/sec (avg), 0 records/sec (current) Parsing log /var/log/matomo-logs/access.log...

4611 lines parsed, 3800 lines recorded, 3 records/sec (avg), 0 records/sec (current) 2022-05-13 09:22:26,448: [INFO] Error when connecting to Matomo: The read operation timed out 2022-05-13 09:22:26,449: [INFO] Max number of attempts reached, server is unreachable! Fatal error: The read operation timed out You can restart the import of "/var/log/matomo-logs/access.log" from the point it failed by specifying --skip=3809 on the command line.

fabianboerner commented 2 months ago

Same here, whats the solution? Skipping is not an option because it should be automated.