matomo-org / matomo-for-wordpress

Get a fully functioning Matomo Analytics for your WordPress. Star us on Github? +1. Matomo is the leading open alternative to Google Analytics that gives you full control over your data. Privacy is built-in. 100% data ownership, no one else can see your data. We love Pull Requests!
https://matomo.org
GNU General Public License v3.0
119 stars 25 forks source link

Visits Log returned "The date '1970-01-01' is a date before first website was online" #215

Closed oxcid closed 4 years ago

oxcid commented 4 years ago

Hi,

Starting this evening, when I tried to open Visits Log, or Ecommerce Log, and selecting date February 15-16, I get this error:

Screenshot 2020-02-16 03 44 24

This is the URL:

https://example.com/wp-content/plugins/matomo/app/index.php?module=CoreHome&action=index&idSite=1&period=day&date=yesterday#?idSite=1&period=day&date=2020-02-15&segment=&category=General_Visitors&subcategory=Live_VisitorLog

Also I checked at MySQL log, this is the query executed:

SELECT log_visit.* FROM wp_matomo_log_visit AS log_visit WHERE log_visit.idsite in ('1') AND log_visit.visit_last_action_time >= '2020-02-14 17:00:00' AND log_visit.visit_last_action_time <= '2020-02-15 20:34:43' ORDER BY log_visit.idsite DESC, log_visit.visit_last_action_time DESC LIMIT 0, 102

So I see the dates value are correct (timezone GMT+7), but I got the error above. Other dates from February 14 backward work fine. Other pages also work fine, including Real-time, it keeps tracking visits.

No PHP error at the logs.

Thanks.

tsteur commented 4 years ago

Thanks @oxcid

I can't reproduce this here and we have also not heard of a similar issue from other On-Premise users.

Could you maybe post the content of your Matomo => System report here?

Can you also in general still reproduce this issue?

oxcid commented 4 years ago

Hi @tsteur , I think I'm on to something here.. My initial report, the problem occurs for February 15-16, but today, the problem occurs for February 15, and 17. February 16 is displayed perfectly.

Also I noticed the problem for date 15 and 17 is gone when I set the "Rows to display" at the bottom to 25, only to reappear when I try to go to the next page (or several next pages). Is it possible that there's a corrupted or invalid row? How to make sure of this?

And may I know where do you trigger the error message above? I want to understand which event caused that, and maybe to pinpoint which field uses the timestamp value stated in the error message.

EDIT: I've found out this is called from Date::factory. Still, I don't know where it got the $datestring value "2020". For now I assume some rows have an invalid date/time value, which only consists of "2020".

System report:

` # Matomo

Endpoints

Crons

Mandatory checks

Optional checks

Matomo Settings

Logs

WordPress

WordPress Plugins

MU Plugins

Plugins

Server

Database

Browser

oxcid commented 4 years ago

Hi @tsteur , I found the invalid rows, it's in table "wp_matomo_log_conversion" and "wp_matomo_log_conversion_item", field server_time.

Screenshot 2020-02-17 17 09 39

And here's a comparison between the server_time and the visit first & last action time.

Screenshot 2020-02-17 17 23 24

During those times, there are no PHP errors, and no server downtime etc. I checked the users from 2 of those rows as examples, there are no similarities between the users. 1 is using desktop, Win10, Firefox, while the other is using mobile, iOS, Safari.

When I changed the server_time in table wp_matomo_log_conversion to a time between visit first and last action time, the problem is gone. Nevertheless I also changed the server_time in table wp_matomo_log_conversion_item.

Until now, I don't know what caused the wrong server time at the first place. On 4 separate visits, at different dates, all server_time has the exact same value. I hope this is just a one-time glitch, but is there a way you can add a validation for the server_time field? Or at least throws an error directly when it happens? That way we know what's causing this.

Thanks.

PS: Out of topic, would you mind to take a look at this? Thanks. https://forum.matomo.org/t/how-to-ignore-visitorreferrer-from-payment-gateways/36002

tsteur commented 4 years ago

Cheers for debugging this @oxcid I have created a PR in https://github.com/matomo-org/matomo/pull/15587 that will hopefully fix this.

oxcid commented 4 years ago

Thanks @tsteur , really appreciate it! Just to update, this morning I got another invalid server_time value, exactly like the above. For now I'll fix these manually, no worries. This has never happened when I was still using the beta version (0.4 if I remember correctly), so anything causing this must be an update after that.

Thanks.

tsteur commented 4 years ago

Cheers for letting us know 👍 I've had a look again and I have kind of an idea what it is related to. Created https://github.com/matomo-org/wp-matomo/pull/217 and we're also generally looking for some additional fix in https://github.com/matomo-org/matomo/pull/15587 on top of the already done fix.

tsteur commented 4 years ago

Closing this for now @oxcid I'm hoping it'll be fixed from the next release. Feel free to comment otherwise and I'll be happy to have another look

oxcid commented 4 years ago

Hi @tsteur , I've updated to Matomo ver 1.0.3 probably 2 days ago, but last night this error comes up again, still exactly the same as before. Can you please have another look at this?

Thanks.

tsteur commented 4 years ago

@oxcid I've been looking through the code and debugging for quite some time but have no idea how this could potentially happen. Can you double check it happened again in the log_conversion issue and it was indeed a visit that was tracked after the update?

oxcid commented 4 years ago

Hi @tsteur , I can confirm it's indeed the log_conversion issue again, with the exact same value "1970-01-01 00:33:40", and tracked after the plugin update. I've traced the visit log based on the action_time and HEX(idvisitor), here's the log:

Screenshot 2020-02-28 13 41 42

EDIT: There is time difference between the log in database and the report, since the database is using GMT, while the report is using GMT+7 (my local time). Not sure if this info can make a difference or not.

oxcid commented 4 years ago

Hi @tsteur , just to update, there are no more invalid rows right now, 4 days after my last reply. So I'd guess it's just a cache issue back then. Thanks again, I'll let you know in case this issue is back.

tsteur commented 4 years ago

Awesome, thanks. Feel free to comment any time if it happens again.

oxcid commented 4 years ago

Hi @tsteur , I've thought this issue as solved, so I didn't check for a few days. But sadly, it does still occurs.. Please find the SQL queries below.

I'm thinking.. could a cache mechanism cause this? I'm not sure how though..?

Thanks.

Screenshot 2020-03-11 16 12 40
oxcid commented 4 years ago

Hi @tsteur , I've created some SQL triggers, before INSERT and UPDATE, for those tables (wp_matomo_log_conversion, wp_matomo_log_conversion_item). First purpose is to log which event causes the invalid rows, and second purpose is to fix those invalid rows, so I won't have to do it manually everyday.

Here are the triggers (DATE_SUB is to calculate my local time zone difference with UTC):

CREATE TABLE dp_matomo_err_log (idvisit bigint(10) unsigned,event varchar(6),tbl varchar(4),now datetime, matomo datetime);

DELIMITER $$

CREATE TRIGGER dpMatomoConvUpdTrigger
BEFORE UPDATE
ON wp_matomo_log_conversion
  FOR EACH ROW BEGIN
    IF (NEW.server_time LIKE '1970-01-01%') THEN
      INSERT INTO dp_matomo_err_log VALUES (NEW.idvisit,'UPDATE','CONV',DATE_SUB(now(),INTERVAL 7 HOUR),NEW.server_time);
      SET NEW.server_time = DATE_SUB(now(),INTERVAL 7 HOUR);
    END IF;
  END$$

CREATE TRIGGER dpMatomoConvItemUpdTrigger
BEFORE UPDATE
ON wp_matomo_log_conversion_item
  FOR EACH ROW BEGIN
    IF (NEW.server_time LIKE '1970-01-01%') THEN
      INSERT INTO dp_matomo_err_log VALUES (NEW.idvisit,'UPDATE','ITEM',DATE_SUB(now(),INTERVAL 7 HOUR),NEW.server_time);
      SET NEW.server_time = DATE_SUB(now(),INTERVAL 7 HOUR);
    END IF;
  END$$

DELIMITER ;

DELIMITER $$

CREATE TRIGGER dpMatomoConvInsTrigger
BEFORE INSERT
ON wp_matomo_log_conversion
  FOR EACH ROW BEGIN
    IF (NEW.server_time LIKE '1970-01-01%') THEN
      INSERT INTO dp_matomo_err_log VALUES (NEW.idvisit,'INSERT','CONV',DATE_SUB(now(),INTERVAL 7 HOUR),NEW.server_time);
      SET NEW.server_time = DATE_SUB(now(),INTERVAL 7 HOUR);
    END IF;
  END$$

CREATE TRIGGER dpMatomoConvItemInsTrigger
BEFORE INSERT
ON wp_matomo_log_conversion_item
  FOR EACH ROW BEGIN
    IF (NEW.server_time LIKE '1970-01-01%') THEN
      INSERT INTO dp_matomo_err_log VALUES (NEW.idvisit,'INSERT','ITEM',DATE_SUB(now(),INTERVAL 7 HOUR),NEW.server_time);
      SET NEW.server_time = DATE_SUB(now(),INTERVAL 7 HOUR);
    END IF;
  END$$

DELIMITER ;

And after a few days, here are the log tables:

wp_matomo_log_conversion (so far I see only UPDATE queries causing invalid rows here)

Screenshot 2020-03-18 16 45 54

wp_matomo_log_conversion_item

Screenshot 2020-03-18 16 46 24

Hope this can help you fix this issue.

Thanks.

tsteur commented 4 years ago

Thanks for this that's definitely helping. Sorry about replying to late.

I'll try another patch in the next release which is today or tmrw. It might not help though. If that doesn't work though I will try debug again. I'm thinking it might be related to some MySQL setting maybe or so but can't really imagine as it only happens in the log conversion tables but doesn't seem to happen in wp_matomo_log_visit.

tsteur commented 4 years ago

BTW can you again confirm this is only happening for some entries but not for all? Wonder if maybe all inserts work but all updates fail or so. Probably not too easy to find out.

tsteur commented 4 years ago

And one last thing:

What is the output of show create table wp_matomo_log_conversion_item?

oxcid commented 4 years ago

BTW can you again confirm this is only happening for some entries but not for all? Wonder if maybe all inserts work but all updates fail or so. Probably not too easy to find out.

@tsteur Confirmed, only some, and not that many actually. That's why I created the log table, just to find out which ones having invalid values. One thing you may also have noticed, currently we're in year 2020, and in timestamps term, '2020' = '1970-01-01 00:33:40'. Thought maybe there's an issue somewhere during the dates parsing, and only taken the year value?

For table wp_matomo_log_conversion, only updates gives the invalid values, while for table wp_matomo_log_conversion_item, updates and inserts gives invalid values.

Here's the output of show create table wp_matomo_log_conversion_item; :

CREATE TABLE `wp_matomo_log_conversion_item` (
  `idsite` int(10) unsigned NOT NULL,
  `idvisitor` binary(8) NOT NULL,
  `server_time` datetime NOT NULL,
  `idvisit` bigint(10) unsigned NOT NULL,
  `idorder` varchar(100) NOT NULL,
  `idaction_sku` int(10) unsigned NOT NULL,
  `idaction_name` int(10) unsigned NOT NULL,
  `idaction_category` int(10) unsigned NOT NULL,
  `idaction_category2` int(10) unsigned NOT NULL,
  `idaction_category3` int(10) unsigned NOT NULL,
  `idaction_category4` int(10) unsigned NOT NULL,
  `idaction_category5` int(10) unsigned NOT NULL,
  `price` float NOT NULL,
  `quantity` int(10) unsigned NOT NULL,
  `deleted` tinyint(1) unsigned NOT NULL,
  PRIMARY KEY (`idvisit`,`idorder`,`idaction_sku`),
  KEY `index_idsite_servertime` (`idsite`,`server_time`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8

Thanks.

tsteur commented 4 years ago

One thing you may also have noticed, currently we're in year 2020, and in timestamps term, '2020' = '1970-01-01 00:33:40'. Thought maybe there's an issue somewhere during the dates parsing, and only taken the year value?

I actually did not notice that. so it uses always 00:33:40? That's indeed 2020 seconds. So somehow time() = 2020.

Almost seems like some PHP bug but I'm doubting that is the case (if so, it likely be in date("Y-m-d H:i:s", $timestamp)). That this occurs randomly makes this really tricky. I've had a look through code again but cannot explain this yet.

tsteur commented 4 years ago

@oxcid looked again into things and likely the visit_last_action_time field in the goal manager code might randomly in some edge case end up being 2020-XX-XX XX:XX:XX. I have not yet figured out why but be great if you could confirm the patch in https://github.com/matomo-org/wp-matomo/pull/253/files works?

I have otherwise another idea but be good to see if that works. Looking through the code I have not yet seen though where this happens.

tsteur commented 4 years ago

@oxcid be great to let me know if you could give it a try.

oxcid commented 4 years ago

@tsteur Sorry for the late reply, been hectic lately on my side. I probably skipped a few updates, but I noticed since I updated to version 1.0.5 (since last Friday), my trigger events don't catch anything that needed to be fixed anymore. So any fix you've implemented around that version definitely works! Thanks so much!

tsteur commented 4 years ago

Awesome, great to hear and thanks for all your help @oxcid