Open feikede opened 4 years ago
Random thought here.
I haven't tried to reproduce this. Looking at the code we would use the most recent invalidated archive though because of https://github.com/matomo-org/matomo/blob/3.13.3/core/DataAccess/Model.php#L243 and https://github.com/matomo-org/matomo/blob/3.13.3/core/DataAccess/ArchiveSelector.php#L128-L136 where we should use the first idArchive which is the most recent. Quickly looked into it and would say it almost can't be that.
Could be some other race condition as I've never heard of it before. Like not sure if at some point maybe a done flag is written to early for example.
@feikede do you know if you have browser archiving enabled? You can see this in Admin -> General Settings. Also for how long is the empty result returned if you know? Like are we talking few seconds here or few minutes? Thanks for mentioning the issue and your help 👍
@tsteur Thanks for your analysis. As for your questions:
Is it normal to have about 2500 tables in the database?
Maybe the matomo-archive.log helps?
INFO [2020-02-28 12:05:01] 26904 ---------------------------
INFO [2020-02-28 12:05:01] 26904 INIT
INFO [2020-02-28 12:05:01] 26904 Running Matomo 3.13.3 as Super User
INFO [2020-02-28 12:05:01] 26904 ---------------------------
INFO [2020-02-28 12:05:01] 26904 NOTES
INFO [2020-02-28 12:05:01] 26904 - Async process archiving supported, using CliMulti.
INFO [2020-02-28 12:05:01] 26904 - Reports for today will be processed at most every 900 seconds. You can change this value in Matomo UI > Settings > General Settings.
INFO [2020-02-28 12:05:01] 26904 - Reports for the current week/month/year will be requested at most every 3600 seconds.
INFO [2020-02-28 12:05:01] 26904 - Archiving was last executed without error 59 min 54s ago
INFO [2020-02-28 12:05:01] 26904 - Will invalidate archived reports for 2020-02-28 for following websites ids: 1
INFO [2020-02-28 12:05:01] 26904 ---------------------------
INFO [2020-02-28 12:05:01] 26904 START
INFO [2020-02-28 12:05:01] 26904 Starting Matomo reports archiving...
INFO [2020-02-28 12:05:01] 26904 - tracking data found for website id 1 since 2020-02-28 11:05:07 UTC (since the last successful archiving)
INFO [2020-02-28 12:05:01] 26904 Will pre-process for website id = 1, period = day, date = last1
INFO [2020-02-28 12:05:01] 26904 - pre-processing all visits
INFO [2020-02-28 12:05:02] 26904 Archived website id = 1, period = day, 0 segments, 472 visits in last 2 days, 113 visits today, Time elapsed: 1.200s
INFO [2020-02-28 12:05:03] 26904 Will pre-process for website id = 1, period = week, date = last1
INFO [2020-02-28 12:05:03] 26904 - pre-processing all visits
INFO [2020-02-28 12:05:04] 26904 Archived website id = 1, period = week, 0 segments, 0 visits in last 2 weeks, 0 visits this week, Time elapsed: 1.172s
INFO [2020-02-28 12:05:04] 26904 Will pre-process for website id = 1, period = month, date = last1
INFO [2020-02-28 12:05:04] 26904 - pre-processing all visits
INFO [2020-02-28 12:05:05] 26904 Archived website id = 1, period = month, 0 segments, 0 visits in last 2 months, 0 visits this month, Time elapsed: 1.231s
INFO [2020-02-28 12:05:05] 26904 Will pre-process for website id = 1, period = year, date = last1
INFO [2020-02-28 12:05:05] 26904 - pre-processing all visits
INFO [2020-02-28 12:05:06] 26904 Archived website id = 1, period = year, 0 segments, 0 visits in last 2 years, 0 visits this year, Time elapsed: 1.354s
INFO [2020-02-28 12:05:06] 26904 Archived website id = 1, 4 API requests, Time elapsed: 4.972s [1/1 done]
INFO [2020-02-28 12:05:06] 26904 Done archiving!
INFO [2020-02-28 12:05:06] 26904 ---------------------------
INFO [2020-02-28 12:05:06] 26904 SUMMARY
INFO [2020-02-28 12:05:06] 26904 Total visits for today across archived websites: 113
INFO [2020-02-28 12:05:06] 26904 Archived today's reports for 1 websites
INFO [2020-02-28 12:05:06] 26904 Archived week/month/year for 1 websites
INFO [2020-02-28 12:05:06] 26904 Skipped 0 websites
INFO [2020-02-28 12:05:06] 26904 - 0 skipped because no new visit since the last script execution
INFO [2020-02-28 12:05:06] 26904 - 0 skipped because existing daily reports are less than 900 seconds old
INFO [2020-02-28 12:05:06] 26904 - 0 skipped because existing week/month/year periods reports are less than 3600 seconds old
INFO [2020-02-28 12:05:06] 26904 Total API requests: 4
INFO [2020-02-28 12:05:06] 26904 done: 1/1 100%, 113 vtoday, 1 wtoday, 1 wperiods, 4 req, 5060 ms, no error
INFO [2020-02-28 12:05:06] 26904 Time elapsed: 5.060s
INFO [2020-02-28 12:05:06] 26904 ---------------------------
INFO [2020-02-28 12:05:06] 26904 SCHEDULED TASKS
INFO [2020-02-28 12:05:06] 26904 Starting Scheduled tasks...
INFO [2020-02-28 12:05:06] 26904 Scheduler: executing task Piwik\Plugins\CustomPiwikJs\Tasks.updateTracker...
INFO [2020-02-28 12:05:06] 26904 Scheduler: finished. Time elapsed: 0.001s
INFO [2020-02-28 12:05:06] 26904 Scheduler: executing task Piwik\Plugins\PrivacyManager\Tasks.deleteLogData...
INFO [2020-02-28 12:05:06] 26904 Scheduler: finished. Time elapsed: 0.001s
INFO [2020-02-28 12:05:06] 26904 Scheduler: executing task Piwik\Plugins\PrivacyManager\Tasks.anonymizePastData...
INFO [2020-02-28 12:05:06] 26904 Scheduler: finished. Time elapsed: 0.000s
INFO [2020-02-28 12:05:06] 26904 done
INFO [2020-02-28 12:05:06] 26904 ---------------------------
I'tt try to track down the next case in detail.
2500 tables is not quite normal, is it maybe a rather old installation? Possible there was a bug a while ago.
Not available for few hours doesn't sound normal. Is in the same time the result not available in the UI either?
Ah, sorry, it's only 107 tables, I looked at the wrong number. And the result is available in the UI while it's not correct in the reporting API at the same time. I'll try to find another case today to track that down.
Thanks @feikede . Is there maybe any caching active in your server set up or so? I'm asking because the UI is also just using the API and I can't think of a reason why the UI should be showing data while the API doesn't. Not sure what your server setup looks like or if your webserver sends maybe some caching headers.
No, there's no caching, it's a plain matomo LAMP install on CentOS 7.7. I have another case today (exactly now):
/?method=Contents.getContentNames&label=anwalt:2695&date=2019-01-01,today&period=range&format=JSON&module=API&idSite=1&format=JSON&token_auth=93..3
returns empty array [], while
/?method=Contents.getContentNames&segment=contentName==anwalt:2695&label=anwalt:2695&date=2019-01-01,today&period=range&format=JSON&module=API&idSite=1&format=JSON&token_auth=93..3
returns
[{"label":"anwalt:2695","nb_visits":10,"nb_impressions":12,"nb_interactions":2,"sum_daily_nb_uniq_visitors":10,"interaction_rate":"16,67\u00a0%","segment":"contentName==anwalt%3A2695","idsubdatatable":1}]
and
/?method=Contents.getContentNames&segment=contentName==anwalt:2695&label=anwalt:2695&date=last30&period=day&format=JSON&module=API&idSite=1&format=JSON&token_auth=93..3
returns
{"2020-02-04":[],"2020-02-05":[],"2020-02-06":[],"2020-02-07":[],"2020-02-08":[],"2020-02-09":[],"2020-02-10":[],"2020-02-11":[],"2020-02-12":[],"2020-02-13":[],"2020-02-14":[],"2020-02-15":[],"2020-02-16":[],"2020-02-17":[],"2020-02-18":[],"2020-02-19":[],"2020-02-20":[],"2020-02-21":[],"2020-02-22":[],"2020-02-23":[],"2020-02-24":[],"2020-02-25":[],"2020-02-26":[],"2020-02-27":[],"2020-02-28":[],"2020-02-29":[],"2020-03-01":[],"2020-03-02":[],"2020-03-03":[{"label":"anwalt:2695","nb_uniq_visitors":2,"nb_visits":2,"nb_impressions":2,"nb_interactions":0,"interaction_rate":"0\u00a0%","segment":"contentName==anwalt%3A2695","idsubdatatable":1}],"2020-03-04":[{"label":"anwalt:2695","nb_uniq_visitors":1,"nb_visits":1,"nb_impressions":3,"nb_interactions":2,"interaction_rate":"66,67\u00a0%","segment":"contentName==anwalt%3A2695","idsubdatatable":1}]}
while
/?method=Contents.getContentNames&label=anwalt:2695&date=last30&period=day&format=JSON&module=API&idSite=1&format=JSON&token_auth=93..3
returns
{"2020-02-04":[],"2020-02-05":[],"2020-02-06":[],"2020-02-07":[],"2020-02-08":[],"2020-02-09":[],"2020-02-10":[],"2020-02-11":[],"2020-02-12":[],"2020-02-13":[],"2020-02-14":[],"2020-02-15":[],"2020-02-16":[],"2020-02-17":[],"2020-02-18":[],"2020-02-19":[],"2020-02-20":[],"2020-02-21":[],"2020-02-22":[],"2020-02-23":[],"2020-02-24":[],"2020-02-25":[],"2020-02-26":[],"2020-02-27":[],"2020-02-28":[],"2020-02-29":[],"2020-03-01":[],"2020-03-02":[],"2020-03-03":[{"label":"anwalt:2695","nb_uniq_visitors":2,"nb_visits":2,"nb_impressions":2,"nb_interactions":0,"interaction_rate":"0\u00a0%","segment":"contentName==anwalt%3A2695","idsubdatatable":39}],"2020-03-04":[]}
The UI shows 2 Impressions on 3.3.20 (4.3 is missing here, and no impr. before march) - and it's totally different from above's "label":"anwalt:2695","nb_visits":10,"nb_impressions":12
So in summary, this seems pretty useless. Do you know what's the suggested way to count content impressions and API-report them in matomo?
@sgiehl @diosmosis maybe any idea right away how UI and API might report different results?
@feikede Does this only happen with dates that include today? Or does it happen w/ earlier dates? Does this affect the export link provided by the UI as well? Does it only affect ranges?
@tsteur I don't know why the UI would and API would report different results, seems very strange, but seems likely related to range archiving in some way.
Today
/?method=Contents.getContentNames&label=anwalt:2695&date=2019-01-01,today&period=range&format=JSON&module=API&idSite=1&format=JSON&token_auth=93e3..
still returns [], matching access_log entry is
2.x.y.180 - - [05/Mar/2020:08:41:45 +0100] "GET /?method=Contents.getContentNames&label=anwalt:2695&date=2019-01-01,today&period=range&format=JSON&module=API&idSite=1&format=JSON&token_auth=yy HTTP/1.1" 200 2 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:73.0) Gecko/20100101 Firefox/73.0"
@diosmosis
/?method=Contents.getContentNames&label=anwalt:2695&date=2019-01-01,2020-03-04&period=range&format=JSON&module=API&idSite=1&format=JSON&token_auth=93..
also returns []
While
/?method=Contents.getContentNames&label=anwalt:2695&date=last5&period=day&format=JSON&module=API&idSite=1&format=JSON&token_auth=93e3..
returns
{"2020-03-01":[],"2020-03-02":[],"2020-03-03":[{"label":"anwalt:2695","nb_uniq_visitors":2,"nb_visits":2,"nb_impressions":2,"nb_interactions":0,"interaction_rate":"0\u00a0%","segment":"contentName==anwalt%3A2695","idsubdatatable":39}],"2020-03-04":[{"label":"anwalt:2695","nb_uniq_visitors":1,"nb_visits":1,"nb_impressions":3,"nb_interactions":2,"interaction_rate":"66,67\u00a0%","segment":"contentName==anwalt%3A2695","idsubdatatable":145}],"2020-03-05":[]}
now.
@diosmosis How should I test the "export link" issue?
btw:
/?method=Contents.getContentNames&label=anwalt:2695&date=2020-03-03&period=day&format=JSON&module=API&idSite=1&format=JSON&token_auth=93e
returns
[{"label":"anwalt:2695","nb_uniq_visitors":2,"nb_visits":2,"nb_impressions":2,"nb_interactions":0,"interaction_rate":"0\u00a0%","segment":"contentName==anwalt%3A2695","idsubdatatable":39}]
maybe it's a period=range issue?
@feikede at the bottom of each graph there is an icon that can be used to get an API URL for a report:
clicking on it would show:
The link it generates should be directly to the API, I'm wondering if this feature is broken for you as well for content reports.
maybe it's a period=range issue?
It very well could be, though it's still extremely strange, and I'm not immediately sure what could be causing it.
Today, on 2020-03-31, I have the following buggy behavior:
/?method=Contents.getContentNames&label=anwalt:6122&date=2019-01-01,today&period=range&format=JSON
empty result
/?method=Contents.getContentNames&label=anwalt:6122&date=2019-01-01,2020-03-30&period=range&format=JSON
empty result
/?method=Contents.getContentNames&label=anwalt:6122&date=2019-01-01,2020-03-29&period=range&format=JSON
correct result ([{"label":"anwalt:6122","nb_visits":7,"nb_impressions":11,"nb_interactions":10,"sum_daily_nb_uniq_visitors":6,"interaction_rate":"90,91\u00a0%","segment":"contentName==anwalt%3A6122","idsubdatatable":844}])
/?method=Contents.getContentNames&label=anwalt:6122&date=2019-01-01,2020-03-28&period=range&format=JSON
correct result ([{"label":"anwalt:6122","nb_visits":7,"nb_impressions":11,"nb_interactions":10,"sum_daily_nb_uniq_visitors":6,"interaction_rate":"90,91\u00a0%","segment":"contentName==anwalt%3A6122","idsubdatatable":844}])
etc.
I give up using this now, it's not reliable. Thanks for your help
Sorry to hear @feikede . We would be maybe able to have a look should we have full server access (incl db) but otherwise this be hard to identify the issue. We can't possibly imagine how this is happening since Matomo uses the same API's in the reporting UI as well. It does somehow suggest there is some issue somewhere else in the server setup but of course there could be also just as well be some edge case issue in Matomo.
Think i have similar issue with 3.14.1:
/index.php?module=Contents&action=getContentNames&idSite=6&period=day&date=2020-09-30&segment=&showtitle=1&random=4231
and I got this:
Content Name | Visits | Unique visitors | Actions | Actions per Visit | Avg. Time on Website | Bounce Rate | Conversion Rate |
---|---|---|---|---|---|---|---|
Вася опубликовал новость от 13.13.2013 | 50% 3 | 50% 3 | - | - | 0s | 0% | 0% |
Публикация | 50% 3 | 50% 3 | - | - | 0s | 0% | 0% |
Totals | 100% 6 | 100% 6 | - | - | - | - | - |
Note on empty "actions" column
But with this /index.php?module=API&method=Contents.getContentNames&idSite=6&period=day&date=2020-09-30&segment=&format=HTML&token_auth=...
i got this:
label | nb_uniq_visitors | nb_visits | nb_impressions | nb_interactions | interaction_rate | _metadata | _idSubtable |
---|---|---|---|---|---|---|---|
Вася опубликовал новость от 13.13.2013 | 3 | 3 | 6 | 11 | 183.33% | 'segment' => 'contentName==%D0%92%D0%B0%D1%81%D1%8F+%D0%BE%D0%BF%D1%83%D0%B1%D0%BB%D0%B8%D0%BA%D0%BE%D0%B2%D0%B0%D0%BB+%D0%BD%D0%BE%D0%B2%D0%BE%D1%81%D1%82%D1%8C+%D0%BE%D1%82+13.13.2013' | 1 |
Публикация | 3 | 3 | 5 | 6 | 120% | 'segment' => 'contentName==%D0%9F%D1%83%D0%B1%D0%BB%D0%B8%D0%BA%D0%B0%D1%86%D0%B8%D1%8F' | 2 |
Note on nb_interactions
column (nevermind of too much interactions -- result of manual _paq.push('trackContentInteractions'...)
calls via js console for testing purposes)
As I can see, a call method getContentNames
via module Contents
returns different results with calling method Contents.getContentNames
from module API
. Is it normal? Or I miss something?
Or it's just something like a mess in column names in report UI (i.e. nb_actions
instead of nb_interactions
)?
fyi replied in https://github.com/matomo-org/matomo/issues/16505
This call randomly returns an empty [] json. I.e. currently:
/?method=Contents.getContentNames&label=anwalt%3A6122&date=2020-01-01,today&period=range&format=JSON&module=API&idSite=1&format=JSON&token_auth=93e...384
returns [] with headers
While the exact same call with segmentation by label returns a (the correct) result:
/?method=Contents.getContentNames&segment=contentName==anwalt%3A6122&label=anwalt%3A6122&date=2020-01-01,today&period=range&format=JSON&module=API&idSite=1&format=JSON&token_auth=93e...
Think this is a bug, or am I missing something?
BTW: This seems to be related to any kind of caching. Sometimes the call WITH segmentation delivers wrong results - sometimes without - and sometimes it's all gone one day later (have the cron running every hour, have about 10K req/day). Have latest V 3.13.3 installed.