hbz / lobid

Linking Open Bibliographic Data
https://lobid.org/
Eclipse Public License 2.0

Archive more rows in Matomo #402

Closed by acka47 4 years ago

acka47 commented 4 years ago

Prompted by this Twitter thread, I checked what kind of searches are run against lobid-gnd, and especially which searches result in an exit. As only 500 searches are shown and downloadable, such an analysis is not very meaningful.

We will have to increase the number of archived rows, see https://matomo.org/faq/how-to/faq_54/.

So we will have to adjust the following settings in config/config.ini.php. Let's also raise the number of archived search engines and keywords while we are at it. I suggest increasing the numbers to 100000:

; maximum number of rows for any of the Referers tables (keywords, search engines, campaigns, etc.), and Custom variables names
datatable_archiving_maximum_rows_referrers = 100000
; maximum number of rows for any of the Referers subtable (search engines by keyword, keyword by campaign, etc.), and Custom variables values
datatable_archiving_maximum_rows_subtable_referrers = 100000
; maximum number of rows for the Site Search table
datatable_archiving_maximum_rows_site_search = 100000

dr0i commented 4 years ago

I doubt that this is a good idea, see https://matomo.org/faq/how-to/faq_54/. Given the performance argument made there, it seems unlikely that an increase by a factor of 200 won't break something. But of course we could just try it.

acka47 commented 4 years ago

Ok, let's start with ~10,000~ 5,000 then.
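
As a rough sketch of what that would look like in config/config.ini.php: this assumes the settings belong under the `[General]` section and that 5,000 is applied to all three limits, neither of which is spelled out in this thread.

```ini
[General]
; assumption: same limit for all three tables, lowered from the 100000 proposed above
datatable_archiving_maximum_rows_referrers = 5000
datatable_archiving_maximum_rows_subtable_referrers = 5000
datatable_archiving_maximum_rows_site_search = 5000
```

The new limits should only affect reports archived after the change; older reports would keep their previous row count unless they are recalculated.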

dr0i commented 4 years ago

Note: to check this, choose "lobid.org/gnd", then "behaviour->SiteSearch". At the moment it shows rows 1-500.

dr0i commented 4 years ago

The "referrer" table variables are increased, see Diagnostic->Config_File and search for "referrer". Let's see if this indeed helps: this month's data will be calculated on the 2nd of next month. (I am deliberately not recalculating the old data.)

dr0i commented 4 years ago

I can now see 5k entries, so it seems to work.

dr0i commented 4 years ago

Just a note: the 5k entries are, of course, only available starting with September 2019.

acka47 commented 4 years ago

It works. Closing. Regarding the idea from the Twitter thread in the original issue: some of the queries are variantName queries that probably come from the NWBib Themensuche. These could easily be used to check for missing variant names. For the other queries, we would have to do a lot more work to filter out the relevant ones that could be used for improving coverage of variantNames.