matomo-org / matomo

Empowering People Ethically with the leading open source alternative to Google Analytics that gives you full control over your data. Matomo lets you easily collect data from websites & apps and visualise this data and extract insights. Privacy is built-in. Liberating Web Analytics. Star us on Github? +1. And we love Pull Requests!
https://matomo.org/
GNU General Public License v3.0

[Bug] Sorting Product Revenue in Ecommerce not working #21829

Closed. enual closed this issue 9 months ago

enual commented 10 months ago

What happened?

When a customer clicks Product Revenue to sort from low to high, the table is sorted alphabetically by product name instead.

What should happen?

When you click Product Revenue in Ecommerce, the report should be sorted numerically by revenue, not by product name.

[image]

How can this be reproduced?

This can be reproduced by logging in to Matomo and opening Ecommerce > Product.

Matomo version

5.0.1

PHP version

8.2.14

Server operating system

Ubuntu 22.04

What browsers are you seeing the problem on?

Chrome

Computer operating system

Windows

Relevant log output

No response

bx80 commented 10 months ago

Hi @enual,

I was unable to reproduce this on demo.matomo.cloud or on a local instance running 5.0.1. Has it been reported by just a single user, or have you been able to reproduce the issue too?

enual commented 10 months ago

@bx80 the customer gave us access to their Matomo instance, and I was able to replicate it while logged in to their dashboard.

peterbo commented 10 months ago

This has been reported before:

I also have an instance where this sorting behaves oddly.

bx80 commented 9 months ago

Since this has been reported several times and we've not managed to reliably reproduce it, I think we should schedule some time for a detailed investigation to find out what is happening. I'll request a priority assessment from the product team :+1:

sgiehl commented 9 months ago

@peterbo @enual As you are able to reproduce that on an instance, could you maybe check if the sorting is already incorrect when using the API directly for those reports?

peterbo commented 9 months ago

Hi @sgiehl - yes, it's already returned from the API in a wrong order:

Call URI: /index.php?module=API&method=Goals.getItemsName&expanded=1&idSite=7&period=day&date=yesterday&abandonedCarts=0&format=JSON&token_auth=XXX&force_api_session=1&filter_sort_column=nb_visits&filter_sort_order=desc

[image: api-response-ecommerce-order]
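For anyone who wants to repeat this check on their own instance, the call above can be rebuilt roughly as follows. This is only a sketch: the host `https://matomo.example.org` and the helper name `build_items_report_url` are placeholders, and the query parameters are copied from the URI quoted above.

```python
from urllib.parse import urlencode


def build_items_report_url(base_url, id_site, sort_column,
                           sort_order="desc", token_auth="XXX"):
    """Build the Goals.getItemsName request with an explicit sort column.

    Hypothetical helper; parameter names mirror the URI from the comment above.
    """
    params = {
        "module": "API",
        "method": "Goals.getItemsName",
        "expanded": 1,
        "idSite": id_site,
        "period": "day",
        "date": "yesterday",
        "abandonedCarts": 0,
        "format": "JSON",
        "token_auth": token_auth,
        "force_api_session": 1,
        "filter_sort_column": sort_column,
        "filter_sort_order": sort_order,
    }
    return f"{base_url.rstrip('/')}/index.php?{urlencode(params)}"


# Placeholder host and site id; replace with your own instance.
url = build_items_report_url("https://matomo.example.org", 7, "nb_visits")
print(url)
```

Changing `sort_column` to another metric and comparing the returned order is a quick way to see whether the problem is already present in the API response rather than in the UI.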

sgiehl commented 9 months ago

@peterbo Thanks for this. That makes it a bit easier to know where to look for a possible problem. Maybe for some weird reason the filter_sort_column isn't taken into account 🤔

sgiehl commented 9 months ago

@peterbo I think I might have found a possible reason. Is the instance that has the problem rather old, i.e. was it set up before Matomo 4.0? Furthermore, is the CustomVariables plugin installed, and were ecommerce metrics tracked as custom variables before (or maybe they still are for some reason)?

peterbo commented 9 months ago

Hi @sgiehl! I don't really know the tracking history of this specific instance (neither does the contact person on the client's side), and I don't have any FTP access. I can only see this information in the system check:

Matomo Version | 4.12.2
Matomo Update History | 4.11.0,4.10.1,4.9.1,4.7.0,4.6.2
Matomo Install Version | 4.6.2

CustomVariables Plugin is not installed.

peterbo commented 9 months ago

It might be connected to columns that do not have a value in every row, e.g. if a product was sold without any prior visits to its product page:

[image: ecommerce-sort]

But it's hard to do really structured testing with that. At least I can say that sorting the nb_visits column works if all rows have integer values.

**UPDATE:** No, that's probably not the case. Other days that also have undefined row values sort correctly.

sgiehl commented 9 months ago

Hm. ok. I thought it might be caused by this special handling that is only applied for installs that were set up prior to Matomo 4: https://github.com/matomo-org/matomo/blob/6199cd6f6cad3a9e3cd44649d40fa36b6b84dc9f/plugins/Goals/API.php#L364-L366

Guess I need to look for another possible reason then.

sgiehl commented 9 months ago

> It might be connected to columns that do not have a value in every row, e.g. if a product was sold without any prior visits to its product page:

Actually that could be the case. It looks like the archives that are built might not contain certain metrics for some products. So if no product views were tracked for a product, there might not be a nb_visits column for those records.

My assumption would be that sorting works correctly if the first row of an archive has all columns. Those archives should be pre-sorted by revenue. So if the record with the highest revenue has some visits tracked, the sorting should work. If that record has no value for e.g. visits, the sorting by this column will fail. Based on the internal logic I think it would skip sorting by that column and fall back to the label, as the requested sorting column can't be found in the (first row of the) table.

@peterbo Could you maybe check that on your instance?
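One rough way to run that check from an API response, sketched in Python. It assumes the rows have already been parsed from JSON into dicts and that revenue appears under a `revenue` key; both the helper name and the sample rows are made up for illustration.

```python
def top_revenue_row_has_column(rows, column, revenue_column="revenue"):
    """Return True if the highest-revenue row contains `column`.

    Hypothetical helper: the archive is assumed to be pre-sorted by revenue,
    so the highest-revenue row stands in for "the first row of the archive".
    Returns None for an empty report.
    """
    if not rows:
        return None
    top = max(rows, key=lambda r: float(r.get(revenue_column, 0) or 0))
    return column in top


# Made-up sample: the best-selling product has no tracked product views.
sample = [
    {"label": "Mug", "revenue": 30.0, "nb_visits": 50},
    {"label": "Shirt", "revenue": 120.0},
    {"label": "Cap", "revenue": 15.0, "nb_visits": 12},
]
print(top_revenue_row_has_column(sample, "nb_visits"))
```

If this returns `False` on the days where sorting is wrong and `True` on the days where it works, that would support the assumption above.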

sgiehl commented 9 months ago

Ok. I finally found the reason why this is happening. The archiving of ecommerce items works in a way where different metrics are fetched based on certain dimensions. The results of these queries are then combined into one table. Due to the way that is done, it can happen that a certain item record ends up with missing metrics. If, for example, no item views were tracked, a table row might not have an entry for nb_visits.
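To illustrate how combining separate query results can leave gaps, here is a simplified Python sketch. It is not Matomo's archiving code: `merge_item_metrics` and the sample data are hypothetical, and the real archiver works on different structures.

```python
def merge_item_metrics(*query_results):
    """Combine per-query metric dicts (keyed by item label) into one row list.

    Hypothetical stand-in for the archiving step: a row only receives the
    metrics that some query actually returned for that item.
    """
    table = {}
    for result in query_results:
        for label, metrics in result.items():
            table.setdefault(label, {"label": label}).update(metrics)
    return list(table.values())


# Made-up data: "Shirt" was sold but its product page was never viewed.
sales = {
    "Shirt": {"revenue": 120.0, "quantity": 4},
    "Mug": {"revenue": 30.0, "quantity": 3},
}
views = {"Mug": {"nb_visits": 12}}

rows = merge_item_metrics(sales, views)
for row in rows:
    print(row)
```

The "Shirt" row ends up without any `nb_visits` entry, which is the kind of incomplete record described above.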

In addition most of the metrics are internally stored with metric ids instead of their names.

When tables are sorted, the sorter checks if the provided metric to sort by exists in the first row. If this isn't the case the metric name is converted to the metric id and checked again. If it works the metric id will be used instead.

So a problem only occurs when the first row of a table does not have the metric it should be sorted by. In this special case the metrics are still stored with their metric ids in some of the rows. As the first row doesn't have that metric at all, the checks for metric name and metric id are failing and it continues using the metric name. So it basically sorts by a metric that does not exist in any row, as the renaming from id to name is done in a later step.
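The lookup described above can be sketched like this. It is a simplified Python model and not the actual sorter: the numeric metric ids in `METRIC_IDS` are made up for illustration, and the fallback to sorting by label when the column exists in no row is an assumption based on the behaviour seen in this thread.

```python
# Hypothetical name -> id mapping; the real ids live in Matomo's metrics definitions.
METRIC_IDS = {"nb_visits": 2, "revenue": 10}


def resolve_sort_column(rows, requested):
    """Pick the column key to sort by, looking only at the first row."""
    if not rows:
        return requested
    first = rows[0]
    if requested in first:
        return requested
    metric_id = METRIC_IDS.get(requested)
    if metric_id is not None and metric_id in first:
        return metric_id
    # Neither the name nor the id is in the first row: keep the name.
    return requested


def sort_rows(rows, requested, descending=True):
    column = resolve_sort_column(rows, requested)
    if not any(column in row for row in rows):
        # Assumed fallback: no row has the column, so order by label.
        return sorted(rows, key=lambda r: r["label"])
    return sorted(rows, key=lambda r: r.get(column, 0), reverse=descending)


# Rows still keyed by metric id, pre-sorted by revenue (id 10).
working = [
    {"label": "Shirt", 10: 120.0, 2: 5},
    {"label": "Mug", 10: 30.0, 2: 50},
    {"label": "Cap", 10: 15.0, 2: 12},
]
broken = [
    {"label": "Shirt", 10: 120.0},  # first row has no visits metric
    {"label": "Mug", 10: 30.0, 2: 50},
    {"label": "Cap", 10: 15.0, 2: 12},
]

print([r["label"] for r in sort_rows(working, "nb_visits")])
print([r["label"] for r in sort_rows(broken, "nb_visits")])
```

In this model the first table is ordered by visits (Mug, Cap, Shirt), while the second one comes out alphabetically (Cap, Mug, Shirt), because the name `nb_visits` is kept even though the rows only carry the id.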

Imho there are multiple ways to fix this:

peterbo commented 9 months ago

That sounds like a valid explanation for what's happening. Thank you, @sgiehl, for the effort you've put into finding this!