Geal closed this issue 3 years ago
The query was generated by calling a URL of this format:
https://domain/index.php?date=2020-12-13,2021-01-11&expanded=1&filter_limit=-1&format=CSV&format_metrics=1&idSite=1&language=en&method=API.get&module=API&period=day&token_auth=<token>&translateColumnNames=1
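For anyone trying to reproduce the request, here is a minimal Python sketch that rebuilds the URL above from its parameters. The helper name build_report_url and the token value are placeholders introduced for illustration; the parameter names and values are copied from the URL in this thread.

```python
from urllib.parse import urlencode


def build_report_url(base_url: str, token: str) -> str:
    """Rebuild the reporting API URL quoted in this thread (illustrative helper)."""
    params = {
        "date": "2020-12-13,2021-01-11",
        "expanded": 1,
        "filter_limit": -1,
        "format": "CSV",
        "format_metrics": 1,
        "idSite": 1,
        "language": "en",
        "method": "API.get",
        "module": "API",
        "period": "day",
        "token_auth": token,  # placeholder, supply a real token
        "translateColumnNames": 1,
    }
    # safe="," keeps the comma in the date range unescaped, as in the original URL.
    return f"{base_url}/index.php?{urlencode(params, safe=',')}"


if __name__ == "__main__":
    print(build_report_url("https://domain", "TOKEN"))
```

Fetching the resulting URL with any HTTP client (or curl -i) shows the Content-Disposition header that the server sends back.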
Hi,
I think the code responsible for this is the following: https://github.com/matomo-org/matomo/blob/c870770157a3e9c893308967dc274c8feac5d4be/core/DataTable/Renderer/Csv.php#L311-L321
It just generates a nice filename and then puts it into the header without caring about the right encoding.
If you have an idea how this could be fixed, it would be great if you could create a PR.
Hi @Geal, does the download not work at all in this case? Do you know which browser is being used there, or is it maybe some server that fetches the file?
Downloads failed because the HTTP response went through our reverse proxy, which rejects invalid headers. It is independent of the browser that is used; it can even be reproduced with a curl command.
The UTF-8 data can be transformed properly with rawurlencode: https://stackoverflow.com/a/25704866
The code should know the actual encoding of the filename (ASCII, ISO-8859-1, UTF-8 or others) and specify it in the header. Are there any guarantees on the encoding used in Matomo?
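To make the rawurlencode suggestion concrete, here is a minimal sketch in Python (not Matomo's PHP code, and not necessarily what the eventual fix does). It assumes the filename is available as a Unicode string. The helper name content_disposition and the sample filename are hypothetical. urllib.parse.quote with safe="" percent-encodes the UTF-8 bytes in much the same way as PHP's rawurlencode.

```python
from urllib.parse import quote


def content_disposition(filename: str) -> str:
    """Build an ASCII-only Content-Disposition value (illustrative helper)."""
    # Fallback for old clients: anything that is not printable ASCII,
    # plus quote and backslash, is replaced with "_".
    fallback = "".join(
        c if c.isascii() and c.isprintable() and c not in '"\\' else "_"
        for c in filename
    )
    # RFC 5987 ext-value: charset, empty language tag, percent-encoded UTF-8.
    encoded = quote(filename, safe="")
    return f"attachment; filename=\"{fallback}\"; filename*=UTF-8''{encoded}"


if __name__ == "__main__":
    # Hypothetical filename containing an en dash (U+2013).
    print(content_disposition("Export 2020 \u2013 2021.csv"))
    # attachment; filename="Export 2020 _ 2021.csv"; filename*=UTF-8''Export%202020%20%E2%80%93%202021.csv
```

The resulting value contains only ASCII bytes, so a strict proxy should have no reason to reject it, while clients that understand filename* still see the original name.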
AFAIK the encoding should be UTF-8 or UTF-16 (there should be some parameter to request data in UTF-16). Since it seems like an easy fix, we will schedule this issue. Cheers @Geal
fixed by #17276
Hello, one of our clients uses matomo (I do not know which version exactly), and some HTTP responses fail when downloading reports, because of charset issues in the Content-Disposition header. Here's a hex dump of one of those responses:
Right after "2020", there's the – character, which is an en dash encoded as e2 80 93 in UTF-8.
According to https://tools.ietf.org/html/rfc6266#section-4, when using the filename="" format, the name between double quotes should be in the ISO-8859-1 charset (https://tools.ietf.org/html/rfc2616#section-2.2), or in RFC 2047 format, like this: =?iso-8859-1?q?this is some text?= (for what it's worth, I never see anything in that format lately).
If the filename must include UTF-8 characters, it should use the filename*= option, like this: UTF-8''%c2%a3%20and%20%e2%82%ac%20rates (the exact format is defined in https://tools.ietf.org/html/rfc5987#section-3.2.2).
Unfortunately, I do not control this deployment of matomo, so my ability to test patches is limited, but I can request further information.
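As a quick check of the byte values and the RFC 5987 example quoted in this report, the following Python sketch decodes both. The helper name decode_ext_value is introduced here for illustration, and it only handles the simple charset'language'value form.

```python
from urllib.parse import unquote


def decode_ext_value(ext_value: str) -> str:
    """Decode an RFC 5987 ext-value such as UTF-8''%e2%82%ac (illustrative helper)."""
    # The value has three parts separated by single quotes:
    # charset, optional language tag, percent-encoded bytes.
    charset, _language, encoded = ext_value.split("'", 2)
    return unquote(encoded, encoding=charset, errors="strict")


# The three bytes from the report, decoded as UTF-8, give U+2013 (en dash).
en_dash = bytes.fromhex("e2 80 93").decode("utf-8")

if __name__ == "__main__":
    print(repr(en_dash))
    print(decode_ext_value("UTF-8''%c2%a3%20and%20%e2%82%ac%20rates"))
```

The second call yields the string "£ and € rates", which matches the example in RFC 5987.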
maybe related to #9580