matomo-org / matomo

Empowering People Ethically with the leading open source alternative to Google Analytics that gives you full control over your data. Matomo lets you easily collect data from websites & apps and visualise this data and extract insights. Privacy is built-in. Liberating Web Analytics. Star us on Github? +1. And we love Pull Requests!
https://matomo.org/
GNU General Public License v3.0
19.95k stars 2.66k forks source link

Removing default sanitizing of all API params and request vars #11786

Open tsteur opened 7 years ago

tsteur commented 7 years ago

To prevent issues like https://github.com/piwik/plugin-CustomDimensions/issues/62

I would say the current way of sanitizing all input is rather an anti pattern and causes lots of bugs.

We need to check if we can remove this behaviour. Problem is that some functionality may not work anymore 100% and we need to make sure that all values are properly escaped when using them to not run into any security issues after removing it. This will be lots of work and need to see if we can manage this in Piwik 4.

sgiehl commented 7 years ago

Maybe we could also try to make that step by step... we currently use Common::getRequestVar() to get anything from the request object. We could introduce a new method directly returning the original request value (not sanitized) and start using that new method everywhere. So we could already deprecate the Common::getRequestVar() and remove it in Piwik 4 completely

tsteur commented 7 years ago

That's a great idea! For API there is likely not a workaround possible for now. We could do something crazy and allow underscores like $_id which would equal $id but unsanitized and later simply switch it the other way around in Piwik 4 but not sure if it's a good idea or not. Be definitely good to do this step by step to avoid a big refactoring and big PR that takes ages

sgiehl commented 7 years ago

not sure if we are using request vars with an underscore anywhere else than here, but that likely might cause problems when using such a "magic" thing. So maybe introducing a new method might be easier, but not sure re naming

tsteur commented 7 years ago

Sorry I meant API parameters, not in getRequestVar. Any API parameter is automatically sanitized. For getRequestVar we should definitely use new method.

sgiehl commented 2 years ago

@justinvelluppillai This would also be a nice to have for Matomo 5. At least changing the input sanitizing for API would be a breaking change. Changing the use of Common::getRequestVar() can be done without breaking changes, but we could consider deprecating that method for Matomo 5 maybe and introduce a replacement.

Note: This issue might include some parts of #4231

tsteur commented 2 years ago

@sgiehl what did you have in mind there? Changing the input sanitizing for the API? And introducing a new method as an alternative to getRequestVar()? Or do you mean only adding a different method?

How do we guarantee there won't be any security regression and double escaping issue there and how much work would that one be in total?

sgiehl commented 2 years ago

@tsteur No, actually both. For the API I was thinking about switching the default behavior to have no sanitizing, but for the "migration" provide a static list of methods / plugins that would still use sanitizing. That way we could in the beginning add all API methods to the list and then remove one after the other, after checking each. Which we could then also do step by step in ongoing releases.

tsteur commented 2 years ago

@sgiehl how would that affect plugins? And how much work would it be for the work in Matomo 5 and how much work would it be to complete this work fully approximately?

sgiehl commented 2 years ago

@tsteur I had some further thoughts on that one. My suggested migration plan would be this one:

For plugins that would mean, that with Matomo 5, we need to set the property to true, if we don't want to invest some time to check for potential problems yet.

The mentioned changes above should be done quite quickly. Looking through all existing APIs and disabling the auto sanitizing is hard to estimate. Some APIs should be quite simple to check, while others might be harder. I would suggest to create issues for each API once this migration plan is confirmed and then start to plan in only a couple of APIs for each release.

tsteur commented 2 years ago

@sgiehl what would be the rough estimate for the entire project? We would need some estimation. And by having this, how much time can it save us over the next 5 years?

sgiehl commented 2 years ago

@tsteur The first step, to introduce the flag and setting it in all our plugins (for Matomo 5), shouldn't be more than half a day. Checking all our APIs, removing the auto sanitize and checking if that might have opened up any security risk is a bit harder to estimate. APIs that only have reporting methods should be easy to change. But APIs like Goals, SitesManager, UsersManager,... where we store user input take a bit longer, as removing the sanitize might change new values in the database and we might even need to update existing ones, to have a consistent data set. Without looking deeper into details I would say for core (and submodules) this might take at least 2 weeks. But this will help us to make our code more consistent, stable and easier to understand.

How much time that can save us is hard to say. In the past 5 years there had been a plenty of issues around problems with double encoded stuff and I actually fixed one 2 weeks ago and doing another one at the moment.

mattab commented 1 year ago

Next step is to document this so it's clear in the future for engineers: https://github.com/matomo-org/developer-documentation/issues/669

Q: Can we close this issue? (and in the future the team would refactor as needed)

sgiehl commented 1 year ago

@mattab We have already introduced the possibility to change the behavior where needed. For Matomo 5 I guess this should be enough, as we can start changing it where needed. I would suggest to move this issue to Matomo 6, so we can consider changing the default behavior there.