matomo-org / matomo

Empowering People Ethically with the leading open source alternative to Google Analytics that gives you full control over your data. Matomo lets you easily collect data from websites & apps and visualise this data and extract insights. Privacy is built-in. Liberating Web Analytics. Star us on Github? +1. And we love Pull Requests!
https://matomo.org/
GNU General Public License v3.0

AnonymizeIp: introduce new hook for masking the IP at tracker runtime #2095

Closed peterbo closed 12 years ago

peterbo commented 13 years ago

For the anonymizeIP-Plugin, it is not enough to mask the IP before storing it in the DB; it must also be masked for the user recognition heuristics and for all other plugins / tools / components that currently use the visitor IP before it is masked.

Since the IP must be masked very early at runtime to ensure that only the shortened IP is used for all correlations, we need a new hook.

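// (proposal from matt)
// Hook to modify the IP.
// No call-time "&" here: call-time pass-by-reference is deprecated in PHP 5.3
// and a fatal error from 5.4. This assumes Piwik_PostEvent() declares the
// parameter by reference, so listeners can still modify the IP in place.
Piwik_PostEvent('Tracker.Visit.setVisitorIp', $this->visitorInfo['location_ip']);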

Any suggestions on this?
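
To make the proposal concrete, here is a minimal, language-neutral sketch in Python of the idea: an event is posted right after the visitor IP is set, and a listener shortens the IP before anything else reads it. This is not Piwik's code. `add_listener`, `post_event`, `anonymize_ip`, `recognition_key` and `handle_visit` are hypothetical names, and a mutable dict stands in for PHP's by-reference argument.

```python
from typing import Callable, Dict, List

# Hypothetical minimal event dispatcher (an assumption, not Piwik's implementation).
_listeners: Dict[str, List[Callable[[dict], None]]] = {}


def add_listener(event: str, callback: Callable[[dict], None]) -> None:
    """Register a callback that is run whenever `event` is posted."""
    _listeners.setdefault(event, []).append(callback)


def post_event(event: str, payload: dict) -> None:
    """Call every listener with the same mutable payload, so changes persist."""
    for callback in _listeners.get(event, []):
        callback(payload)


def anonymize_ip(visitor_info: dict) -> None:
    """Listener: zero the last octet of an IPv4 address in place."""
    octets = visitor_info["location_ip"].split(".")
    octets[-1] = "0"
    visitor_info["location_ip"] = ".".join(octets)


def recognition_key(visitor_info: dict) -> str:
    """Stand-in for the visitor recognition heuristics; it only sees the current IP."""
    return "visitor@" + visitor_info["location_ip"]


def handle_visit(ip: str) -> dict:
    visitor_info = {"location_ip": ip}
    # Post the hook first, so every later step works on the masked IP.
    post_event("Tracker.Visit.setVisitorIp", visitor_info)
    visitor_info["recognition_key"] = recognition_key(visitor_info)
    return visitor_info


add_listener("Tracker.Visit.setVisitorIp", anonymize_ip)
```

With the listener registered, `handle_visit("192.0.2.57")` passes only `192.0.2.0` to the stand-in heuristics. If no listener is registered, the IP is left untouched.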

robocoder commented 13 years ago

Once the masked ip is stored, it's not useable in the heuristics match.

Live and GeoIP are the only plugins that look up the IP. Live queries the DB. GeoIP is being moved into core, but can be disabled like any other plugin.

Are there any other specific use cases?

peterbo commented 13 years ago

Anthon, I'm not quite sure if I get your point. Do you mean that the heuristics use the full IP-Address but do not match against the IP which is stored in the DB (since it is masked)?

I only want to make sure that we find a consistent solution guaranteeing that no component (whether core or plugin) can use the full IP-Address while the anonymizeIp-Plugin is active. To comply with the privacy laws, it is not only storing the full IP that is against the terms, but also calculations / computations / geolocation / etc. on it at (tracker) runtime.

So I'm putting this up for discussion: if I activate the anonymizeIP-Plugin, I want to be sure that nothing is contrary to any privacy laws. A standard (non-technical) user is not aware that they would also have to disable the GeoIP-Plugin to be absolutely safe.
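
As an illustration of what "masking" means here, the following Python sketch zeroes a number of trailing bytes of an address. It is an assumption about the general technique, not the plugin's actual code: `mask_ip` and its `mask_bytes` parameter are hypothetical names, and the byte-wise treatment of IPv6 is a guess made for the example.

```python
import ipaddress


def mask_ip(ip: str, mask_bytes: int) -> str:
    """Return `ip` with its last `mask_bytes` bytes set to zero.

    Works for IPv4 (4 bytes) and IPv6 (16 bytes); out-of-range values
    of `mask_bytes` are clamped to the address length.
    """
    addr = ipaddress.ip_address(ip)
    total_bytes = addr.max_prefixlen // 8  # 4 for IPv4, 16 for IPv6
    mask_bytes = max(0, min(mask_bytes, total_bytes))
    kept = addr.packed[: total_bytes - mask_bytes]
    # Rebuild the address from the kept prefix plus zero padding.
    return str(ipaddress.ip_address(kept + b"\x00" * mask_bytes))
```

For example, `mask_ip("192.0.2.57", 1)` yields `192.0.2.0` and `mask_ip("192.0.2.57", 2)` yields `192.0.0.0`. Any component that runs after this step (heuristics, geolocation, storage) can then only ever see the shortened value.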

Are there arguments for / against this statement?

mattab commented 13 years ago

I think that only the "IP Exclude" feature requires the full IP (which is not stored anywhere, since the visit is ignored). This is a must-have for data and user privacy.
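
A rough sketch of that ordering in Python, under stated assumptions: the exclusion check runs on the full IP, and only non-excluded hits continue, already masked. `is_excluded`, `mask_trailing_bytes` and `process_hit` are hypothetical names; the CIDR entries reflect the exclusion notation mentioned later in this thread, not a confirmed data format.

```python
import ipaddress
from typing import List, Optional


def is_excluded(ip: str, excluded: List[str]) -> bool:
    """True if the full IP matches any excluded address or CIDR range."""
    addr = ipaddress.ip_address(ip)
    # strict=False tolerates entries whose host bits are set, e.g. "192.0.2.1/24".
    return any(addr in ipaddress.ip_network(entry, strict=False) for entry in excluded)


def mask_trailing_bytes(ip: str, mask_bytes: int) -> str:
    """Zero the last `mask_bytes` bytes of the address (clamped to its length)."""
    addr = ipaddress.ip_address(ip)
    total = addr.max_prefixlen // 8
    mask_bytes = max(0, min(mask_bytes, total))
    return str(ipaddress.ip_address(addr.packed[: total - mask_bytes] + b"\x00" * mask_bytes))


def process_hit(ip: str, excluded: List[str], pre_mask_bytes: int) -> Optional[str]:
    """Return the masked IP for a tracked hit, or None if the visit is ignored."""
    if is_excluded(ip, excluded):
        return None  # excluded visit: nothing is stored, so the full IP never persists
    return mask_trailing_bytes(ip, pre_mask_bytes)
```

Here `process_hit("192.0.2.57", ["192.0.2.0/24"], 1)` returns `None`, while `process_hit("198.51.100.23", ["192.0.2.0/24"], 1)` returns `198.51.100.0`. Running the exclusion check after masking would be less precise, since single-address exclusions could no longer match.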

mattab commented 13 years ago

See proposal for User Privacy unified plugin #2233

robocoder commented 13 years ago

The new config setting will be: ip_address_pre_mask_length

We can implement the unified Privacy plugin after I check in this enhancement (along with the IPv6 changes).

robocoder commented 13 years ago

(In [4533])
fixes #1111 - add support for IPv6 addresses (tracking, anonymization, and exclusion)
fixes #2095 - add new anonymization hook (pre-heuristics)
fixes #2055 - optional IP filter when multiple proxies present
fixes #1775 - SitesManager: supports CIDR notation for IP exclusion

Notes:

select inet_ntoa(conv(hex(location_ip), 16, 10)) from piwik_log_visit;

peterbo commented 12 years ago

(In [5772]) Refs #2233, #2095, #2902 - set ip_address_mask_length and ip_address_pre_mask_length on anonymizeIP-plugin activation. Synchronize both variables on PrivacyManager call.