eldy / AWStats

AWStats Log Analyzer project (official sources)
https://www.awstats.org

[GDPR] Option in conf that anonymize IP addresses (/16) #110

Open Commifreak opened 5 years ago

Commifreak commented 5 years ago

Hi,

To comply with the GDPR, I need to anonymize visitors' IP addresses. There are several scripts that do this at the Apache level, but I want to keep the original data for 2 weeks.

Is there a way to tell AWStats to anonymize the collected addresses with a /16 mask while it processes the log files?

Thanks in advance!
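For illustration, here is a minimal sketch of what a /16 mask means for a single log line. It is not an AWStats feature; the function names are hypothetical, and it assumes the client address is the first whitespace-separated field, as in Apache's common and combined log formats. Only IPv4 is handled; IPv6 addresses and hostnames are passed through unchanged and would need their own treatment.

```python
import ipaddress


def anonymize_ipv4(addr: str, prefix: int = 16) -> str:
    """Zero every bit of an IPv4 address beyond the first `prefix` bits."""
    # strict=False lets us pass a host address instead of a network address.
    network = ipaddress.IPv4Network(f"{addr}/{prefix}", strict=False)
    return str(network.network_address)


def anonymize_log_line(line: str, prefix: int = 16) -> str:
    """Mask the client address, assumed to be the first field of the line."""
    host, sep, rest = line.partition(" ")
    try:
        return anonymize_ipv4(host, prefix) + sep + rest
    except ValueError:
        # Not an IPv4 address (hostname, IPv6, empty field): leave it as-is.
        return line


if __name__ == "__main__":
    sample = '203.0.113.77 - - [10/Oct/2000:13:55:36 -0700] "GET / HTTP/1.0" 200 2326'
    print(anonymize_log_line(sample))
```

A filter like this could run over a copy of the log before AWStats reads it, leaving the original file untouched for the retention period.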

visualperception commented 5 years ago

No, but AWStats is open source, so if you are a Perl programmer you can make it do whatever you want. Alternatively, you can put in a request for an update, but I wouldn't expect personal requirements to be added unless enough other people support the request, which I strongly doubt.

An IP number is not directly traceable to the vast majority of users. It is traceable only to their ISP, at least that's the way it works in the UK. Most, but not all, users get dynamically allocated IPs, which are also traceable only to the ISP and change every time they close their connection to their ISP, or possibly more often. Static IPs are also traceable only to the ISP (in the UK). The problem is that most web designers/builders don't understand how IPs really work. And if you put your webstats behind a password-protected entry, then they are only accessible to trusted users, which is your simplest solution. And anyway, if you make it available for 16 days you are breaking your own invalid rules. So in short, you are worrying about something you are not at risk of creating. It's called security paranoia. My users don't have access to their webstats. They don't know how to interpret them and don't want to waste their time looking at them anyway. I keep them informed of what's happening if they want to know. If yours are running web stores, then I think there are far better free solutions than AWStats. Dare I suggest Google Analytics, and they can waste their time to their hearts' content looking through the stats without really understanding what they are looking at.

j-schumann commented 5 years ago

No offence, but it's not security paranoia, it's the law. GDPR compliance is required for every host in the EU. IP addresses are personal data, at least in the eyes of the law: https://gdpr-info.eu/issues/personal-data/

Personal data may be stored as long as it is required to fulfill the intended service (e.g. blocking malicious users) but must be deleted as soon as possible: Art. 5(1)(e) GDPR (https://gdpr-info.eu/art-5-gdpr/) and Art. 6(1)(f) GDPR (https://gdpr-info.eu/art-6-gdpr/).

So keeping the IP addresses for a restricted period is perfectly valid (e.g. if you want to use a tool like fail2ban) and not "breaking your own invalid rules", but anonymisation or deletion is required afterwards. Thus I'd like to see the option that @Commifreak suggested implemented.
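As a rough sketch of "keep for a restricted period, then anonymise", the following masks the client address only on log lines older than a retention window. It is not part of AWStats; the function name, the 14-day default, and the /16 prefix are assumptions taken from this thread, and it expects Apache-style timestamps such as `[10/Oct/2000:13:55:36 -0700]` with English month names.

```python
import ipaddress
import re
from datetime import datetime, timedelta, timezone

# First bracketed field of the line, e.g. [10/Oct/2000:13:55:36 -0700]
TIMESTAMP = re.compile(r"\[([^\]]+)\]")


def mask_if_expired(line: str, now: datetime,
                    keep: timedelta = timedelta(days=14),
                    prefix: int = 16) -> str:
    """Mask the leading IPv4 address when the entry is older than `keep`."""
    host, sep, rest = line.partition(" ")
    match = TIMESTAMP.search(rest)
    if match is None:
        return line  # no timestamp found: leave the line untouched
    try:
        stamp = datetime.strptime(match.group(1), "%d/%b/%Y:%H:%M:%S %z")
        if now - stamp <= keep:
            return line  # still inside the retention window
        network = ipaddress.IPv4Network(f"{host}/{prefix}", strict=False)
    except ValueError:
        return line  # unparsable timestamp or non-IPv4 host
    return str(network.network_address) + sep + rest


if __name__ == "__main__":
    entry = '203.0.113.77 - - [10/Oct/2000:13:55:36 -0700] "GET / HTTP/1.0" 200 2326'
    print(mask_if_expired(entry, datetime.now(timezone.utc)))
```

Note that `now` must be timezone-aware, since the parsed log timestamp carries a UTC offset. Lines that cannot be parsed are returned unchanged, so a real deployment would want to log or reject them rather than silently keep full addresses.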

visualperception commented 5 years ago

Do a whois on any of your users' IP numbers and tell me whether it points to them or to their ISP. If it points to the ISP, then the only risk to the end user is if the ISP shares their IP. It therefore falls to the ISP to protect the IP number, and if they are an EU company they will have to comply with the same GDPR rules as you and I. So is that IP really the end user's IP, or is it the ISP's IP? If you know anything about how IPs work, you would realise that ISPs don't modify the IANA database to point to end users, so the IP belongs to the ISP. So whose risk is it? And since most of them are dynamically allocated IPs, what use are they to anyone trying to analyse their usage? And the static ones mostly point to ISPs too. And the few that don't generally point to larger companies who buy their own blocks of IPs. Like I said, most people don't understand IPs, and that includes the numpties who made the GDPR law.

If you want to protect your users, then make it THEIR responsibility to block third-party cookies, or change the law to stop browsers from allowing third-party cookies altogether. Then there is no problem, since first-party cookies and IPs can't point to a user unless they agree to provide data, and web developers can control what goes in first-party cookies. The IPs in log files are collected by webhosts, and most people use external webhosts like Amazon and countless others. If you use an EU webhost, then they are bound by the same GDPR rules as you and I, and it falls to them to deal with it. If you are taking the data and storing it locally, then it falls to you, but what risk are you as a small designer/developer who doesn't even know that IPs don't point to end users and aren't owned by them? Easy to prove in court.

This is all about governments wanting to take control of the web, and about job creation schemes. Create a minister in charge of data security and he/she has to be seen to be doing something. This usually means hiring a third-party agency who want to make a name for themselves and voila, we get GDPR.