EU-EDPS / website-evidence-collector

Project moved to https://code.europa.eu/EDPS/website-evidence-collector ! The tool Website Evidence Collector (WEC) automates the website evidence collection of storage and transfer of personal data. https://edps.europa.eu/press-publications/edps-inspection-software_en
https://code.europa.eu/EDPS/website-evidence-collector
European Union Public License 1.2

CSP blocks for google-analytics are ignored. - False positives? #85

Open StefanKBa opened 1 year ago

StefanKBa commented 1 year ago

Hi,

can it be that the tool flags google-analytics even though the actual request is blocked at the browser level by the server's CSP?

Is this by design? I'm asking because I'd imagine that companies might use the CSP approach as a quick mitigation for websites with google-analytics. I don't recall an opinion by an EU Supervisory Authority on this, but I would assume that using CSP is a fair approach (also because web applications always rely on the interplay between server and browser).

Best regards

Stefan

StefanKBa commented 1 year ago

I was just checking against https://www.privacydesign.ch/ for the loading of a script from ajax.googleapis.com - which is actually blocked by the CSP.

The report in inspection.yml contains the following, and judging from the debug messages below, the finding on ajax.googleapis.com seems wrong: hosts: requests: firstParty:

However - looking at the debug messages: debug: Refused to load the stylesheet 'https://ajax.googleapis.com/ajax/libs/jqueryui/1.8.1/themes/base/jquery-ui.css?ver=6.0.2' because it violates the following Content Security Policy directive: "style-src 'self' 'unsafe-inline' https://fonts.googleapis.com". Note that 'style-src-elem' was not explicitly set, so 'style-src' is used as a fallback. {"timestamp":"2022-10-08T19:14:51.143Z","type":"Browser.Console"}
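To make the debug message above easier to work with, here is a minimal Python sketch that pulls the blocked URL, its host and the violated directive out of such a console message. The regular expression is modelled only on the single message quoted in this thread; the function name `parse_csp_refusal` is hypothetical and not part of the WEC, and other browsers or browser versions may word the message differently.

```python
import re
from urllib.parse import urlsplit

# Pattern modelled on the one console message quoted in this thread.
# Assumption: other CSP refusals follow the same wording.
CSP_REFUSAL = re.compile(
    r"Refused to (?P<action>\w+) the (?P<kind>[\w -]+?) '(?P<url>[^']+)' "
    r"because it violates the following Content Security Policy directive: "
    r'"(?P<directive>[^"]+)"'
)


def parse_csp_refusal(message):
    """Return action, kind, url, directive and host of a CSP refusal.

    Returns None when the message does not match the assumed wording,
    for example for refusals that do not name a URL.
    """
    match = CSP_REFUSAL.search(message)
    if match is None:
        return None
    info = match.groupdict()
    # The host is what would be compared against the hosts in the report.
    info["host"] = urlsplit(info["url"]).hostname
    return info
```

Applied to the message quoted above, this yields the host ajax.googleapis.com, the kind "stylesheet" and a directive starting with style-src.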

rriemann commented 1 year ago

Hello @StefanKBa ,

thank you for bringing this topic up. The accuracy of the output is indeed a primary concern and I agree that in many cases CSP allows for an effective and easy mitigation for many if not most of the customers.

However, relying on CSP makes assumptions about browser support and possibly about the browser configuration of the customers.

The philosophy of the Website Evidence Collector is to measure and not to judge. Hence, I suggest that the desired behaviour is to document when requests are subject to CSP blocks.

In this context, I would like to point out that most of the internet industry ignores the customers' Do-Not-Track setting.

Work on the WEC is likely eligible for EU funding from https://nlnet.nl/themes/ . Though I believe the project scope of solving one bug is too small given the necessary bureaucracy.

StefanKBa commented 1 year ago

Hi Robert,

we already have the Content-Security-Policy information in the requests.har.
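As an illustration of reading that information back out, here is a minimal Python sketch. It assumes requests.har follows the standard HAR 1.2 layout (`log.entries[].response.headers[]` with `name` and `value` keys); the function name `csp_headers_from_har` is hypothetical. The `Content-Security-Policy-Report-Only` header is deliberately left out because it does not block anything.

```python
# Only the enforcing header is collected; the report-only variant
# does not cause the browser to block requests.
CSP_HEADER_NAMES = {"content-security-policy"}


def csp_headers_from_har(har):
    """Map each request URL to the CSP header values of its response.

    `har` is the parsed JSON content of a HAR file. Entries whose
    response carries no CSP header are omitted from the result.
    """
    found = {}
    for entry in har.get("log", {}).get("entries", []):
        headers = entry.get("response", {}).get("headers", [])
        values = [
            header.get("value", "")
            for header in headers
            # Header names are case-insensitive, so compare in lower case.
            if header.get("name", "").lower() in CSP_HEADER_NAMES
        ]
        if values:
            found[entry["request"]["url"]] = values
    return found
```

The function takes the already parsed file, for example the result of `json.load` on an opened requests.har.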

As far as I know, all of the browsers that are actually in use today respect the CSP.

Do-Not-Track is different, as it was never fully standardized. Also, it is an "opt-out" signal. Its successor is the "Global Privacy Control" signal, which has gained legal recognition under California's CCPA. The fact that Sephora didn't act on that signal was part of the Sephora settlement.

Regards

Stefan

StefanKBa commented 1 year ago

Alternatively, I'd assume I'd end up writing a really bad Python script to summarize the key info from a few hundred site scans, which could also look at the Content-Security-Policy.

It would also be nice to have pre-configured cookie management framework cookies based on an initial scan...

Is there a chance to have a contrib folder for third-party scripts in the git repository? (Happy with the EU license.)

Regards

Stefan

rriemann commented 1 year ago

I guess it depends on the script and its complexity. Maybe the first step would be to filter the existing browser log and write back the CSP messages into the inspection.yml file.

StefanKBa commented 1 year ago

Hmm.. if you block google-analytics via the CSP, you actually see "Refused to load the script..." in the inspection-log.ndjson .. a few lines before it shows the match...

StefanKBa commented 1 year ago

There are actually quite a few things in a typical inspection-log.ndjson that would make sense in the report, such as anything with 'message' matching 'Refused to'.
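A rough Python sketch of that filter follows. It assumes that each line of inspection-log.ndjson is one JSON object with a string field `message`, as the comment above suggests; the function name `csp_refusals` and the `needle` parameter are hypothetical, and nothing here is part of the WEC itself.

```python
import json


def csp_refusals(lines, needle="Refused to"):
    """Yield log records whose 'message' field contains the needle.

    `lines` is any iterable of strings, for example an open
    inspection-log.ndjson file. Blank lines and lines that are not
    valid JSON objects are skipped rather than treated as errors.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue
        if not isinstance(record, dict):
            continue
        message = record.get("message")
        if isinstance(message, str) and needle in message:
            yield record
```

Because the function is a generator over lines, it can be pointed at an open log file directly and the matching records collected with `list(...)`, for example to add them to a report later.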