privacy-tech-lab / gpc-web-crawler

Web crawler for detecting websites' compliance with GPC privacy preference signals at scale
https://privacytechlab.org/
MIT License

Issue 54 - Changes made during the validation set #75

Closed by katehausladen 6 months ago

katehausladen commented 6 months ago

Changes: (issue #54)

1) Add functionality to screenshot sites that error. This helps diagnose the error beyond the error classification.
2) Update how urlClassification is stored. It is now stored based on the last committed URL.
3) Update how USPS/GPP data is processed for the case where the API responds but returns null.
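To illustrate the case in change 3, here is a minimal sketch of one way to keep "the API never responded" distinct from "the API responded with null". It is not the code in this PR: the `ApiProbe` type, the `normalizeProbe` function, and the `"null"` sentinel string are all hypothetical names and conventions chosen for the example.

```typescript
// Hypothetical shape of one USPS/GPP probe result; not the crawler's actual schema.
type ApiProbe =
  | { responded: false } // API missing, or it never called back
  | { responded: true; value: string | null }; // API called back, possibly with null

// Assumed storage convention for this sketch only:
//   undefined     -> the API did not respond
//   "null"        -> the API responded, but its value was null
//   anything else -> the raw string returned by the API
function normalizeProbe(probe: ApiProbe): string | undefined {
  if (probe.responded === false) {
    return undefined;
  }
  // Record an explicit marker so a null response is not mistaken for a missing API.
  return probe.value === null ? "null" : probe.value;
}

const samples: ApiProbe[] = [
  { responded: false },
  { responded: true, value: null },
  { responded: true, value: "1YNN" },
];

for (const sample of samples) {
  console.log(JSON.stringify(sample), "->", normalizeProbe(sample));
}
```

Keeping the two cases apart at storage time means later analysis can tell sites with no API from sites whose API returned nothing useful.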

I've already tested this extensively because it was used for the crawl.