brave / brave-core-crx-packager

Mozilla Public License 2.0
41 stars 35 forks source link

Look for corruption in downloaded files #806

Open fmarier opened 10 months ago

fmarier commented 10 months ago

https://community.brave.com/t/ads-and-trackers-blockers-breaks-all-websites/524842 reported widespread webcompat problems. This was due to a corrupted cookie list. We should be validating filter lists after downloading them. List of things to check (in priority order):

In all these cases, we should not publish a new version of the list/component and send a Sentry alert like how it's done already in the code: https://github.com/brave/brave-core-crx-packager/blob/master/scripts/generateAdBlockRustDataFiles.js#L101

ryanbr commented 10 months ago

https://github.com/ryanbr/fanboy-adblock/blob/master/scripts/validateChecksum.pl also

fmarier commented 10 months ago

Example of data corruption:

singkinderlieder.de##+js(set-cookie, acceptMatomo, true)

became:

singkinderlieder.de-o+js(set-clert, acceptMatomo, true)
fmarier commented 10 months ago

For added robustness, it may be worth verifying the checksum both when downloading the upstream file / building the component, as well as client-side when loading the list.

ShivanKaul commented 10 months ago

From checking S3, it looks like the corrupted file was shipped to clients. To be safe though, we should check the list before loading into brave-core, since the corruption might happen as part of the CRX zipping process.

ShivanKaul commented 10 months ago

I edited the top-level issue description to have list of validation checks we can do.

bbondy commented 10 months ago

I added an item to the top level issue (item 0) "Verify checksums when available at crx build time in a place similar to https://github.com/brave/adblock-resources/blob/master/index.js#L7"