EFForg / badger-sett

Automated training for Privacy Badger. Badger Sett automates browsers to visit websites to produce fresh Privacy Badger tracker data.
https://www.eff.org/badger-pretraining
MIT License

Check domains with PyFunceble #23

Closed bcyphers closed 6 years ago

bcyphers commented 6 years ago

Use the PyFunceble tool to check domains for validity before visiting them. Clear the PyFunceble cache whenever we refresh the Majestic list.
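A minimal sketch of that idea, not the project's actual code: filter the site list through a status checker and throw away cached results whenever the list itself changes. `check_status` is a hypothetical callable standing in for whatever wraps PyFunceble, the `"ACTIVE"` string is an assumed status label, and the JSON cache file here is this sketch's own, separate from PyFunceble's internal cache.

```python
import hashlib
import json
from pathlib import Path
from typing import Callable, Dict, Iterable, List


def list_fingerprint(domains: Iterable[str]) -> str:
    """Hash of the domain list; it changes whenever the list is refreshed."""
    joined = "\n".join(domains).encode("utf-8")
    return hashlib.sha256(joined).hexdigest()


def filter_active(
    domains: List[str],
    check_status: Callable[[str], str],  # hypothetical wrapper around PyFunceble
    cache_path: Path,
) -> List[str]:
    """Return only the domains whose status is "ACTIVE" (assumed label)."""
    fingerprint = list_fingerprint(domains)

    # Reuse cached statuses only if they were computed for this exact list.
    cache: Dict[str, str] = {}
    if cache_path.exists():
        saved = json.loads(cache_path.read_text())
        if saved.get("fingerprint") == fingerprint:
            cache = saved.get("statuses", {})

    # Check every domain that has no cached status yet.
    for domain in domains:
        if domain not in cache:
            cache[domain] = check_status(domain)

    cache_path.write_text(
        json.dumps({"fingerprint": fingerprint, "statuses": cache})
    )
    return [d for d in domains if cache[d] == "ACTIVE"]
```

Keying the cache on a hash of the list means a refreshed list invalidates old results automatically, with no separate "clear cache" step to forget.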

21

bcyphers commented 6 years ago

TODO: turn off PyFunceble's automatic "log sharing": https://pyfunceble.readthedocs.io/en/latest/logs-sharing.html

bcyphers commented 6 years ago

Running this on the top 2k sites, PyFunceble reports 88 of them (about 4%) as something other than "Active." There are still several sites that PyFunceble reports as "Active" but that return DNS errors when we try to browse to them in the scanner. Looks like this will be a partial solution at best.
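One possible complement for the sites that pass the check but fail with DNS errors is a plain resolution test before launching the browser. This is a sketch under assumptions, not something the scanner is known to do: `resolves` and `drop_unresolvable` are hypothetical helper names, and a successful lookup still does not guarantee the page will load.

```python
import socket
from typing import Callable, Iterable, List


def resolves(
    domain: str,
    resolver: Callable[..., object] = socket.getaddrinfo,
) -> bool:
    """Return True if the domain resolves to at least one address."""
    try:
        # Port 443 is an arbitrary choice; only the name lookup matters here.
        resolver(domain, 443)
    except (socket.gaierror, UnicodeError):
        # gaierror: lookup failed; UnicodeError: the name could not be encoded.
        return False
    return True


def drop_unresolvable(
    domains: Iterable[str],
    resolver: Callable[..., object] = socket.getaddrinfo,
) -> List[str]:
    """Keep only the domains that currently resolve, preserving order."""
    return [d for d in domains if resolves(d, resolver)]
```

The resolver is a parameter so the filter can be exercised without network access; in normal use the default `socket.getaddrinfo` does the lookup.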