qurbat / blocked-hosts

A periodically updated list of websites known to be blocked in India
Creative Commons Zero v1.0 Universal

Provide input.txt file #5

Open captn3m0 opened 2 years ago

captn3m0 commented 2 years ago

The input.txt file (the combined list of hosts to check) is not committed to the repo, which makes it hard to reproduce this work for other providers.

captn3m0 commented 2 years ago

I'm planning to re-run the tests on multiple ISPs. Any chance this can be published?

qurbat commented 2 years ago

The input is a combination of several significantly large domain lists. I will put together a bash script that pulls and normalises the files (removing duplicates and subdomains) and create a PR for it.
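For anyone who wants to build a comparable input list in the meantime, here is a minimal sketch of the normalisation step in Python. It is not the script used for this repo. The function name `normalise_hosts` is hypothetical, and the reading of "remove subdomains" (drop a host when one of its parent domains is already in the same list) is an assumption.

```python
from typing import Iterable, List


def normalise_hosts(lines: Iterable[str]) -> List[str]:
    """Lowercase, de-duplicate, and drop hosts covered by a parent in the same list."""
    hosts = set()
    for line in lines:
        # Trim whitespace, lowercase, and drop a trailing root dot.
        host = line.strip().lower().rstrip(".")
        # Skip blank lines and comment lines.
        if not host or host.startswith("#"):
            continue
        hosts.add(host)

    kept = []
    for host in sorted(hosts):
        labels = host.split(".")
        # Candidate parents: every proper suffix with at least two labels.
        parents = (".".join(labels[i:]) for i in range(1, len(labels) - 1))
        # Assumption: a host counts as a redundant subdomain only if a parent is also listed.
        if not any(parent in hosts for parent in parents):
            kept.append(host)
    return kept
```

Calling `normalise_hosts(open("combined.txt"))` on a merged file (the file name is a placeholder) returns a sorted list with one entry per host. This sketch does not consult the Public Suffix List, so it never collapses a host to its registrable domain on its own.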

Jrchintu commented 2 years ago

> The input is a combination of several significantly large domain lists. I will put together a bash script that pulls and normalises the files (removing duplicates and subdomains) and create a PR for it.

Please keep an eye on the Alexa list; it was shut down on 1 May (https://alexa.com).

Also, consider checking https://tranco-list.eu/, which combines several top-1M lists.

qurbat commented 2 years ago

@captn3m0 I've created a draft PR for this. I'll keep you updated on the status.

captn3m0 commented 1 year ago

Following up on this. I have access to a few other ISPs for the next few weeks, so I want to re-run the scans with an updated input list.