juhanurmi / ahmia

Ahmia hidden service search engine
BSD 3-Clause "New" or "Revised" License
198 stars 64 forks

Another crawler to search .onion links from the public Internet #5

Closed juhanurmi closed 10 years ago

juhanurmi commented 10 years ago

Use another crawler to search for .onion pages on the public Internet. Look for new .onion domains in different online sources. Ask for help from organizations that already crawl the web. This is an excellent case for testing open source crawlers like Heritrix and Apache Nutch. Alternatively, use the search engines that already exist.

2 workweeks

juhanurmi commented 10 years ago

Heritrix and Apache Nutch are total overkill for this. There are a few good sites that list onion addresses. I will fetch new URLs from these sites daily.

Furthermore, with the Tor2web integration I am downloading the visit history from each Tor2web node, and this seems to be the best way to find new onions.
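
The daily fetch from listing sites mentioned above could look roughly like the sketch below. It is not the code Ahmia actually runs. The function names `extract_onions` and `fetch_onions`, the regular expression, and the idea of passing a list of source URLs are all assumptions made for illustration. The pattern accepts 16-character (v2) and 56-character (v3) base32 hostnames.

```python
import re
import urllib.request

# Assumption: an onion hostname is 16 (v2) or 56 (v3) base32 characters
# followed by ".onion". The longer alternative is tried first.
ONION_RE = re.compile(r"\b([a-z2-7]{56}|[a-z2-7]{16})\.onion\b", re.IGNORECASE)


def extract_onions(text):
    """Return the sorted, de-duplicated .onion hostnames found in text."""
    return sorted({m.group(1).lower() + ".onion" for m in ONION_RE.finditer(text)})


def fetch_onions(source_urls, timeout=30):
    """Download each listing page (hypothetical URLs) and collect onion hostnames."""
    found = set()
    for url in source_urls:
        try:
            with urllib.request.urlopen(url, timeout=timeout) as resp:
                html = resp.read().decode("utf-8", errors="replace")
        except OSError as exc:
            # URLError and socket timeouts are OSError subclasses; skip bad sources.
            print(f"skipping {url}: {exc}")
            continue
        found.update(extract_onions(html))
    return sorted(found)
```

Running `fetch_onions` once a day (for example from a cron job) with the list of listing sites would cover the "fetch new URLs daily" part. The same `extract_onions` helper could in principle be applied to any text dump, such as a visit history file, although the format of the Tor2web data is not described here.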