medialab / hyphe

Websites crawler with built-in exploration and control web interface
http://hyphe.medialab.sciences-po.fr/demo/
GNU Affero General Public License v3.0

Distinguish counts of crawled pages between successes and errors #425

Closed: boogheta closed 2 years ago

boogheta commented 2 years ago

Currently, a crawl's status reports a total page count that includes errors. So when a webentity does not exist on the web, for instance, its crawl will still report 1 crawled page (or even N if given multiple startpages), although the response was a 404 or 500. It would make more sense for users to see the number of successfully crawled pages (and maybe also the number of error pages).
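
To illustrate the idea, here is a minimal sketch of how the two counts could be separated. It is not Hyphe's actual code: the function name `count_crawled_pages`, the `status` field on each page record, and the choice to treat any HTTP status from 200 to 399 as a success are all assumptions made for the example.

```python
from typing import Dict, Iterable, Mapping


def count_crawled_pages(pages: Iterable[Mapping[str, int]]) -> Dict[str, int]:
    """Split a crawl's pages into successes and errors.

    Hypothetical helper: each page is assumed to be a mapping with an
    integer "status" key holding the HTTP status code of the fetch.
    """
    counts = {"crawled": 0, "errors": 0}
    for page in pages:
        # A missing status is treated as an error (assumption).
        status = page.get("status", 0)
        if 200 <= status < 400:
            counts["crawled"] += 1
        else:
            counts["errors"] += 1
    return counts


# A webentity whose only startpage returns a 404 would then show
# 0 crawled pages and 1 error instead of 1 crawled page.
print(count_crawled_pages([{"status": 404}]))
```

With something like this, the crawl status could display the `crawled` value as the main number and show `errors` next to it only when it is non-zero.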