xdvom03 / klaus

Bayesian text classification of websites in a nested class system
Creative Commons Zero v1.0 Universal

Consider redirects in crawlers #68

Closed: xdvom03 closed this 3 years ago

xdvom03 commented 3 years ago

Corollary of #57: if a site redirects, it should not be used for training (the redirect target is preferred), and a warning should be shown. Crawlers should also accept redirects; otherwise their domain hop/domain stay system can get messed up.
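A minimal sketch of the idea, not the code actually used in klaus. It assumes the crawler can learn the final URL after redirects (here via the standard library's `urlopen(...).geturl()`); the function names `final_url`, `is_redirect`, `is_domain_hop` and `training_url` are hypothetical, and treating a "domain" as the exact hostname is also an assumption.

```python
from urllib.parse import urlsplit
from urllib.request import urlopen


def final_url(url, timeout=10):
    """Fetch url, following redirects, and return the URL that answered."""
    with urlopen(url, timeout=timeout) as response:
        return response.geturl()


def _normalize(url):
    # Compare URLs loosely: case-insensitive scheme/host, no trailing slash.
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    return parts.scheme.lower(), host, parts.port, parts.path.rstrip("/"), parts.query


def is_redirect(requested, final):
    """True if the page that answered is not the page that was requested."""
    return _normalize(requested) != _normalize(final)


def is_domain_hop(origin, final):
    """True if the final URL lives on a different hostname than the origin.

    Assumption: "domain" means the exact hostname, so www.example.com and
    example.com count as different domains here.
    """
    return _normalize(origin)[1] != _normalize(final)[1]


def training_url(requested, final, warn=print):
    """Return the URL to train on, preferring the redirect target and warning."""
    if is_redirect(requested, final):
        warn(f"{requested} redirects to {final}; using the target instead")
        return final
    return requested
```

A crawler would call `final_url` once per fetched page and then base its hop/stay decision on `is_domain_hop(origin, final)` rather than on the URL it originally requested.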

xdvom03 commented 3 years ago

The whole origin-url system now deals with this.