ctwardy closed this issue 7 years ago
Current code removes HTTP errors from results. Don't. Our main use case now involves cached HTML from SiteHound -- if that cached page is bad, we want to classify it as an error page.
Hence: include rules for labeling "error".
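A minimal sketch of what such rules might look like, assuming each cached page arrives as a `(url, html, status)` tuple where `status` may be missing. The names `label_page`, `label_all`, and `ERROR_TITLE`, the title patterns, and the choice to treat empty HTML as an error are all illustrative assumptions, not the project's actual code:

```python
import re

# Hypothetical rule: titles that typically appear on server error pages.
# A real rule set would be tuned against actual SiteHound cache contents.
ERROR_TITLE = re.compile(
    r"<title[^>]*>[^<]*\b(404|not found|403|forbidden|500|"
    r"internal server error|502|bad gateway|503|service unavailable)\b"
    r"[^<]*</title>",
    re.IGNORECASE,
)


def label_page(html, status=None):
    """Return "error" for error pages, else None so other rules can run."""
    # Rule 1: an HTTP status of 400 or above, when one was recorded.
    if status is not None and status >= 400:
        return "error"
    # Rule 2 (assumption): empty or whitespace-only cached HTML is an error.
    if not html or not html.strip():
        return "error"
    # Rule 3: the page title looks like a standard error page.
    if ERROR_TITLE.search(html):
        return "error"
    return None


def label_all(pages):
    """Label every page; error pages are kept in the results, not dropped."""
    return [(url, label_page(html, status)) for url, html, status in pages]
```

The point of the sketch is that an error page gets a label and stays in the output, instead of being filtered out before classification.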
Fixed in #13, but not fully integrated.