Closed: snsb-seifert closed this issue 4 years ago
The crawler fails on datasets whose server redirects the crawler with HTTP status 308 (Permanent Redirect, https://tools.ietf.org/html/rfc7538).
Cause: The Apache HttpClient used in gbif.httputil does not recognize status 308. The crawler relies on that client for downloads in https://github.com/gbif/crawler/blob/a2ebaaac77448c0045948f0d8c0a0ee84b642bd6/crawler-cli/src/main/java/org/gbif/crawler/common/DownloadCrawlConsumer.java#L55
Solution: Implement a new redirectStrategy that supports HTTP status 308 in HttpUtil https://github.com/gbif/gbif-httputils/blob/3af4fc6d6670f2e6507c245a3c11662e6ac04815/src/main/java/org/gbif/utils/HttpUtil.java#L204
or switch to Apache HttpComponents Core 5 (HttpCore 5), which knows about status 308:
https://hc.apache.org/httpcomponents-core-5.0.x/httpcore5/apidocs/org/apache/hc/core5/http/HttpStatus.html
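As a rough illustration of the first option, the decision a 308-aware redirect strategy has to make could look like the sketch below. It uses only the JDK so that it stays self-contained. `RedirectPolicy`, `isRedirected` and `resolveTarget` are hypothetical names, not part of gbif-httputils or Apache HttpClient, and the restriction to GET and HEAD is an assumption made to keep the example conservative.

```java
import java.net.URI;
import java.util.Locale;
import java.util.Set;

/** Hypothetical helper: decides whether a response should be followed as a redirect. */
final class RedirectPolicy {

  // The classic redirect codes plus 308 Permanent Redirect (RFC 7538).
  private static final Set<Integer> REDIRECT_CODES = Set.of(301, 302, 303, 307, 308);

  // Assumption: only methods that are safe to repeat are followed automatically.
  private static final Set<String> SAFE_METHODS = Set.of("GET", "HEAD");

  private RedirectPolicy() {}

  /** Returns true if the status is a redirect, the method is safe, and a Location is present. */
  static boolean isRedirected(int status, String method, String location) {
    if (location == null || location.isBlank()) {
      return false; // nothing to follow without a Location header value
    }
    return REDIRECT_CODES.contains(status)
        && SAFE_METHODS.contains(method.toUpperCase(Locale.ROOT));
  }

  /** Resolves a possibly relative Location header value against the original request URI. */
  static URI resolveTarget(URI requestUri, String location) {
    return requestUri.resolve(location.trim());
  }
}
```

In HttpUtil the same check would presumably live inside the redirect strategy handed to the HTTP client; that wiring depends on the client version in use and is not shown here.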