Closed by sebastian-nagel 8 years ago
For the WARC file, the problem is caused by the following CSS snippet:
#services .avia-logo-element-container img {
filter: url(\"");
filter: none;
-webkit-filter: none;
}
The length check in the method patternCSSExtract is insufficient: 4 characters are removed (2 at each end), so the URL must be at least 4 characters long, but only a length of exactly 2 is rejected:
} else if (url.charAt(0) == '\\') {
if(url.length() == 2)
continue;
url = url.substring(2, origUrlLength - 2);
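A minimal sketch of a stricter length check, to illustrate the point above. The class `CssUrlTrim` and the helper `stripEscapedQuotes` are hypothetical names introduced here for illustration and are not part of the original code; returning `null` stands in for the `continue` in the original loop.

```java
public class CssUrlTrim {

    // Hypothetical helper (not from the original code base): trims a leading
    // and a trailing escaped quote (2 characters each) from a CSS url() value.
    // Returns null when the value is too short to trim, which corresponds to
    // skipping the match in the original loop.
    static String stripEscapedQuotes(String url) {
        int len = url.length();
        if (len == 0 || url.charAt(0) != '\\') {
            return url; // nothing to strip
        }
        // Two characters are cut from each end, so anything shorter than
        // 4 characters would make substring(2, len - 2) throw.
        if (len < 4) {
            return null;
        }
        return url.substring(2, len - 2);
    }

    public static void main(String[] args) {
        // The value taken from url(\"") is the 3-character string \""
        System.out.println(stripEscapedQuotes("\\\"\""));
        // A well-formed escaped value: \"a.png\"
        System.out.println(stripEscapedQuotes("\\\"a.png\\\""));
    }
}
```

With the original check (`url.length() == 2`), the 3-character value passes through and `substring(2, 1)` throws the StringIndexOutOfBoundsException; checking `length() < 4` instead rejects it before the substring call.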
The WEATGenerator chokes on some WARC files and fails with a StringIndexOutOfBoundsException thrown by ExtractingParseObserver.