What steps will reproduce the problem?
Download data from a website whose Content-Type header carries extra parameters after the charset.
This header works with the library:
Content-Type: application/rss+xml; charset=utf-8
This header does not:
Content-Type: application/rss+xml; charset=utf-8; filename=rssfeed.xml
What is the expected output? What do you see instead?
The encoding should be detected as UTF-8, but it is detected as utf-8filenamerssfeed.xml.
What version of the product are you using? On what operating system?
V0.24.3
Please provide any additional information below.
Changing the parseCharset method as follows solves the problem. It looks for a semicolon after the charset and, if one exists, reads only up to that point.
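private String parseCharset(String tag) {
    if (tag == null)
        return null;
    int i = tag.indexOf("charset");
    if (i == -1)
        return null;
    // Stop at the next ';' (if any) so that later parameters such as
    // filename=... are not read into the charset name.
    int e = tag.indexOf(";", i);
    if (e == -1)
        e = tag.length();
    // Skip "charset" (7 characters), then drop everything except word
    // characters and hyphens ('=', quotes, whitespace).
    String charset = tag.substring(i + 7, e).replaceAll("[^\\w-]", "");
    return charset;
}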
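To check the change outside the library, here is a minimal standalone sketch that runs the patched logic against the two headers from this report. The class name CharsetParseDemo is hypothetical and not part of the library, and the method is made static here only so that it can be called from main.

```java
// Hypothetical demo class; only the body of parseCharset comes from the patch above.
class CharsetParseDemo {

    // Same logic as the proposed fix, made static for standalone use.
    static String parseCharset(String tag) {
        if (tag == null)
            return null;
        int i = tag.indexOf("charset");
        if (i == -1)
            return null;
        // Read only up to the next ';' so trailing parameters are ignored.
        int e = tag.indexOf(";", i);
        if (e == -1)
            e = tag.length();
        // Skip "charset", then keep only word characters and hyphens.
        return tag.substring(i + 7, e).replaceAll("[^\\w-]", "");
    }

    public static void main(String[] args) {
        String simple = "application/rss+xml; charset=utf-8";
        String complex = "application/rss+xml; charset=utf-8; filename=rssfeed.xml";

        System.out.println(parseCharset(simple));   // prints utf-8
        System.out.println(parseCharset(complex));  // prints utf-8
        System.out.println(parseCharset("application/rss+xml")); // prints null
    }
}
```

With the semicolon check in place, both headers yield utf-8, and a header without a charset parameter still yields null.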
Original issue reported on code.google.com by roxbur...@gmail.com on 13 Dec 2012 at 11:14