Closed GoogleCodeExporter closed 8 years ago
Check the forum for the discussion of this encoding topic.
Original comment by sjdir...@gmail.com
on 8 Jul 2013 at 2:56
title should be "encoding of each page"
Original comment by sjdir...@gmail.com
on 8 Jul 2013 at 3:12
miljours@.....com
I found this solution for a page whose charset is iso-8859-1.
// Fix: use the Latin charset reported by the response when reading the stream.
using (StreamReader sr = new StreamReader(response.GetResponseStream(), Encoding.GetEncoding(response.CharacterSet)))
This is in PageRequester.cs, line 131. I don't know if it is useful to share.
Original comment by sjdir...@gmail.com
on 18 Jul 2013 at 4:19
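The workaround in the comment above passes the response's CharacterSet straight to Encoding.GetEncoding, which throws if the server sends no charset or an unrecognized one. A minimal sketch of a more defensive variant follows. The names ResolveEncoding and ReadAll are hypothetical helpers, and falling back to UTF-8 is an assumption made for illustration; neither is taken from the crawler's source.

```csharp
using System;
using System.IO;
using System.Text;

public static class CharsetDecoding
{
    // Hypothetical helper: maps a charset name (for example the value of
    // HttpWebResponse.CharacterSet) to an Encoding.
    // Assumption: fall back to UTF-8 when the name is missing or unknown.
    public static Encoding ResolveEncoding(string charset)
    {
        if (string.IsNullOrWhiteSpace(charset))
            return Encoding.UTF8;

        try
        {
            // Some servers quote the charset value, so strip quotes first.
            return Encoding.GetEncoding(charset.Trim().Trim('"'));
        }
        catch (Exception ex) when (ex is ArgumentException || ex is NotSupportedException)
        {
            return Encoding.UTF8;
        }
    }

    // Hypothetical helper: reads the whole body using the resolved encoding.
    public static string ReadAll(Stream body, string charset)
    {
        using (var reader = new StreamReader(body, ResolveEncoding(charset)))
        {
            return reader.ReadToEnd();
        }
    }
}
```

With an HttpWebResponse this could be called as CharsetDecoding.ReadAll(response.GetResponseStream(), response.CharacterSet). On .NET Core and later, charsets such as windows-1252 are only available after registering CodePagesEncodingProvider, so they would hit the fallback here unless that provider is registered.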
Original comment by sjdir...@gmail.com
on 3 Sep 2013 at 1:49
Added auto encoding and CrawledPage.Content.Bytes, which should allow data to be
written to a file stream without corruption.
Original comment by sjdir...@gmail.com
on 17 Sep 2013 at 2:43
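The reason raw bytes avoid corruption can be sketched without the crawler itself: decoding bytes with the wrong encoding and re-encoding them changes the data, while writing the bytes directly does not. The class and method names below are hypothetical, and the byte array stands in for whatever CrawledPage.Content.Bytes would hold.

```csharp
using System.IO;
using System.Linq;
using System.Text;

public static class RawBytesDemo
{
    // Writes the bytes to disk unchanged; no text decoding is involved.
    public static void SaveRaw(string path, byte[] bytes)
    {
        File.WriteAllBytes(path, bytes);
    }

    // Returns true if decoding and re-encoding with the given encoding
    // reproduces the original bytes exactly.
    public static bool SurvivesTextRoundTrip(byte[] bytes, Encoding encoding)
    {
        byte[] roundTripped = encoding.GetBytes(encoding.GetString(bytes));
        return bytes.SequenceEqual(roundTripped);
    }
}
```

For example, the iso-8859-1 bytes for "café" (0x63 0x61 0x66 0xE9) do not survive a round trip through Encoding.UTF8, because 0xE9 on its own is not valid UTF-8 and is replaced during decoding. Saving them with SaveRaw keeps them intact.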
[deleted comment]
I'm having errors with the encoding.
How can I tell my crawler to use Encoding.UTF7 as the encoding type?
Original comment by TysH...@gmail.com
on 16 Oct 2013 at 11:17
This is not released yet. In the meantime, you can read these two threads for
workarounds:
https://groups.google.com/forum/#!topic/abot-web-crawler/lIGxg0oPmTc
https://groups.google.com/forum/#!topic/abot-web-crawler/-U9MDiSBbGM
Original comment by sjdir...@gmail.com
on 17 Oct 2013 at 6:57
Issue 123 has been merged into this issue.
Original comment by sjdir...@gmail.com
on 3 Jan 2014 at 3:06
Original issue reported on code.google.com by
sjdir...@gmail.com
on 8 Jul 2013 at 2:55