Closed GoogleCodeExporter closed 9 years ago
Thanks for reporting. Will publish the fix soon.
-Yasser
Original comment by ganjisaffar@gmail.com
on 5 Jan 2012 at 1:45
The fix is committed to the source repository:
http://code.google.com/p/crawler4j/source/detail?r=dbc9d3cb0d1efde4431f68b8417ee2ed5d551a43
It will be included in the next release.
-Yasser
Original comment by ganjisaffar@gmail.com
on 6 Jan 2012 at 5:53
Some issues occurred with URLCanonicalizer:
1. URLCanonicalizer appended an equals sign "=" to my URL, producing
http://somedomain.com/uploads/1/0/2/5/10259653/6199347.jpg?1325154037= when my
URL was http://somedomain.com/uploads/1/0/2/5/10259653/6199347.jpg?1325154037.
2. When a redirect happens, the fetchHeader method in PageFetcher converts the
redirected URL to all lowercase.
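
For illustration, here is a minimal sketch of how the two behaviors reported above could be avoided. It is not crawler4j's actual code: the class and method names (UrlCanonicalizationSketch, canonicalizeQuery, lowerCaseSchemeAndHost) are hypothetical, and the sketch assumes URLs without a user-info part. The idea is to sort query parameters without adding "=" to a parameter that has no value, and to lowercase only the scheme and host so that the case of the path and query is kept.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of crawler4j.
public class UrlCanonicalizationSketch {

    // Returns the parameter name: everything before the first '=', or the
    // whole token when there is no '='.
    private static String keyOf(String param) {
        int eq = param.indexOf('=');
        return eq < 0 ? param : param.substring(0, eq);
    }

    // Sorts query parameters by name. Each parameter is kept exactly as
    // written, so "1325154037" stays "1325154037" and is never turned into
    // "1325154037=".
    static String canonicalizeQuery(String query) {
        if (query == null || query.isEmpty()) {
            return "";
        }
        List<String> params = new ArrayList<>();
        for (String p : query.split("&")) {
            if (!p.isEmpty()) {
                params.add(p);
            }
        }
        params.sort(Comparator.comparing(UrlCanonicalizationSketch::keyOf));
        return String.join("&", params);
    }

    // Lowercases the scheme and authority only. Path and query are left
    // untouched because they can be case-sensitive on the server.
    // Assumption: the authority carries no user-info part.
    static String lowerCaseSchemeAndHost(String url) {
        int schemeEnd = url.indexOf("://");
        if (schemeEnd < 0) {
            return url;
        }
        int authorityEnd = url.length();
        for (int i = schemeEnd + 3; i < url.length(); i++) {
            char c = url.charAt(i);
            if (c == '/' || c == '?' || c == '#') {
                authorityEnd = i;
                break;
            }
        }
        return url.substring(0, authorityEnd).toLowerCase(Locale.ROOT)
                + url.substring(authorityEnd);
    }

    public static void main(String[] args) {
        System.out.println(canonicalizeQuery("1325154037"));
        System.out.println(lowerCaseSchemeAndHost(
                "HTTP://SomeDomain.com/Uploads/6199347.JPG?1325154037"));
    }
}
```

With this approach a valueless parameter and an empty-valued one ("key" and "key=") remain distinct, which matters because a server may treat them as different URLs.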
Original comment by mansur.u...@gmail.com
on 8 Jan 2012 at 4:43
[deleted comment]
This is not fixed.
Original comment by alexnosp...@gmail.com
on 2 Feb 2013 at 11:05
Do a simple System.out.println(page.getWebURL().getURL()); to see which URL
crawler4j is visiting. So bad.
Original comment by alexnosp...@gmail.com
on 2 Feb 2013 at 11:06
java.lang.StringIndexOutOfBoundsException: String index out of range: -8 is
thrown when you do CrawlNow. It seems crawler4j is completely unusable for me now.
Original comment by alexnosp...@gmail.com
on 3 Feb 2013 at 1:02
This issue was closed by revision 183d98a269db.
Original comment by ganjisaffar@gmail.com
on 3 Mar 2013 at 7:08
Original issue reported on code.google.com by
tahs...@trademango.com
on 5 Jan 2012 at 12:10