Closed GoogleCodeExporter closed 9 years ago
Hi,
Would you please let me know which version you are using? I tried this domain
and couldn't reproduce this bug. There have been bugs in WebURL that I
fixed in the latest version.
Thanks,
Yasser
Original comment by ganjisaffar@gmail.com
on 5 Mar 2012 at 6:52
I've seen this issue since a straight upgrade to 3.3 as well.
Original comment by DarenDa...@gmail.com
on 6 Mar 2012 at 11:15
I also experience this bug with 3.3.
Original comment by try6...@gmail.com
on 16 Aug 2012 at 8:40
I believe this is caused when the URL is empty.
I would add some validation of the URL, to make sure it's not null or empty,
and also some checks on the value of domainEndIdx, to make sure it's
not negative or smaller than domainStartIdx, which would cause the substring
call to fail.
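A minimal sketch of that kind of validation follows. It is not the actual WebURL code: the class DomainExtractor and the method extractDomain are hypothetical names used only for illustration, and only domainStartIdx and domainEndIdx come from the description above. It assumes the domain is the text between "//" and the next "/".

```java
// Hypothetical helper for illustration only; not crawler4j's WebURL implementation.
class DomainExtractor {

    // Returns the domain part of the URL, or null when it cannot be determined.
    static String extractDomain(String url) {
        // Guard against null or empty input before doing any index arithmetic.
        if (url == null || url.isEmpty()) {
            return null;
        }

        // Assumption: the domain starts right after "//"; with no scheme, start at 0.
        int schemeIdx = url.indexOf("//");
        int domainStartIdx = (schemeIdx < 0) ? 0 : schemeIdx + 2;

        // indexOf returns -1 when there is no path separator after the domain.
        int domainEndIdx = url.indexOf('/', domainStartIdx);
        if (domainEndIdx < 0) {
            domainEndIdx = url.length();
        }

        // Reject an empty or inverted range instead of letting substring throw.
        if (domainEndIdx <= domainStartIdx) {
            return null;
        }
        return url.substring(domainStartIdx, domainEndIdx);
    }
}
```

With these checks, inputs such as an empty string or "http://" yield null rather than an IndexOutOfBoundsException, and the caller can decide whether to skip the link.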
Original comment by try6...@gmail.com
on 16 Aug 2012 at 10:19
I ran across this issue while trying to crawl boingboing.net. After doing some
digging, I discovered that the way boingboing does their "share article" ->
email (javascript) breaks the crawler.
The issue is that the "mailto" link doesn't supply an email address; instead it
says "type email address here" so the if statement specifying "@" in
Parser.parse() doesn't get hit.
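One way to avoid depending on the "@" check would be to filter by scheme instead. The sketch below is only an illustration of that idea, assuming the parser decides link by link whether to keep an href. LinkFilter and isCrawlableHref are hypothetical names, not part of crawler4j.

```java
import java.util.Locale;

// Hypothetical helper for illustration only; not part of crawler4j's Parser.
class LinkFilter {

    // Returns true when the href looks like something a crawler should follow.
    static boolean isCrawlableHref(String href) {
        if (href == null) {
            return false;
        }
        String trimmed = href.trim();
        if (trimmed.isEmpty()) {
            return false;
        }
        // Compare case-insensitively, since schemes may be written in any case.
        String lower = trimmed.toLowerCase(Locale.ROOT);
        // Skip by scheme, so a mailto link without an "@" is still excluded.
        return !lower.startsWith("mailto:") && !lower.startsWith("javascript:");
    }
}
```

Under this assumption, an href such as "mailto:type email address here" is skipped even though it contains no email address.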
Original comment by uva...@gmail.com
on 2 Jan 2013 at 3:43
Original comment by avrah...@gmail.com
on 18 Aug 2014 at 3:13
http://eventiesagre.it/
and
http://boingboing.net/
are now being crawled without any incident.
If anybody sees an error related to this bug, please report it.
Original comment by avrah...@gmail.com
on 19 Aug 2014 at 2:40
Fixed original IndexOutOfBoundsException in revision: 65954e30f219
Original comment by avrah...@gmail.com
on 19 Aug 2014 at 3:37
Original issue reported on code.google.com by
michele.mostarda
on 2 Mar 2012 at 2:03