Closed GoogleCodeExporter closed 9 years ago
I've gone back to the weekly release from February 4th and still see this
issue. Not sure if there's something wrong with my implementation or if this
API has been broken for a while...
Original comment by dave.h...@gmail.com
on 26 Mar 2013 at 7:45
Does this happen only on bodgeit store? Because I tested it with a couple of
other websites and it seems like it's working...
Apparently, from the logs, the response returned by querying:
http://localhost:8080/bodgeit doesn't seem to be text...
Original comment by cosminst...@gmail.com
on 26 Mar 2013 at 10:52
It does happen on other sites - Dave found the problem when trying to spider a
Mozilla site.
Does the spider follow 302 redirections?
Because the response to GET http://localhost:8080/bodgeit is
HTTP/1.1 302 Found
Server: Apache-Coyote/1.1
Location: http://localhost:8080/bodgeit/
Date: Wed, 27 Mar 2013 08:54:01 GMT
Content-length: 0
While the one to GET http://localhost:8080/bodgeit/ (note the trailing slash) is
HTTP/1.1 200 OK
Server: Apache-Coyote/1.1
Set-Cookie: JSESSIONID=F4A23C270579454C60EE56AF323CE69A; Path=/bodgeit/; HttpOnly
Content-Type: text/html;charset=ISO-8859-1
Content-Length: 3171
Date: Wed, 27 Mar 2013 08:52:34 GMT
etc...
And although I specify http://localhost:8080/bodgeit/ in the API call, it looks
like the spider is trying to access http://localhost:8080/bodgeit
Original comment by psii...@gmail.com
on 27 Mar 2013 at 8:56
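For anyone following along, here is a rough Python sketch of what "following the 302" means for the two responses quoted in the comment above. It is not ZAP's spider code and makes no claim about how the spider is implemented. The `RESPONSES` table and the `resolve` helper are hypothetical names invented for illustration; the table stands in for the server, so nothing touches the network.

```python
from urllib.parse import urljoin

# Hypothetical stand-in for the server: URL -> (status, headers).
# The two entries mirror the responses quoted in the comment above.
RESPONSES = {
    "http://localhost:8080/bodgeit": (
        302, {"Location": "http://localhost:8080/bodgeit/"}),
    "http://localhost:8080/bodgeit/": (
        200, {"Content-Type": "text/html;charset=ISO-8859-1"}),
}

REDIRECT_CODES = {301, 302, 303, 307, 308}


def resolve(url, fetch=RESPONSES.get, max_hops=5):
    """Follow redirects starting at url.

    Returns (final_url, final_status, redirected_urls).
    """
    hops = []
    for _ in range(max_hops + 1):
        status, headers = fetch(url)
        # Stop at the first response that is not a usable redirect.
        if status not in REDIRECT_CODES or "Location" not in headers:
            return url, status, hops
        hops.append(url)
        # Location may be relative, so resolve it against the current URL.
        url = urljoin(url, headers["Location"])
    raise RuntimeError("too many redirects")


final_url, status, hops = resolve("http://localhost:8080/bodgeit")
print(final_url, status, hops)
```

A crawler that stops at the 302 only ever sees the empty body (Content-length: 0), so it has no links to extract; one that follows the Location header reaches the 200 page with the trailing slash.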
I've attached a zap.log file of a job that was using the API to try to spider
the Mozilla Marketplace dev site. It never returns from the API call... In
other instances (such as Bodge It) it returns quickly but doesn't appear to
have spidered at all. The two issues may be unrelated.
Original comment by dave.h...@gmail.com
on 27 Mar 2013 at 11:58
The active scan is also failing for me using the API with very similar
behaviour. Could this be related or would you prefer me to open a new issue?
Original comment by dave.h...@gmail.com
on 28 Mar 2013 at 12:48
It might be related as the scanner will only scan the pages that have been
accessed, either manually or with the spider. If your use case depends on the
pages found by the spider (and the spider is not spidering), then it is most likely
related to this issue. If it doesn't depend on the spider, it would be better
to create a new issue.
I've raised two issues (Issue 582 and Issue 583) to fix the exceptions that
were logged in the previous attached log file (comment #4) (those issues do not
affect this issue, though).
Original comment by THC...@gmail.com
on 30 Mar 2013 at 1:37
I've just committed the changes for proper handling of HTTP redirection by the
spider (r3020). Thanks for pointing this out, Dave.
Original comment by cosminst...@gmail.com
on 7 Apr 2013 at 11:12
Original comment by psii...@gmail.com
on 8 Apr 2013 at 8:07
I can confirm that this works for me in the latest weekly release.
Original comment by dave.h...@gmail.com
on 15 Apr 2013 at 9:10
\o/
Original comment by psii...@gmail.com
on 15 Apr 2013 at 9:11
Fixed in 2.1.0
Original comment by psii...@gmail.com
on 18 Apr 2013 at 9:49
Original issue reported on code.google.com by
psii...@gmail.com
on 26 Mar 2013 at 6:20