GistLabs / mechanize

mechanize for java
http://gistlabs.com/software/mechanize-for-java/
Mozilla Public License 2.0

Redirect doesn't happen on a form sign in #36

Closed wr0ngway closed 12 years ago

wr0ngway commented 12 years ago

From the code in #34, after the form gets submitted, the resulting page is a redirect (with an empty body, which is what causes the trace in #34). Shouldn't the submit return a Page that is the result of following the redirect?

jheintz commented 12 years ago

It should, but I'd assumed HttpClient would handle that. It might be that Mechanize will need to handle redirects.

MartinKersten commented 12 years ago

The Wikipedia search test in org.wikipedia triggers a redirect that is handled automatically, so HttpClient already handles it. Or maybe this is a different kind of redirect.

Did you set up the underlying HttpClient with a different setup? Is it PC or Android?

wr0ngway commented 12 years ago

All I did was run it as a JUnit test case from within a mechanize project in Eclipse, using the code as shown in #34. It follows some redirects, but not all: agent.get(manageKindleUrl) redirects to the sign-in page, which happens correctly in mechanize, but the form.submit() from the sign-in page does not follow the redirect.

jheintz commented 12 years ago

I was debugging through this test case too; it's a bit slow going.

Do either of you know the easiest way to configure the logging in HttpClient to be more verbose in this case?

wr0ngway commented 12 years ago

http://hc.apache.org/httpcomponents-client-ga/logging.html

jheintz commented 12 years ago

Thanks for the pointer, I'd seen that and read about the various choices... Is it as simple as including our own log4j.properties file that configures HttpClient?

wr0ngway commented 12 years ago

Yes, if you have log4j as a dependency. Otherwise I think you have to pass a system property to tell the built-in java.util.logging where to find the properties file.

MartinKersten commented 12 years ago

I used SimpleLogger (?) once to get log information out of HttpClient. I guess the logging properties might still be in the project (unsure). In the end I added commons-logging and used the Java logging mechanism. Worked well.

MartinKersten commented 12 years ago

I remember now. You have to pass the logging parameters as application parameters (setting them along with the Java args). If you put them in a file, it won't be loaded or used. The logging properties file is only used for classes within the project, not for classes within the HttpClient framework.
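
For anyone who would rather set these from code than on the command line, here is a minimal sketch. It assumes HttpClient 4.x with Commons Logging on the classpath and uses the SimpleLog property names listed in the HttpClient logging guide linked above. The class name HttpClientLogging is made up for this example, and the properties only take effect if they are set before the first HttpClient class is loaded.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: sets the SimpleLog system properties for HttpClient 4.x. */
class HttpClientLogging {

    /** Property names as given in the HttpClient logging guide (SimpleLog variant). */
    static Map<String, String> debugProperties() {
        Map<String, String> props = new LinkedHashMap<String, String>();
        // Route Commons Logging to its built-in SimpleLog implementation.
        props.put("org.apache.commons.logging.Log",
                  "org.apache.commons.logging.impl.SimpleLog");
        props.put("org.apache.commons.logging.simplelog.showdatetime", "true");
        // Context and header logging at DEBUG, raw wire content only on errors.
        props.put("org.apache.commons.logging.simplelog.log.org.apache.http", "DEBUG");
        props.put("org.apache.commons.logging.simplelog.log.org.apache.http.wire", "ERROR");
        return props;
    }

    /** Equivalent to passing each entry as a -Dname=value JVM argument. */
    static void enable() {
        for (Map.Entry<String, String> e : debugProperties().entrySet()) {
            System.setProperty(e.getKey(), e.getValue());
        }
    }

    public static void main(String[] args) {
        enable();
        for (String name : debugProperties().keySet()) {
            System.out.println(name + "=" + System.getProperty(name));
        }
    }
}
```

Calling HttpClientLogging.enable() at the very start of a test (for example in a static initializer) should have the same effect as the -D arguments, provided nothing has touched HttpClient before that point.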

MartinKersten commented 12 years ago

Anything new about this issue?

MartinKersten commented 12 years ago

Since this seems to be rather HttpClient related, I am closing this bug. If anyone wants to reopen it, please go ahead.

LouisStAmour commented 12 years ago

These docs indicate that HttpClient does indeed consider redirects a client responsibility: http://hc.apache.org/httpclient-3.x/redirects.html

However when you look for more details, there are some configuration params: http://hc.apache.org/httpcomponents-client-ga/tutorial/html/httpagent.html#d4e967 says ...

HttpClient handles all types of redirects automatically, except those explicitly prohibited by the HTTP specification as requiring user intervention. See Other (status code 303) redirects on POST and PUT requests are converted to GET requests as required by the HTTP specification.

These are parameters that can be used to customize the behaviour of the default HttpClient implementation:

This Stack Overflow page has suggestions on how to set HANDLE_REDIRECTS: http://stackoverflow.com/questions/1519392/how-to-prevent-apache-http-client-from-following-a-redirect

As an example, Ruby's mechanize has the following method, which indicates that for Ruby, at least, the library manually deals with redirects, and in a much less granular fashion than HttpClient.

redirect_ok() - Also aliased as: follow_redirect? Controls how mechanize deals with redirects. The following values are allowed:

efung commented 11 years ago

If you're using HttpComponents HttpClient 4.2.x, there is a LaxRedirectStrategy which will follow 302 redirects on POST requests. I got it to work like this:

MechanizeAgent agent = new MechanizeAgent();
AbstractHttpClient client = agent.getClient();
client.setRedirectStrategy(new LaxRedirectStrategy());

I think you can do something similar in HttpClient 4.0.x by extending/implementing RedirectHandler. I haven't tried yet, but will do so, as my ultimate goal is to use Mechanize in Android.
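
To make the difference between the default and the lax behaviour concrete, here is a small self-contained model of the decision, not HttpClient's actual code. It assumes the behaviour described in this thread and in the 4.2.x docs: the default strategy follows 301/302/307 only for GET and HEAD, the lax one also follows them for POST, and a 303 is followed for any method. The class RedirectPolicy and its methods are hypothetical names used only for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

/** Simplified, hypothetical model of a redirect strategy's "should I follow?" check. */
class RedirectPolicy {

    private final Set<String> redirectableMethods;

    RedirectPolicy(String... methods) {
        this.redirectableMethods = new HashSet<String>(Arrays.asList(methods));
    }

    /** Assumed default behaviour: only GET and HEAD are redirected automatically. */
    static RedirectPolicy strict() {
        return new RedirectPolicy("GET", "HEAD");
    }

    /** Assumed lax behaviour: POST is redirected as well. */
    static RedirectPolicy lax() {
        return new RedirectPolicy("GET", "HEAD", "POST");
    }

    /** Decides whether a response with the given status is followed for this method. */
    boolean shouldFollow(String method, int status) {
        boolean redirectable =
                redirectableMethods.contains(method.toUpperCase(Locale.ROOT));
        switch (status) {
            case 301:
            case 302:
            case 307:
                return redirectable; // depends on the request method
            case 303:
                return true;         // See Other is followed for any method
            default:
                return false;        // not a redirect this model handles
        }
    }

    public static void main(String[] args) {
        // The case from this issue: a form POST answered with a 302.
        System.out.println("strict, POST 302: " + strict().shouldFollow("POST", 302));
        System.out.println("lax,    POST 302: " + lax().shouldFollow("POST", 302));
    }
}
```

Under this model the sign-in form's POST followed by a 302 is exactly the case the strict policy refuses and the lax policy accepts, which matches what was observed with form.submit().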

jheintz commented 11 years ago

What would you recommend I put into the mechanize code? Should we default to a RedirectHandler for this?

efung commented 11 years ago

I wasn't suggesting that anything be added to Mechanize itself, just wanted to leave a note in case other people were trying to get POST redirects to work.

I just tried using HttpComponents 4.0-beta2 on the desktop, to try to match Android's version. It turns out that the DefaultRedirectHandler in this version already supports redirects on POST.