headzoo / surf

Stateful programmatic web browsing in Go.
MIT License
1.48k stars 159 forks source link

Adds a generic Call method. #59

Closed danmux closed 7 years ago

danmux commented 7 years ago

Call takes a header param and returns the full repsonse.

danmux commented 7 years ago

doh! thanks @jtwatson

lxt2 commented 7 years ago

I'm keen on this in concept too, but it exposes an API unlike any of the other Browseable methods. I wonder if it might make more sense to introduce something like this at a deeper layer rather than poking a hole right through the Browser abstraction.

lxt2 commented 7 years ago

It dawns on me that if we exposed RoundTrip(r *http.Request) (*http.Response, error), you could use the Browser as a Transport in other spots.

headzoo commented 7 years ago

@lxt2 - You're right, at some point I made a conscious choice to leave out such a low level request method. If only to enforce the notion that Surf is designed to be a browser, and not just another HTTP client.

It might help if we ask some pull request authors to provide a use case so we can judge the usefulness of the changes. I can see how the Call() method would be useful. Particularly when the code is reading in requests from a database or text file. I'd be interested in hearing how @danmux plans on using it.

lxt2 commented 7 years ago

I can explain my usecase. I am working on adding an embedded JS engine and I am exposing a XHR interface. I'd like that to use the underlying Browser object.

Having access to Browser as a RoundTripper or having a Fetch or Call method would be very useful for those sorts of integrations.

lxt2 commented 7 years ago

My vote would be to expose a RoundTrip method, much like http.Client exposes Do as well as helpers for the various http verbs.

lxt2 commented 7 years ago

But there are probably other ways we could slice that, for instance we could make Browser.Client public, that way you could cast down to that and use it's internals if you so chose. Then we'd avoid changing the abstraction level of Browseable.

lxt2 commented 7 years ago

(Sorry for hijacking your issue @danmux)

headzoo commented 7 years ago

I'm fine with adding the Call() method. It sounds like a method most devs would expect to exist. I wonder though if the verb methods (e.g. Open(), Post(), Head()) should be piped to the Call() method. In other words, the verb methods would be simple convenience wrappers for the lower level Call() method.

lxt2 commented 7 years ago

If we go the Call direction, it should probably return just error and you could get at everything else via the usual ways.

headzoo commented 7 years ago

we could make Browser.Client public

I could be wrong, but I think devs would lose the use of the underlying mechanisms which are core to the library, e.g. the cookie management, history, bookmarks, and so on. The Call() method keeps them inside our sandbox, but loosens the reins a little bit so the devs can do what they want.

lxt2 commented 7 years ago

They would lose History and Bookmarks, but they would retain the headers and cookies, which seems to fit most usecases.

danmux commented 7 years ago

@lxt2 no problems for the hijack - good discussion. @headzoo yes my usecase was basically an XHR type request - some sites might expect authentication via 'scraping' and then grant access to various 'REST-ish' resources, as accessible via JS.

headzoo commented 7 years ago

I was thinking the same thing regarding JS based authentication. I also add bot tests to some of my sites using simple Javascript. Most bots fail the test because they can't execute the code. Adding the ability to execute simple JS would be a big plus for Surf.

danmux commented 7 years ago

yes @lxt2 headers and cookies are all that is needed in the HXR type case

danmux commented 7 years ago

@lx2 - so - is there an alternative PR on the way :)