easoncxz / twitanalysis

Dig your Twitter data
https://easoncxz.github.io/twitanalysis

Make a pass-through endpoint #7

Closed easoncxz closed 4 years ago

easoncxz commented 4 years ago

Continuing on the ideas from https://github.com/easoncxz/twitanalysis/issues/5 :

What this means is to define an endpoint, say:

These endpoints should:

Technical subtleties:

Design goals:

Outcome:

easoncxz commented 4 years ago

There is a mismatch!

These two are not the same type!

After a couple of seconds of thought, I realised that this conversion must already have been done in any Haskell proxy-server implementation built on both the http-client library and wai. Such an implementation is highly likely to exist, because both wai and http-client are well-known, long-established, and very dominant packages in the Haskell ecosystem.

And voilà! I found this function:

Since there is the very-obvious usage-example from http-proxy of:

-- Run a HTTP and HTTPS proxy on port 3128.
import Network.HTTP.Proxy

main :: IO ()
main = runProxy 3128

I just had to dig in starting from runProxy, and look for functions with relevant names and types. Here is the entirety of doUpstreamRequest:

doUpstreamRequest :: Settings -> HC.Manager -> (Wai.Response -> IO Wai.ResponseReceived) -> Wai.Request -> IO Wai.ResponseReceived
doUpstreamRequest settings mgr respond mwreq
    | Wai.requestMethod mwreq == "CONNECT" =
        respond $ responseRawSource (handleConnect mwreq)
                    (Wai.responseLBS HT.status500 [("Content-Type", "text/plain")] "No support for responseRaw")
    | otherwise = do
        hreq0 <- HC.parseRequest $ BS.unpack (Wai.rawPathInfo mwreq <> Wai.rawQueryString mwreq)
        let hreq = hreq0
                { HC.method = Wai.requestMethod mwreq
                , HC.requestHeaders = filter dropRequestHeader $ Wai.requestHeaders mwreq
                , HC.redirectCount = 0 -- Always pass redirects back to the client.
                , HC.requestBody =
                    case Wai.requestBodyLength mwreq of
                        Wai.ChunkedBody ->
                            HC.requestBodySourceChunkedIO (sourceRequestBody mwreq)
                        Wai.KnownLength l ->
                            HC.requestBodySourceIO (fromIntegral l) (sourceRequestBody mwreq)
                -- Do not touch response body. Otherwise there may be discrepancy
                -- between response headers and the response content.
                , HC.decompress = const False
                }
        handle (respond . errorResponse) $
            HC.withResponse hreq mgr $ \res -> do
                let body = mapOutput (Chunk . fromByteString) . HCC.bodyReaderSource $ HC.responseBody res
                    headers = (CI.mk "X-Via-Proxy", "yes") : filter dropResponseHeader (HC.responseHeaders res)
                respond $ responseSource (HC.responseStatus res) headers body
      where
        dropRequestHeader (k, _) = k `notElem`
            [ "content-encoding"
            , "content-length"
            ]
        dropResponseHeader (k, _) = k `notElem` []

        errorResponse :: SomeException -> Wai.Response
        errorResponse = proxyOnException settings . toException

easoncxz commented 4 years ago

Let's not get too carried away with this proxy business. Let's first try the inner half of this proxy pass-thru: making just a plain, hard-coded, authenticated OAuth request to the Twitter API from within Haskell. Surprisingly, I still haven't done so yet.

Journal entry: https://github.com/easoncxz/twitanalysis/wiki/journal-2020-09-08:-A-%22bug%22-in-%60oauthenticated%60

TL;DR: it's done.

easoncxz commented 4 years ago

This is getting very weird. (I realised that I can pick up the browser session in curl just by passing the Cookie header.) My pass-thru endpoint seems to be working with at least some GET requests:

eason@eason-air ‹ master ●● › (2020-09-11 20:24:04 NZST) ~/pg/twitanalysis
[18] % curl -i 'http://localhost:5000/to-twitter/account/verify_credentials.json' -H 'Accept: application/json' -H 'Cookie: sid=YfWKb4cQwBxwsP0B'

HTTP/1.1 200 OK
x-xss-protection: 0
x-twitter-response-tags: BouncerCompliant
x-transaction: 0086fab100fa741c
x-response-time: 190
x-rate-limit-reset: 1599814097
x-rate-limit-remaining: 74
x-rate-limit-limit: 75
x-frame-options: SAMEORIGIN
x-content-type-options: nosniff
x-connection-hash: f1f90b81a10149dbb6edcab70303ccf3
x-access-level: read-write
strict-transport-security: max-age=631138519
status: 200 OK
set-cookie: guest_id=v1%3A159981319796045974; Max-Age=63072000; Expires=Sun, 11 Sep 2022 08:33:17 GMT; Path=/; Domain=.twitter.com; Secure; SameSite=None
server: tsa_l
pragma: no-cache
last-modified: Fri, 11 Sep 2020 08:33:17 GMT
expires: Tue, 31 Mar 1981 05:00:00 GMT
date: Fri, 11 Sep 2020 08:33:18 GMT
content-type: application/json;charset=utf-8
content-length: 987
content-encoding: gzip
content-disposition: attachment; filename=json.json
cache-control: no-cache, no-store, must-revalidate, pre-check=0, post-check=0

{"id":243138168,"id_str":"243138168","name":"Eason C","screen_name":"easoncxz","location":"Aotearoa NZ","description":"A computer person. An emotional rationalist. An individual trapped in human society. \u00b6 Languages: CN-5, EN-4, FR-1.","url":"http:\/\/t.co\/CUazHDsiXY","entities":{"url":{"urls":[{"url":"http:\/\/t.co\/CUazHDsiXY","expanded_url":"http:\/\/blog.easoncxz.com","display_url":"blog.easoncxz.com","indices":[0,22]}]},"description":{"urls":[]}},"protected":false,"followers_count":1721,"friends_count":738,"listed_count":27,"created_at":"Wed Jan 26 11:34:29 +0000 2011","favourites_count":21272,"utc_offset":null,"time_zone":null,"geo_enabled":false,"verified":false,"statuses_count":39693,"lang":null,"status":{"created_at":"Fri Sep 11 07:55:41 +0000 2020","id":1304327739692859392,"id_str":"1304327739692859392","text":"Sending an arbitrary tweet...","truncated":false,"entities":{"hashtags":[],"symbols":[],"user_mentions":[],"urls":[]},"source":"\u003ca href=\"http:%
eason@eason-air ‹ master ●● › (2020-09-11 20:21:08 NZST) ~/pg/twitanalysis
[0] % curl -i -H 'Content-Type: application/x-www-form-urlencoded' -H 'Accept: application/json' -H 'Cookie: sid=YfWKb4cQwBxwsP0B' -X POST -d 'status=Tweeting%20using%20curl%20on%20the%20command%20line%2C%20but%20hitting%20my%20server%20which%20performs%20OAuth%20signing' 'http://localhost:5000/to-twitter/statuses/update.json'

HTTP/1.1 403 Forbidden
x-xss-protection: 0
x-twitter-response-tags: BouncerCompliant
x-transaction: 00d9a07500954719
x-response-time: 164
x-frame-options: SAMEORIGIN
x-content-type-options: nosniff
x-connection-hash: 18dca92141e0fb5169393bd497ede26e
x-access-level: read-write
strict-transport-security: max-age=631138519
status: 403 Forbidden
set-cookie: guest_id=v1%3A159981258794926745; Max-Age=63072000; Expires=Sun, 11 Sep 2022 08:23:07 GMT; Path=/; Domain=.twitter.com; Secure; SameSite=None
server: tsa_l
pragma: no-cache
last-modified: Fri, 11 Sep 2020 08:23:07 GMT
expires: Tue, 31 Mar 1981 05:00:00 GMT
date: Fri, 11 Sep 2020 08:23:08 GMT
content-type: application/json;charset=utf-8
content-length: 98
content-encoding: gzip
content-disposition: attachment; filename=json.json
cache-control: no-cache, no-store, must-revalidate, pre-check=0, post-check=0

curl: (18) transfer closed with 25 bytes remaining to read
{"errors":[{"code":170,"message":"Missing required parameter: status."}]}
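
A note on the curl: (18) line above, offered as one plausible reading of the transcript rather than a confirmed diagnosis. The response advertises content-length: 98 together with content-encoding: gzip, yet the body that arrives is 73 bytes of plain JSON, and 98 - 73 is exactly the 25 bytes curl reports as missing. That pattern would arise if the upstream headers were forwarded verbatim while the HTTP client had already decompressed the body, which is the discrepancy that the comment next to HC.decompress = const False in doUpstreamRequest warns about. A minimal sketch of the alternative remedy, dropping the stale headers before responding, is shown here. The names staleEntityHeaders and dropStaleEntityHeaders are hypothetical, and the sketch assumes the http-types package.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import qualified Network.HTTP.Types as HT

-- Headers that describe the upstream's *encoded* body. Once the body has
-- been decompressed on the way through, they no longer match what is sent.
-- (Hypothetical name, for illustration only.)
staleEntityHeaders :: [HT.HeaderName]
staleEntityHeaders = ["content-length", "content-encoding"]

-- Remove the stale headers; header-name comparison is case-insensitive
-- because HT.HeaderName is a case-insensitive ByteString.
dropStaleEntityHeaders :: HT.ResponseHeaders -> HT.ResponseHeaders
dropStaleEntityHeaders =
  filter (\(name, _) -> name `notElem` staleEntityHeaders)

main :: IO ()
main = do
  -- Byte counts read off the transcript: advertised length vs. bytes received.
  let advertised = 98 :: Int
      received = 73 :: Int
  print (advertised - received) -- prints 25, the shortfall curl reported
  print (dropStaleEntityHeaders
    [ ("content-length", "98")
    , ("content-encoding", "gzip")
    , ("content-type", "application/json;charset=utf-8")
    ])
```

The other option is the one http-proxy takes: leave the body compressed and pass the headers through untouched.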

It looks like I got the idea basically right, but the way I'm handling HTTP request bodies definitely has some issues.

Probably need to study http-proxy's doUpstreamRequest source better.
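
The request-body problem can be approached with a simplified, buffering variant of what doUpstreamRequest does. This is a minimal sketch, not the project's actual code. It assumes the /to-twitter prefix maps onto https://api.twitter.com/1.1 (a guess based on the curl paths above), it reads the whole body into memory instead of streaming it, and it leaves out OAuth signing entirely. The names upstreamBase, localPrefix, upstreamUrl, keepHeader and toUpstreamRequest are all hypothetical.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import qualified Data.ByteString.Char8 as BS
import qualified Network.HTTP.Client as HC
import qualified Network.HTTP.Types as HT
import qualified Network.Wai as Wai

-- Assumption: where pass-through requests are sent.
upstreamBase :: String
upstreamBase = "https://api.twitter.com/1.1"

-- Assumption: the local path prefix that marks a pass-through request.
localPrefix :: BS.ByteString
localPrefix = "/to-twitter"

-- Rewrite a local path such as /to-twitter/statuses/update.json into the
-- upstream URL, keeping the query string. Nothing for any other path.
upstreamUrl :: Wai.Request -> Maybe String
upstreamUrl wreq
  | localPrefix `BS.isPrefixOf` path =
      Just (upstreamBase <> BS.unpack (rest <> Wai.rawQueryString wreq))
  | otherwise = Nothing
  where
    path = Wai.rawPathInfo wreq
    rest = BS.drop (BS.length localPrefix) path

-- Host and Content-Length are recomputed by http-client; the Cookie holds
-- the local session and is not meant for the upstream. Content-Type stays.
keepHeader :: HT.Header -> Bool
keepHeader (name, _) = name `notElem` ["host", "content-length", "cookie"]

-- Convert the incoming WAI request into an http-client request,
-- carrying the method, the filtered headers and the full body across.
toUpstreamRequest :: Wai.Request -> IO (Maybe HC.Request)
toUpstreamRequest wreq =
  case upstreamUrl wreq of
    Nothing -> pure Nothing
    Just url -> do
      body <- Wai.strictRequestBody wreq -- buffers the entire body
      hreq0 <- HC.parseRequest url
      pure . Just $ hreq0
        { HC.method = Wai.requestMethod wreq
        , HC.requestHeaders = filter keepHeader (Wai.requestHeaders wreq)
        , HC.requestBody = HC.RequestBodyLBS body
        , HC.redirectCount = 0        -- hand redirects back to the caller
        , HC.decompress = const False -- keep body consistent with headers
        }

main :: IO ()
main = do
  let wreq = Wai.defaultRequest
        { Wai.requestMethod = "POST"
        , Wai.rawPathInfo = "/to-twitter/statuses/update.json"
        , Wai.requestHeaders =
            [ ("Content-Type", "application/x-www-form-urlencoded")
            , ("Cookie", "sid=example")
            ]
        }
  result <- toUpstreamRequest wreq
  case result of
    Nothing -> putStrLn "not a pass-through path"
    Just hreq -> do
      BS.putStrLn (HC.method hreq <> " " <> HC.host hreq <> HC.path hreq)
      print (HC.requestHeaders hreq)
```

Buffering keeps the sketch short, and it is convenient when the body also has to be inspected: OAuth 1.0a includes form-encoded body parameters in the signature base string. For large bodies, the streaming approach in doUpstreamRequest is preferable.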

easoncxz commented 4 years ago

Things appear to be working now.

easoncxz commented 4 years ago

It's great. I offhandedly tried out these two APIs, and they work just fine:

(I tried them over curl by stealing the cookie from my Firefox dev console and pasting it to the command line.)

There's no reason why any other endpoint shouldn't work.

easoncxz commented 4 years ago

This is looking really good: I've implemented the same two methods (one GET and one POST) in the React-Redux UI: