swift-server / async-http-client

HTTP client library built on SwiftNIO
https://swiftpackageindex.com/swift-server/async-http-client/main/documentation/asynchttpclient
Apache License 2.0

Proxy server implementation example needed #757

Open advanc3dUA opened 1 month ago

advanc3dUA commented 1 month ago

Hello, I need a proxy server endpoint for my Vapor project. The endpoint should take a URL as a parameter, fetch the page at that URL, modify it, and return it to the user. I've searched for an example here and elsewhere on the internet but had no luck. Is there a fully working example of this built with async-http-client? I need to proxy a page that uses JS and Google authorization, so I need to forward cookies etc.

The only useful thing I have found is a proxy implementation for a different server-side Swift framework, Hummingbird. It also uses async-http-client, and the author says that the implementation in Vapor will look similar.

Well, I am not a pro, but here is what I have at the moment:

Proxy Middleware:

func respond(to request: Vapor.Request, chainingTo next: Vapor.AsyncResponder) async throws -> Vapor.Response {
    // Convert the incoming Vapor request to an HTTPClient.Request aimed at `target`.
    let ahcRequest = try await request.ahcRequest(host: target)

    // Execute the request with httpClient and wait for the full response.
    let clientResponse = try await httpClient.execute(request: ahcRequest).get()

    // Convert HTTPClient.Response to Vapor.Response. `next` is never called,
    // so this middleware answers every request itself.
    return clientResponse.vaporResponse
}

HTTPClient.Response -> Vapor.Response:

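extension HTTPClient.Response {
    // `Vapor.Response` is spelled out because a bare `Response` inside this
    // extension can resolve to HTTPClient.Response.
    var vaporResponse: Vapor.Response {
        // Copy the upstream headers one by one.
        var headers = HTTPHeaders()
        self.headers.forEach { headers.add(name: $0.name, value: $0.value) }

        // Wrap the upstream body, if there is one.
        let body: Vapor.Response.Body
        if let buffer = self.body {
            body = .init(buffer: buffer)
        } else {
            body = .empty
        }

        return Vapor.Response(
            status: .init(statusCode: Int(self.status.code)),
            headers: headers,
            body: body
        )
    }
}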

Vapor.Request -> HTTPClient.Request:

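extension Vapor.Request {
    func ahcRequest(host: String) async throws -> HTTPClient.Request {
        // Buffer the whole request body in memory, up to 1 MiB.
        let collectedBuffer = try await self.body.collect(max: 1024 * 1024).get()

        // Note: `host` is used as the full URL; the path and query of the
        // incoming request are not appended to it.
        return try HTTPClient.Request(
            url: host,
            method: self.method,
            headers: self.headers,
            body: collectedBuffer.map { HTTPClient.Body.byteBuffer($0) }
        )
    }
}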

When I proxy a simple webpage that contains only HTML, it works, but anything more complicated fails. If you have an example of a working proxy for modern websites, please share it with me.

Thanks.

Lukasa commented 1 month ago

So generally speaking this should work. Can you produce an example of something that fails and explain how it fails?

advanc3dUA commented 1 month ago

> So generally speaking this should work. Can you produce an example of something that fails and explain how it fails?

When a site contains JavaScript, the response returned to the initiating client includes implicit (relative) links to scripts. The client tries to fetch that content from the proxy, where it is not present. To address this, I need to modify the body of the response before returning it.

First, I must decompress the body and change all implicit (relative) links to explicit (absolute) links that point at my proxy, which will then redirect them to the destination website (direct access is not possible due to CORS). After these modifications, I will recompress the body and send it back to the client.
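The link-rewriting step described above could be sketched roughly as follows. This is an illustration only, not code from async-http-client or Vapor: it assumes the body has already been decompressed and decoded to a string, and the `/proxy` endpoint and its `url` query parameter are hypothetical names. It only handles double-quoted `href`, `src`, and `action` attributes, so URLs assembled by JavaScript at runtime, `srcset`, and CSS `url(...)` references would still slip through.

```swift
import Foundation

/// Rewrites href/src/action attribute values so that they point back at the proxy.
/// `proxyEndpoint` and the `url` query parameter are hypothetical names.
func rewriteLinks(in html: String, pageURL: URL, proxyEndpoint: String) -> String {
    // Matches href="...", src="..." and action="..." (double quotes only).
    let pattern = "(href|src|action)=\"([^\"]*)\""
    guard let regex = try? NSRegularExpression(pattern: pattern, options: [.caseInsensitive]) else {
        return html
    }
    let source = html as NSString
    var result = ""
    var cursor = 0
    for match in regex.matches(in: html, range: NSRange(location: 0, length: source.length)) {
        let attribute = source.substring(with: match.range(at: 1))
        let value = source.substring(with: match.range(at: 2))

        // Copy the text between the previous match and this one unchanged.
        result += source.substring(with: NSRange(location: cursor, length: match.range.location - cursor))
        cursor = match.range.location + match.range.length

        // Leave in-page anchors, non-HTTP schemes and unparsable values alone;
        // resolve everything else against the URL of the page being proxied.
        guard !value.hasPrefix("#"),
              let absolute = URL(string: value, relativeTo: pageURL)?.absoluteURL,
              let scheme = absolute.scheme, scheme == "http" || scheme == "https",
              let encoded = absolute.absoluteString
                  .addingPercentEncoding(withAllowedCharacters: .alphanumerics)
        else {
            result += source.substring(with: match.range)
            continue
        }
        result += "\(attribute)=\"\(proxyEndpoint)?url=\(encoded)\""
    }
    // Copy whatever follows the last match.
    result += source.substring(from: cursor)
    return result
}

// Usage: a relative script URL is resolved against the page URL and routed via the proxy.
let samplePage = URL(string: "https://example.com/dir/page.html")!
let sampleHTML = "<script src=\"/app.js\"></script><a href=\"#top\">Top</a>"
print(rewriteLinks(in: sampleHTML, pageURL: samplePage, proxyEndpoint: "/proxy"))
```

Encoding the whole upstream URL into a single query parameter keeps the proxy endpoint stateless, at the cost of every rewritten link being opaque to the page's own scripts.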

This is only part of the problem. I also need to manage cookies, ensuring they are properly transferred in both directions.
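For the upstream-to-client direction, one piece of the cookie handling could look like the sketch below. It is a simplified illustration with a hypothetical helper name, not an API of async-http-client or Vapor. Browsers reject a `Set-Cookie` whose `Domain` attribute does not match the host that sent the response, so the sketch drops that attribute, which makes the cookie host-only for the proxy. In the other direction the browser's `Cookie` header can be forwarded as-is. Note that with this approach the cookies of every proxied site end up in one jar on the proxy's origin.

```swift
import Foundation

/// Removes the Domain attribute from a Set-Cookie header value so the browser
/// scopes the cookie to the host that served the response (here, the proxy).
/// `stripCookieDomain` is a hypothetical helper name.
func stripCookieDomain(_ setCookie: String) -> String {
    setCookie
        .split(separator: ";")                                  // attributes are ';'-separated
        .map { $0.trimmingCharacters(in: .whitespaces) }
        .filter { !$0.lowercased().hasPrefix("domain=") }       // drop Domain=..., keep the rest
        .joined(separator: "; ")
}

// Usage: apply to each Set-Cookie value before copying it into the response headers.
print(stripCookieDomain("sid=abc123; Domain=example.com; Path=/; HttpOnly"))
```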

There may be other issues I haven't anticipated yet.

While this is possible to implement, it will take a considerable amount of time. Therefore, I'm asking for any existing examples if someone has already written something similar.

Lukasa commented 1 month ago

Ah, I see. So attempting to implement a proxy this way, without the cooperation of either the server or the client, is somewhere between very challenging and impossible. As you say, you need to do a substantial amount of rewriting of code to force it back to your proxy.

Is there any reason that you should not ask the client to include you? Most HTTP clients support having a proxy configured, and in that case they'll route all requests through you, regardless of what the URL says.
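As an illustration of what configuring a proxy on the client side looks like, here is a minimal sketch that uses async-http-client itself as the client. The host and port are placeholders, and the sketch assumes a recent 1.x release that provides the `.singleton` event loop group provider.

```swift
import AsyncHTTPClient

/// Builds an HTTPClient that routes its requests through a proxy.
/// The address 127.0.0.1:8080 is a placeholder for wherever the proxy listens.
func makeProxiedClient() -> HTTPClient {
    var configuration = HTTPClient.Configuration()
    configuration.proxy = .server(host: "127.0.0.1", port: 8080)
    return HTTPClient(eventLoopGroupProvider: .singleton, configuration: configuration)
}

// The caller owns the client and must shut it down when finished, for example:
//     let client = makeProxiedClient()
//     ... make requests ...
//     try await client.shutdown()
```

With this setup the server on the other end has to behave as a regular forward proxy, which for HTTPS targets means handling `CONNECT` requests rather than rewriting page content.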