encode / httpx

A next generation HTTP client for Python. 🦋
https://www.python-httpx.org/
BSD 3-Clause "New" or "Revised" License

Allow the specifying of which system IP address to use for client connections. #755

Closed eparra closed 4 years ago

eparra commented 4 years ago

Allow specifying which system IP address to use for client connections. If you look at http.client.HTTPConnection, it allows you to pass a source address:

class http.client.HTTPConnection(host, port=None, [timeout, ]source_address=None, blocksize=8192)

https://docs.python.org/3/library/http.client.html
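For reference, here is a minimal sketch of what the `source_address` parameter does in the standard library. It assumes a throwaway local server so that it runs without network access. The `EchoClientAddress` handler and the `fetch_from` helper are hypothetical names made up for this illustration, and `127.0.0.1` stands in for whichever local address you actually want to bind.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class EchoClientAddress(BaseHTTPRequestHandler):
    """Hypothetical test handler: replies with the client IP the server saw."""

    def do_GET(self):
        body = self.client_address[0].encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def fetch_from(source_ip, host, port):
    """Send one GET request with the outgoing socket bound to source_ip."""
    # source_address is a (host, port) tuple; port 0 lets the OS pick
    # an ephemeral source port.
    conn = http.client.HTTPConnection(
        host, port, timeout=5, source_address=(source_ip, 0)
    )
    try:
        conn.request("GET", "/")
        response = conn.getresponse()
        return response.status, response.read().decode()
    finally:
        conn.close()


# Local server on an OS-assigned port, only for the demonstration.
server = HTTPServer(("127.0.0.1", 0), EchoClientAddress)
threading.Thread(target=server.serve_forever, daemon=True).start()

status, seen_ip = fetch_from("127.0.0.1", "127.0.0.1", server.server_port)
print(status, seen_ip)

server.shutdown()
server.server_close()
```

The address passed as `source_address` must already be configured on a local interface, otherwise the bind fails with an `OSError`.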

tomchristie commented 4 years ago

Thanks, that's a really interesting request.

I think we're at a stage in the project where we'd probably push back on new API unless we consider it absolutely clear-cut, must-have functionality. Things that don't fall into that category could go into a "maybe later, but let's get more real-world feedback first" bucket, so my initial reaction would probably be to put this into the "interesting, maybe later" pile...

For a bit of context, what's the motivation / use-case here? Are there other high-level Python HTTP clients that make this available?

thebigmunch commented 4 years ago

Are there other high-level Python HTTP clients that make this available?

tomchristie commented 4 years ago

I can find it in urllib3, https://urllib3.readthedocs.io/en/latest/reference/#urllib3.connection.HTTPConnection but I can't find it in requests?

tomchristie commented 4 years ago

Closed as "out of scope" for requests, https://github.com/psf/requests/issues/394 but available in the toolbelt... https://toolbelt.readthedocs.io/en/latest/adapters.html#sourceaddressadapter

thebigmunch commented 4 years ago

I assumed for requests itself. But, yeah. Still available in semi-official capacity.

Edit: And the rest were from code searches, though they're not all spelled the same way

eparra commented 4 years ago

For a bit of context, what's the motivation / use-case here? Are there other high-level Python HTTP clients that make this available?

The use case I have run into many times now is the need to simulate a pool of users, each sourcing from a unique IP address. This is not possible with requests and has been requested as an enhancement many times. Today I find users spawning multiple containers to accomplish this, rather than using a single machine with multiple secondary IP addresses bound to the NIC. To add more color to the use case: testing a web proxy, firewall, WAN optimizer, network traffic probes, etc., whereby a single IP would otherwise be requesting many webpages.

bwelling commented 4 years ago

I was just looking for this functionality in httpx, for a use case of sending DoH (DNS over HTTPS) requests, from dnspython. Currently, dnspython uses requests to send queries, and requests_toolbelt.adapters.source.SourceAddressAdapter (when necessary) to set the source address, but since DoH is supposed to use HTTP/2, it would be nice to use httpx instead.

tomchristie commented 4 years ago

I'm going to close this off as out of scope for now.

We might be able to consider a nicely done PR to httpcore that implements this, but I don't want to consider it a priority unless someone's willing to step up and take it on.

kotori2 commented 3 years ago

Since there is no proper documentation for this, here is how to set the local address:

>>> import httpx
>>> from httpcore import SyncConnectionPool
>>> transport = SyncConnectionPool(local_address="xxxx:xxxx:xxxx:xxxx:f3d2:3ef7:28cd:7b60", http2=True)
>>> client = httpx.Client(transport=transport)
>>> r = client.get("https://api6.ipify.org")
>>> r.text
'xxxx:xxxx:xxxx:xxxx:f3d2:3ef7:28cd:7b60'

p-i- commented 2 years ago

To answer the earlier question of how to do it with requests: https://stackoverflow.com/questions/48776564/python-and-c-sharing-the-same-memory-resources

p-i- commented 2 years ago

I'm trying to get an async version working:

import trio
import httpx

from httpcore import AsyncConnectionPool

URL = 'https://api4.ipify.org/'
IP = '...'  # my public IP4 address

async def run_client(ip):
    # works
    async with httpx.AsyncClient() as client:
        response = await client.get(URL)
        print(response.status_code, len(response.content))

    # fails
    async with AsyncConnectionPool(local_address=ip, http2=True) as transport:
        async with httpx.AsyncClient(transport=transport) as client:
            response = await client.get(URL)
            print(ip, response.status_code, len(response.content))

async def main():
    async with trio.open_nursery() as n:
        n.start_soon(run_client, IP)

trio.run(main)

... but getting an error:

root@test-host-1:~# python3 trio_scraper.py 
200 14
Traceback (most recent call last):
  File "trio_scraper.py", line 22, in <module>
    trio.run(main)
  File "/usr/local/lib/python3.8/dist-packages/trio/_core/_run.py", line 1932, in run
    raise runner.main_task_outcome.error
  File "trio_scraper.py", line 20, in main
    n.start_soon(run_client, '209.250.240.59')
  File "/usr/local/lib/python3.8/dist-packages/trio/_core/_run.py", line 815, in __aexit__
    raise combined_error_from_nursery
  File "trio_scraper.py", line 15, in run_client
    response = await client.get(URL)
  File "/usr/local/lib/python3.8/dist-packages/httpx/_client.py", line 1736, in get
    return await self.request(
  File "/usr/local/lib/python3.8/dist-packages/httpx/_client.py", line 1513, in request
    return await self.send(request, auth=auth, follow_redirects=follow_redirects)
  File "/usr/local/lib/python3.8/dist-packages/httpx/_client.py", line 1600, in send
    response = await self._send_handling_auth(
  File "/usr/local/lib/python3.8/dist-packages/httpx/_client.py", line 1628, in _send_handling_auth
    response = await self._send_handling_redirects(
  File "/usr/local/lib/python3.8/dist-packages/httpx/_client.py", line 1665, in _send_handling_redirects
    response = await self._send_single_request(request)
  File "/usr/local/lib/python3.8/dist-packages/httpx/_client.py", line 1702, in _send_single_request
    response = await transport.handle_async_request(request)
  File "/usr/local/lib/python3.8/dist-packages/httpcore/_async/connection_pool.py", line 206, in handle_async_request
    scheme = request.url.scheme.decode()
AttributeError: 'str' object has no attribute 'decode'

tomchristie commented 2 years ago

You need to be using async with httpx.HTTPTransport(local_address=ip, http2=True) as transport:

p-i- commented 2 years ago

Thanks! However if I switch that out, I now get:

  File "trio_scraper.py", line 20, in run_client
    async with httpx.HTTPTransport(local_address=ip, http2=True) as transport:
AttributeError: __aexit__

p-i- commented 2 years ago

ok gottit! Here's how to do async:

import trio
import httpx

URL = 'https://api4.ipify.org/'
IP = '...'  # use your box's public IPv4

async def run_client(ip):
    async with httpx.AsyncHTTPTransport(local_address=ip, http2=True) as transport:
        async with httpx.AsyncClient(transport=transport) as client:
            response = await client.get(URL)
            print(ip, response.status_code, len(response.content))

async def main():
    async with trio.open_nursery() as n:
        n.start_soon(run_client, IP)

trio.run(main)

Thanks @tomchristie for shaking it loose

p-i- commented 2 years ago

Just a note that trying a different target URL I ran into ModuleNotFoundError: No module named 'h2' and needed pip install h2. As per guidelines I floated it on https://gitter.im/encode/community rather than file an issue.
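HTTP/2 support in httpx is an optional extra (`pip install httpx[http2]`), which is what pulls in `h2`. One way to avoid the `ModuleNotFoundError` is to check for the package before asking for `http2=True`. The helper name `http2_supported` below is made up for this sketch.

```python
import importlib.util


def http2_supported() -> bool:
    """Return True if the optional h2 package can be imported."""
    return importlib.util.find_spec("h2") is not None


# Only request HTTP/2 when the dependency is actually installed.
use_http2 = http2_supported()
print("HTTP/2 available:", use_http2)
```

The resulting flag can then be passed as `http2=use_http2` when constructing the transport.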