Closed bboe closed 7 years ago
Note this line in the traceback:
return proxy_bypass_macosx_sysconf(host)
Also note that this is code that happens in requests prior to betamax taking control over the request/response flow.
Monkey-patching the standard library on one very specific operating system, and only when a cassette has already been recorded, seems like overkill and out of the scope of this project to me.
Do you agree?
Also note that this is code that happens in requests prior to betamax taking control over the request/response flow.
I hadn't realized that. From my perspective, I'm calling get after associating betamax with the session. Thus it seems reasonable that there would be no network side effects when a cassette is present.
Rather than monkey patching the standard library, perhaps betamax could augment requests to bypass the check for proxies when a cassette is present.
With that said, I understand if you feel it's out-of-scope. Perhaps it's worth mentioning the issue in the docs because it's fairly significant. Usually my test suite runs in less than half a second, but whenever it starts checking this proxy setting, for whatever reason, it takes more than 30 seconds because there is a DNS lookup for every request. Maybe there's a separate issue here which is preventing those DNS responses from being cached; they're all for the same host.
If I wanted to do the latter, do you have an understanding of what is actually happening here? I'd be happy to write something up for the documentation if I knew why these DNS requests were always happening.
Rather than monkey patching the standard library, perhaps betamax could augment requests to bypass the check for proxies when a cassette is present.
So the point of Betamax working the way it does is to avoid monkey-patching anything. That said, if we want to prevent requests from using this, we need to monkey-patch requests (but, again, only when the cassette is already recorded).
I understand the frustration with this.
Maybe there's a separate issue here which is preventing those DNS responses from being cached; they're all for the same host.
That would be a requests-specific question IMO. I can look into it tonight maybe.
Hm, so this only ever happens when handling redirects. The proxy_bypass stdlib code is used by requests.utils.should_bypass_proxies which is used elsewhere in requests.utils.get_environ_proxies.
get_environ_proxies is used in Session#merge_environment_settings (which is called from Session#request).
And both get_environ_proxies and should_bypass_proxies are used in SessionRedirectMixin#rebuild_proxies.
In both cases, they are guarded by the trust_env setting on the Session object. In other words, if we want to avoid this without monkey-patching, we could force that to be False when we attach our adapters if we're not planning on recording new interactions. I'm a little hesitant to do that though. Would a sufficient way of addressing this be to document it?
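For anyone who wants to try the trust_env approach from their own test code, here is a minimal sketch. The helper name without_env_proxies is hypothetical and is not part of Betamax or requests; the only thing it relies on is the trust_env attribute on the Session object mentioned above. It restores the original value afterwards, so the user's own setting is not permanently overridden.

```python
from contextlib import contextmanager


@contextmanager
def without_env_proxies(session):
    """Temporarily set session.trust_env to False (hypothetical helper).

    Works on any object that exposes a ``trust_env`` attribute, such as a
    requests Session. The previous value is restored on exit, even if the
    body of the ``with`` block raises.
    """
    original = session.trust_env
    session.trust_env = False
    try:
        yield session
    finally:
        session.trust_env = original
```

With a real Session this would be used as "with without_env_proxies(session): session.get(url)" during playback of an already recorded cassette. Note that turning trust_env off also stops requests from reading other environment-derived settings, so it is probably only appropriate when nothing new is being recorded.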
Would a sufficient way of addressing this be to document it?
Yes, I think explaining why this occurs and a potential workaround (my fix below) would be suitable.
# Temporarily work around issue with gethostbyname on OS X
import socket
socket.gethostbyname = lambda x: '127.0.0.1'
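The assignment above replaces socket.gethostbyname for the whole process. A more contained variant, sketched here with the standard library's unittest.mock, limits the stub to the duration of a single call. The helper name call_with_stubbed_dns is made up for illustration.

```python
import socket
from unittest import mock


def call_with_stubbed_dns(func, address="127.0.0.1"):
    """Run func() while socket.gethostbyname returns a fixed address.

    Hypothetical helper: the real function is restored automatically when
    the patch context exits, so other tests are unaffected.
    """
    with mock.patch.object(socket, "gethostbyname", return_value=address):
        return func()
```

A test body that replays a cassette could be passed in as func, which keeps the workaround from leaking into tests that do need real DNS lookups.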
I do, however, like your solution of forcing trust_env to be False when not recording. I'm curious as to what your hesitation is. Alternatively, perhaps there could be a way for the user to hook betamax in such cases to change that setting when recording is not to take place? That way betamax needn't always do it, but it can be configured (without monkey patching). Thoughts?
I'm curious as to what your hesitation is.
We don't change anything else about the Session besides its adapters. I can't think of a case where someone might be subclassing a Session and looking at trust_env, but I think I'd like a way to let the user say "Do not override my setting for trust_env" and I'd be happy. That said, we won't know if we should override trust_env until the user calls Betamax#use_cassette. Even then, use_cassette does not start recording anything. It just loads a cassette. So we would actually need to do this in Betamax#start, but the problem with that is that someone can do:
with Betamax(session) as recorder:
    recorder.use_cassette('cassette_name')
So we'll have to track extra state on the Betamax object which makes me a sad :panda_face:
Awesome. Thanks for the update.
Closing this issue as I feel the documentation addition is sufficient.
I finally got annoyed that my integration tests would periodically slow down. After running a profiler I saw that socket.gethostbyname was the bottleneck. I adapted the betamax example to produce a stack trace showing that socket.gethostbyname is indeed being called. I'm guessing that it probably shouldn't be. Here's the example
socket.gethostbyname did not get called when I tested from a Linux machine.
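One way to check whether a given platform triggers these lookups, without a full profiler run, is to wrap socket.gethostbyname and record every hostname passed to it. This is only a sketch; the name record_gethostbyname_calls is hypothetical.

```python
import socket
from contextlib import contextmanager


@contextmanager
def record_gethostbyname_calls():
    """Collect the hostnames passed to socket.gethostbyname (hypothetical helper).

    Each call is recorded first and then forwarded to the real function,
    which is restored when the context exits.
    """
    calls = []
    original = socket.gethostbyname

    def recording(hostname):
        calls.append(hostname)
        return original(hostname)

    socket.gethostbyname = recording
    try:
        yield calls
    finally:
        socket.gethostbyname = original
```

Running a cassette-backed request inside "with record_gethostbyname_calls() as calls:" and then inspecting calls should show whether any lookups happened during playback.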