Closed JaneJeon closed 3 years ago
I closed it thinking I can just use a beforeRequest hook; however, there is no way for a hook to tap into options.dnsCache or options.dnsLookup (i.e. the same thing that got uses - https://github.com/sindresorhus/got/blob/08d7a0f3285d5ac535aac873cd9090937cf008b6/source/core/options.ts#L2290), so this unfortunately prevents DNS lookups using what options users have supplied to Got (I really don't want to re-create CacheableLookup or dns.lookup just for the purposes of looking up DNS rather than using what's already in the options).
So, unfortunately, this still means there is no way to provide SSRF protection w/o getting direct access to the agents (which, by the way, I can't access from hook options either) :(
- got's extend functionality doesn't seemingly allow you to somehow get/override existing agent instances
instance.defaults.options.agent.http;
instance.defaults.options.agent.https;
instance.defaults.options.agent.http2;
const newInstance = instance.extend({
agent: {
http: ...,
https: ...,
http2: ...,
},
});
Disclaimer: Got maintainer here
- The TransformHeadersAgent is never exported from this library, which makes it impossible for me to instantiate agents myself and pass it in.
I'm not sure if we need to expose this but it seems like this issue could be fixed in Got?
/cc @mnmkng
however, there is no way for a hook to tap into options.dnsCache or options.dnsLookup
Can be done via overriding dnsCache.lookupAsync
or specifying custom DNS servers. Opened https://github.com/szmarczak/cacheable-lookup/issues/41 as this would be a nice feature.
Alternatively you can do something like this:
import dns from 'dns';
const instance = gotScraping.extend({
hooks: {
beforeRequest: [
async options => {
const {address} = await dns.promises.lookup(options.url.hostname);
if (isIpPrivate(address)) {
throw new Error('Forbidden');
}
}
]
}
});
The cost is probably +20ms per request.
I'm not sure if we need to expose this but it seems like this issue could be fixed in Got?
Which part could be fixed in Got?
I'm also not a fan of exporting the agent because it's internal implementation. On the other hand it's true that it not being exported basically makes this library unusable with custom agents, since a lot of the important logic is in the agents and you have no way of keeping that. Adding an API that would allow agent chaining or some sort of agent middleware sounds like a crazy overkill for this library, but it could be something that makes sense in Got. If it's possible to do it with a nice interface.
Oh wow, thanks for the quick response. I ended up just using the builtin dns.lookup, but I agree that the alternatives don’t look so pretty either.
Overriding agents is definitely ugly as @mnmkng points out as it’s internal implementation, getting opts.dnsLookup || opts.dnsCache[‘lookup’] seems to be not possible as of now as @szmarczak points out, and the solution I ended up on (just using builtin dns and relying on the system to cache DNS lookups) is also ugly for obvious reasons.
Welp, feel free to close this issue as we’ve obviously reached some possible solutions here, but I think that we have a conversation to have about exposing got’s guts (ha ha) in a manner that doesn’t require hacky solutions (or non-solutions) like this.
I'm not sure if we need to expose this but it seems like this issue could be fixed in Got?
I was thinking about forcing agents to use the lookup function provided via options.dnsLookup
. Given this some thought this is not a good idea, as the agent may implement its own logic in regards to DNS.
Setting up a custom DNS server seems to be the easiest solution. I recommend https://coredns.io/
Setting up a custom DNS server seems to be the easiest solution.
Ah, I was thinking of setting up OS-level DNS caching, but I’m guessing what you’re suggesting is to setup an external DNS server to 1. basically proxy to an ACTUAL dns server like 1.1.1.1/8.8.8.8, 2. keep a cache of lookups/rejections? @szmarczak
Correct. Alternatively you can set up a Got handler, attach a request
listener on the promise/stream, on that request attach a socket
listener and check the IP address there.
Okay, I’ll close this for now, but it still would be good to have access to options.dnsLookup || options.dnsCache[‘lookup’] (what got uses internally), but that’s more of a got issue.
Hi, first of all, thanks for the library.
I'm trying to implement SSRF protection in-place (without relying on proxies), which requires modifying the http/https agents and passing the modified instances into got.
However, that's not really possible due to two things:
This unfortunately prevents me from being able to use the library. Is there any way around this?
Thanks