denoland / deno

A modern runtime for JavaScript and TypeScript.
https://deno.com
MIT License

Add URLs to `--allow-net` #5242

Open bxff opened 4 years ago

bxff commented 4 years ago

Deno should accept full URLs in `--allow-net`, in addition to domain name and port pairs.

Currently, to run this simple JavaScript file:

fetch('http://gist.githubusercontent.com/KSXGitHub/62cb052060dc169796ab3698ff53c33b/raw/9d7d84910b344eb8580be8b91cf539e18e565e5d/init.sh/').then( (resp) => {
    console.log(resp)
} )

I would need to allow all URLs under gist.githubusercontent.com, or allow any domain at all (neither of which is fully secure):

All URLs possible with gist.githubusercontent.com:

deno --allow-net="gist.githubusercontent.com" test.js                                 Does compile
Response {
  responce data...
}

Allow any Domain names URL:

deno --allow-net test.js
Does compile
Response {
  response data...
}

And if I try running the secure way, I get an error:

$ deno --allow-net="http://gist.githubusercontent.com/KSXGitHub/62cb052060dc169796ab3698ff53c33b/raw/9d7d84910b344eb8580be8b91cf539e18e565e5d/init.sh" test.js
Does compile
error: Uncaught PermissionDenied: network access to "http://gist.githubusercontent.com/KSXGitHub/62cb052060dc169796ab3698ff53c33b/raw/9d7d84910b344eb8580be8b91cf539e18e565e5d/init.sh/", run again with the --allow-net flag
    at unwrapResponse ($deno$/ops/dispatch_json.ts:43:11)
    at Object.sendAsync ($deno$/ops/dispatch_json.ts:98:10)
    at async fetch ($deno$/web/fetch.ts:591:27)

Thanks, Dex Devlon

mathiasrw commented 4 years ago

This seems super useful. I suggest letting the notation start with a star to indicate multiple levels of subdomains, like:

deno --allow-net="*.githubusercontent.com" test.js  

I also suggest making it a comma-separated list so that you can add several URLs:

deno --allow-net="*.githubusercontent.com, github.com, mydomain.lol" test.js  
bartlomieju commented 4 years ago

--allow-net supports providing a list of URLs:

$ deno run --allow-net=github.com,google.com mod.ts
bartlomieju commented 4 years ago

Closing as resolved

skariel commented 3 years ago

@bartlomieju --allow-net allows only domains, not URLs. For example, if google.com is allowed, google.com/about is automatically allowed as well; but allowing only google.com/about is not currently supported by --allow-net. This has nothing to do with comma-separated domain names. I think the issue is not resolved just yet.

nayeemrmn commented 3 years ago

As I said in IRC once, I think this should stay as it is.

URL paths are arbitrary strings which are prone to subtle formatting differences and it's not clear how the scopes should work. While it's intuitive that read permission to /foo should imply read permission to /foo/bar, it's not sensible that net permission to http://baz.com/foo implies http://baz.com/foo/bar. This doesn't leave any good way of whitelisting at the TCP level which should take priority.
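
For illustration, here is how easily "the same" path can be spelled differently; a naive string comparison on pathnames misses some of these (a quick sketch using the standard URL parser):

const base = "http://baz.com";
for (const path of ["/foo/bar", "/foo/../foo/bar", "/foo%2Fbar", "/foo//bar", "/foo/bar/"]) {
  console.log(path, "->", new URL(path, base).pathname);
}
// "/foo/../foo/bar" normalizes to "/foo/bar", but "/foo%2Fbar", "/foo//bar"
// and "/foo/bar/" remain distinct strings that a server may treat as the same resource.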

It's important to offer whitelisting against who the runtime can establish connections with. There's no reason to include random unstructured parts of the request payload as filter parameters. The pathname of a URL is effectively that.

caspervonb commented 3 years ago

allow-net allows only domains but no URLs. For e.g google.com is allowed, if used it will also automatically allow google.com/about but if you want to allow only google.com/about it is not currently supported by --allow-net. This has nothing to do with comma separated domain names. I think the issue is not resolved just yet

Hm, how would that even work? You connect to hosts, not fully qualified URLs.

stale[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

jsejcksn commented 2 years ago

@bxff @mathiasrw @skariel There are other mechanisms for network access besides fetch (e.g. Deno.connect). If you would just like to restrict fetch, you can proxy it.

r3h0 commented 2 years ago

From this comment, it appears that the existing implementation applies only to Layer 4.

This doesn't leave any good way of whitelisting at the TCP level which should take priority.

I agree that the TCP level wouldn't work well; hence I gather that this request is to add support for L7 permissions, where URL-based filters are common. For example, various L7 reverse proxies and service meshes define allowed routes by host, port, and path (and also by headers, etc., which would be out of scope for this issue).
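
For illustration only, such host-plus-path rules could be expressed with the web-standard URLPattern API that Deno already ships (a hypothetical sketch, not a proposed --allow-net syntax):

// Hypothetical allowlist of host + path rules using URLPattern.
const allowed = [
  new URLPattern({ protocol: "https", hostname: "gist.githubusercontent.com", pathname: "/KSXGitHub/*" }),
  new URLPattern({ protocol: "https", hostname: "api.github.com", pathname: "/users/denoland" }),
];

function isUrlAllowed(url: string): boolean {
  return allowed.some((pattern) => pattern.test(url));
}

console.log(isUrlAllowed("https://gist.githubusercontent.com/KSXGitHub/raw/init.sh")); // true
console.log(isUrlAllowed("https://gist.githubusercontent.com/SomeoneElse/raw/x.sh")); // false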

My use case is similar to the original example.

I would need to allow all URLs possible with gist.githubusercontent.com, or allow any Domain names URL(which both are not fully secure)

With today's L4 support, if the external service is multi-tenant like github.com, gitlab.com, or some third-party API, the application could leak data or run code from another tenant. Another specific example: IIRC, AWS S3 buckets are defined by path rather than hostname, so you have to allow-net all S3 buckets.

I love the idea of the allow-net feature, but the security benefit is limited in the case of third party services.

r3h0 commented 2 years ago

With the caveat that this is my first time looking at Deno's source code, the feature doesn't appear to require major changes to the implementation of allow-net. fetch already calls check_net_url per request, not per connection. The implementation change might be in permissions.rs, where a path check would be performed in addition to the host and port.

There are other mechanisms for network access besides fetch (e.g. Deno.connect).

As stated earlier, HTTP paths are an L7 concept and shouldn't be mixed together with TCP. My instinct would be to disallow Deno.connect for endpoints where a path is specified in --allow-net.

andykais commented 1 year ago

Throwing my two cents in here: this would be handy for handling access to static lib files and FFI. As we know, FFI effectively removes the security sandbox. In the case where your static lib is installed elsewhere by the package manager, this is not a problem. However, when we start pulling hosted remote static libs down, there's no good restriction we can put in place. For example, if I wanted to build a portable app including a binary, it might look like this:

deno run --allow-net=raw.githubusercontent.com/andykais/calculator --allow-read --allow-write --allow-ffi https://deno.land/x/calculator "2 + 2"

What I want to happen here is for deno to restrict access to this domain + path, and any subpaths. Instead, any content on raw.githubusercontent.com is accessible, and some malicious script in the dependency tree might try to take advantage of that.

This doesn't leave any good way of whitelisting at the TCP level which should take priority

I'm not sure the end user cares whether it's a whitelist at the TCP level or a URL pattern check. To the end user, Deno is the security sandbox, and I am trusting that the deno binary executes only what I allowed it to execute.

URL paths are arbitrary strings which are prone to subtle formatting differences and it's not clear how the scopes should work. While it's intuitive that read permission to /foo should imply read permission to /foo/bar, it's not sensible that net permission to http://baz.com/foo implies http://baz.com/foo/bar. This doesn't leave any good way of whitelisting at the TCP level which should take priority.

This is how read and write whitelisting works already. Wouldn't it be intuitive to follow that same paradigm here? I'll admit I am biased, because the resources I want to access are literally structured like a file system on GitHub. There are certainly more complicated route definitions in web applications, but that doesn't imply they need to be incorporated into the permissions system.
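
For comparison, the existing path scoping for file reads can be observed at runtime with Deno.permissions.query (a small sketch; /tmp and the file name are just example values):

// Run with: deno run --allow-read=/tmp check_read.ts
// A grant on a directory implies grants on its subpaths.
const dir = await Deno.permissions.query({ name: "read", path: "/tmp" });
const sub = await Deno.permissions.query({ name: "read", path: "/tmp/foo/bar" });
console.log(dir.state, sub.state); // "granted" "granted"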

dinvlad commented 1 year ago

Without this capability, --allow-net is much less useful in practice for modern cloud development, where the backend just calls a bunch of cloud APIs, and those usually separate tenants by path rather than by hostname. S3 and a few others might be an exception, but we can't rely on that in the general case. Additionally, any kind of proxying would severely limit high-throughput applications (e.g. fetching from cloud storage that isn't hostname-based like S3).

henricazottes commented 1 year ago

Following @jsejcksn advice, here is an example for proxying fetch:

// proxied_fetch.ts

// Read the single allowed URL from the environment.
const allowedUrl = Deno.env.get("ALLOWED_URL");

const fetchHandler: ProxyHandler<typeof fetch> = {
  apply(target, thisArg, args) {
    // Let the call through only if the requested URL matches exactly.
    if (String(args[0]) === allowedUrl) {
      return Reflect.apply(target, thisArg, args);
    }
    throw new Error("Not allowed URL");
  },
};

globalThis.fetch = new Proxy(fetch, fetchHandler);

await fetch("https://api.github.com/users/denoland");  // OK
await fetch("https://api.github.com/not/ok/url");  // Error

Run with:

$ ALLOWED_URL=https://api.github.com/users/denoland deno run --allow-net=api.github.com --allow-env=ALLOWED_URL proxied_fetch.ts

(This might not be perfect; I'm not familiar with this pattern.)

andykais commented 1 year ago

Even simpler would be just overriding the fetch global with a different function:

globalThis.fetch = (input: string, options: RequestInit) => {
  // do some validation
  return fetch(input, options)
}

I am curious though if someone could break the security here. Is it possible to access the inner fetch? I believe this fetch does get used by dynamic imports, so that does get us pretty far.

jsejcksn commented 1 year ago

@andykais In the code you've shown, the inner reference to fetch is only evaluated after the function is invoked, so it will refer to the most local declaration of fetch when that happens (see scope and shadowing) — if there is no local declaration of fetch, it will refer to globalThis.fetch, which is that very function, causing an infinite loop. Here's the same code, but with types stripped so that you can copy and paste directly into your browser's JS dev tools console to try it:

globalThis.fetch = (input, options) => {
  // do some validation
  return fetch(input, options);
};

fetch("https://example.com");

IMO, the problem here is that all of these workarounds must happen in user code, so there must be a guarantee that they are the first statements executed synchronously, else any imported code will have an opportunity to capture a reference to the original fetch and bypass the proxy.
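
For example (a hypothetical two-file sketch; evil_dep.ts and its contents are invented for illustration):

// evil_dep.ts: an import's module body runs before the importer's body,
// so a dependency can capture the real fetch before any proxy is installed.
export const captured = globalThis.fetch;

// main.ts
import { captured } from "./evil_dep.ts"; // evaluated first: grabs the original fetch

globalThis.fetch = (input: string | URL | Request, init?: RequestInit) => {
  throw new Error("blocked"); // the "proxy", installed too late
};

await captured("https://example.com"); // bypasses the proxy entirely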

Because of that, this would need to be handled by the Deno runtime so that it's finalized by the time the environment bootstrapping is done and no user code can escape the "sandbox" that's created.

If we're interested in sharing user implementations, it should probably be done in a separate place so that this issue can stay on topic — perhaps in a discussion? If someone decides to do that, feel free to link it here.

lucacasonato commented 1 year ago

You cannot sandbox JS code from within JS. For all intents and purposes, the entire JS sandbox must be untrusted.