nodeSolidServer / node-solid-server

Solid server on top of the file-system in NodeJS
https://solidproject.org/for-developers/pod-server

Needed fix for corsProxy (server operators must read) #1768

Open RubenVerborgh opened 8 months ago

RubenVerborgh commented 8 months ago

Action required

If you are an NSS server operator, please check that your settings use the default "corsProxy": false. If you have a public facing server with "corsProxy": true, please change it to "corsProxy": false until the suggested fix below is deployed.

Fix

The CORS proxy needs to be changed as follows:

csarven commented 8 months ago
  • If no Origin field present in the HTTP request, respond with a 400 or similar.

I can see this as important for limiting unintended use, but it needs some testing and scenarios. The endpoint would need to require authentication anyway so that it is not a public proxy, which is the main issue for limiting its use to legitimate requests. In that case, the request is probably going to include the Origin anyway.

  • If the Origin value in the request is not the server's configured domain (podhost.example) or a direct subdomain thereof (alice.podhost.example), respond with 400 or similar.

That would couple the origin of the application making the request to the origin of the server, essentially only allowing the proxy to be used by applications hosted on the same domain. This doesn't seem appropriate or desirable generally speaking, but it may be something a particular server wishes to enforce. Perhaps it needs a separate flag to make that distinction (whatever the default is).
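For illustration, the two Origin rules quoted above could be sketched roughly as follows. This is a minimal sketch and not NSS code: `isAllowedOrigin` and its `serverHost` parameter are hypothetical names, and `podhost.example` is the placeholder domain from the proposal. It assumes plain hostnames (no IPv6 literals) and treats a malformed Origin the same as a missing one.

```typescript
// Hypothetical helper, not part of NSS: returns true only if the Origin
// header is present and names the server's own host or a direct subdomain.
function isAllowedOrigin(origin: string | undefined, serverHost: string): boolean {
  // Rule 1: no Origin header means the request is rejected.
  if (!origin) {
    return false;
  }
  // An origin has the shape "scheme://host[:port]" with no path.
  const match = /^https?:\/\/([^\/:]+)(?::\d+)?$/i.exec(origin);
  if (!match) {
    return false;
  }
  const host = match[1].toLowerCase();
  const base = serverHost.toLowerCase();
  if (host === base) {
    return true;
  }
  // Rule 2: accept exactly one extra label in front of the server host,
  // e.g. "alice.podhost.example" but not "a.b.podhost.example".
  const suffix = "." + base;
  if (host.length > suffix.length && host.slice(-suffix.length) === suffix) {
    const label = host.slice(0, host.length - suffix.length);
    return label.indexOf(".") === -1;
  }
  return false;
}
```

A caller would answer with 400 (or similar) whenever this returns `false`. Whether the check is on by default or sits behind a separate flag, as suggested above, is left open here.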

  • If, after satisfying the above two conditions, the response from the downstream server does not indicate an RDF content type in its headers (such as Turtle, HTML, etc.), respond with 400.
    • In particular, images, videos, PDFs etc. must result in a 400.
    • The connection to the downstream server can and should be closed prematurely if the content type is not RDF.

Why? This would potentially leak the original Origin of the request when a document includes embedded content.
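To make the third rule concrete, the content-type test could look something like the sketch below. The helper name `isRdfContentType` and the exact list of media types are assumptions for illustration; the proposal only names Turtle and HTML as examples, so which types count as "RDF" is still to be decided.

```typescript
// Assumed list of RDF-bearing media types. Only Turtle and HTML are named
// in the proposal; the rest are common RDF serializations added here.
const RDF_MEDIA_TYPES: string[] = [
  "text/turtle",
  "text/n3",
  "application/ld+json",
  "application/rdf+xml",
  "application/n-triples",
  "application/n-quads",
  "application/trig",
  "text/html",
  "application/xhtml+xml"
];

// Hypothetical helper, not part of NSS: inspects the downstream
// Content-Type header and reports whether it is on the list above.
function isRdfContentType(header: string | undefined): boolean {
  if (!header) {
    return false;
  }
  // Drop parameters such as "; charset=utf-8" and normalise case.
  const mediaType = header.split(";")[0].trim().toLowerCase();
  return RDF_MEDIA_TYPES.indexOf(mediaType) !== -1;
}
```

Because the decision only needs the response headers, a proxy could run this as soon as they arrive and close the downstream connection before reading any body, which matches the "closed prematurely" sub-bullet.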

jeff-zucker commented 8 months ago

What is the motivation for this change? Are servers reporting overuse? Is there a security issue?

ewingson commented 8 months ago

I'm not sure if I'm Forrest Gump or alike, but in the meantime I added that requested variable to config.json

csarven commented 8 months ago

There are different concerns.

One is about requiring authentication on the proxy endpoint: https://github.com/nodeSolidServer/node-solid-server/issues/1769 .

Another concern is making sure that community server providers offering a proxy have a good handle on what they are offering their users, since it risks violating their ToS: https://github.com/solid/solidcommunity.net/issues/73 .

The details of the HTTP interaction are a separate open discussion.

bourgeoa commented 8 months ago

@jeff-zucker could you give one or more of your real use cases?

Would a whitelist of valid origins for the CORS proxy be a solution, so that requests are only proxied to origins on that list? Could this whitelist be the NSS trusted app origins stored in the extended profile?
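As a rough sketch of what such a whitelist check might involve: the names `originOf` and `isOnAllowlist` are hypothetical, and where the list would come from (server config or the trusted app origins in a profile) is exactly the open question here. The comparison is a plain string match on `scheme://host[:port]`, so default ports are not normalised.

```typescript
// Hypothetical helper: reduce an http(s) URL to "scheme://host[:port]",
// or return null for anything else. Default ports are not normalised.
function originOf(url: string): string | null {
  const match = /^(https?):\/\/([^\/?#]+)/i.exec(url);
  if (!match) {
    return null;
  }
  return (match[1] + "://" + match[2]).toLowerCase();
}

// Hypothetical helper: true only if the URL's origin appears verbatim
// (lowercase) in the allowlist.
function isOnAllowlist(url: string, allowlist: string[]): boolean {
  const origin = originOf(url);
  return origin !== null && allowlist.indexOf(origin) !== -1;
}
```

The same check works whether the list is applied to the requesting application's origin or to the target being proxied. Only the URL passed in changes.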

jeff-zucker commented 8 months ago

@jeff-zucker could you give one or more of your real use cases?

Retrieving any RSS feed. Retrieving ontologies housed on CORS-blocking servers.

Would a whitelist of valid origins for the CORS proxy be a solution, so that requests are only proxied to origins on that list? Could this whitelist be the NSS trusted app origins stored in the extended profile?

The proxy is offered by the server, not a particular pod.

timbl commented 7 months ago

Maybe make the proxy base URI a capability URI that only a logged-in user will know -- but not requiring authentication.

csarven commented 7 months ago

Besides the unrealistic case where the user manually enters the proxy base URI into their application, the application needs to be able to discover the proxy base URI.

I'm not a fan of the idea of the proxy not requiring authentication, because the URL could potentially be leaked. But for that to work, and to somewhat minimise potential exposure, we would need to do something like:

For discovery, see also:


That said, is there a particular (implementation specific?) reason why the proxy resource can't require authentication?