isso-comments / isso

a Disqus alternative
https://isso-comments.de
MIT License
5.04k stars 437 forks source link

Document CORS handling (was: ISSO_CORS_ORIGIN not working as advertised) #720

Open vsajip opened 3 years ago

vsajip commented 3 years ago

According to the documentation, you can use an ISSO_CORS_ORIGIN environment variable to set e.g. wildcards for the Access-Control-Allow-Origin CORS header. However, it doesn't work in practice, because the code that uses it is as shown in this fragment:

    def func(environ):
        if 'ISSO_CORS_ORIGIN' in environ:
            return environ['ISSO_CORS_ORIGIN']

However, the environ passed into func() is a restricted WSGI environment (request.environ) rather than the full process environment (os.environ). Therefore, the code should use os.environ in the above fragment rather than just environ. Or is there some other way that an ISSO_CORS_ORIGIN OS environment variable is supposed to find its way into a request.environ?

ix5 commented 2 years ago

Using ISSO_CORS_ORIGIN=*.example.tld as written in isso's documentation is seemingly against spec anyway, see this Stackoverflow answer.

I've prepared a branch that rectifies isso's behaviour, see ix5/isso@cors-from-os-environ but that will still result in browsers refusing wildcard subdomains.

The correct way to do this would be to check if the originating URL matches the wildcard of ISSO_CORS_ORIGIN and then return a fully qualified URL (with everything after and including the slash chopped off).

Lucas-C commented 2 years ago

This is strange...

On my side, it works fine

I can confirm that the solution I implemented in #393 does not work 😔

Lucas-C commented 2 years ago

@ix5 Maybe you could create a PR from your branch?

ix5 commented 2 years ago

CORS

First off, a bit of explanation how CORS is supposed to work, since there might have been a few misunderstandings that lead to this.

CORS stands for "Cross-Origin Resource Sharing". It is implemented as a series of HTTP headers called Acces-Control-Allow-Origin (and related ones, see code snippet below).

It is essentially a mechanism to avoid bad actors sniffing credentials or running non-idempotent actions such as PUT or POST by sending credentialed(!) XmlHttpRequests to your domain.

Example: You login to service.example.com, receive a login cookie. You visit evil.site, which makes an XmlHTTPRequest POST request to service.example.com/deleteAccount with {confirm: yes} using the browser's cookie jar for service.example.com.)

Note that modern browsers have other layers of security to prevent this.

Before the browser allows a request from a JS context (and others? not sure) to succeed, it will send an OPTIONS request with the originating URL to the target server, which is then supposed to check if the originating URL is in its allow-list for that resource and answer with an Access-Control-Allow-Origin header that matches the protocol+domain+port part of the originating URL (i.e. with path information stripped), or, if the URL is not allowed to request that content, the default server URL.

If you want to defeat the whole purpose of CORS, you can also return a wildcard with a literal * as the header content. Note that *.mydomain.com however is not a valid response!

For further information and a better explanation, see MDN: CORS.

Isso Implementation

Since Isso is commonly served from a subdomain or maybe even from other domains, CORS is needed for modern browsers.

The part that implements CORS in Isso is inside a "middleware" which modifies werkzeug requests on-the-fly:

https://github.com/posativ/isso/blob/568debf6737d2f3462bdd9baa5ace45031b8ed95/isso/wsgi.py#L125-L137

This appends the result of the origin() function call as a header to the request.

https://github.com/posativ/isso/blob/568debf6737d2f3462bdd9baa5ace45031b8ed95/isso/__init__.py#L134-L135

The list of hosts is constructed from your [general] host config list:

https://github.com/posativ/isso/blob/568debf6737d2f3462bdd9baa5ace45031b8ed95/isso/__init__.py#L204-L207

Finally, the origin() function checks if the request origin is in the list of allowed origins, else returns invalid.local or the first host in the allowlist:

https://github.com/posativ/isso/blob/568debf6737d2f3462bdd9baa5ace45031b8ed95/isso/wsgi.py#L66-L92

This is the critical part, Isso already does the right thing for you: https://github.com/posativ/isso/blob/568debf6737d2f3462bdd9baa5ace45031b8ed95/isso/wsgi.py#L86-L88

This compares the request (with path and query stripped) against the [general] host config items and returns the first match. You can verify this with the debug branch I've created: https://github.com/ix5/isso/tree/DEBUG-cors-uris

Conclusion:

Isso already does the right thing! There is no need to change anything and the introduction of the ISSO_CORS_ORIGIN env variable probably was, I'm very sorry to say, a misunderstanding on your part, @Lucas-C. I agree that the code is quite terse and if you like, you can add comments to make it more readable.


Options

I've thought about this for a bit and came to the following conclusion:

The most sane way of running Isso is either a) via a WSGI server or b) through a WSGI server behind a reverse proxy, which might or might not do TLS termination, or c) via the built-in isso server. Supporting other configurations should be "beware dragons" territory - not "forbidden", but not actively encouraged either.

For a) Isso will already know the request origin, so no issue here For b) Isso should know the request origin, since anything else than a transparent proxy is probably a misconfiguration and will bring problems later on. Example transparent proxy config:

proxy_set_header Host $host;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
proxy_pass http://127.0.0.1:8080;

At least the Host info should be passed to the proxy (=Isso). Isso cannot figure out the right thing if you don't let it know which environment it is running inside. Solving that shortage of information via config options as requested by @vsajip in https://github.com/posativ/isso/issues/735 is not smart and doesn't solve the root issue.

For c) Isso will also already know the request origin, also no issue here

It isn't worth it to introduce another config for people to pair hosts with CORS responses. The proper way is to pass the original request URI all the way through to Isso.

Decision

We should remove the ISSO_CORS_ORIGIN env variable and accompanying docs (you can revert both of those commits). Then, document that Isso will already do the correct thing and ensure consistency regarding proxies in the docs.

Lucas-C commented 2 years ago

Very detailed explanation, thanks @ix5 ! While I knew well about CORS, your comment is utterly accurate and a very nice refresher on how isso handle things.

For the record, I originally wanted to be able to use my self-hosted isso instance at chezsoi.org from a website hosted at subdomain.chezsoi.org. This was this original motivation behind #393.

Hence I think there is a valid use case behind ISSO_CORS_ORIGIN, but I'm OK with either fixing it or removing it.

ix5 commented 2 years ago

For the record, I originally wanted to be able to use my self-hosted isso instance at chezsoi.org from a website hosted at subdomain.chezsoi.org. This was this original motivation behind #393.

So you mean you wanted to avoid listing all possible sites where you would allow commenting and embed the embed.min.js from? I.e. not have this:

[general]
host =
    subdomain.chezsoi.org
    baguette.chezsoi.org
    chezlucas.chezsoi.org
    brioche.chezsoi.org
    [...]

But instead of ISSO_CORS_ORIGIN=*.chezsoi.org.

I can see why you would want that, since that syntax is valid for e.g. CSP headers, but IMO there aren't too many users out there who have more subdomains than they could list in the conf file.

Happy to be convinced otherwise, but that's where I currently stand on this.

Lucas-C commented 2 years ago

That's exactly what I wanted yes 😊

Could this host configuration entry by used for other purposes than CORS HTTP headers? For example, it is used in SMTPConnection and in the HTTP connection check in make_app. I wonder if adding hosts there could have unexpected consequences.

ix5 commented 2 years ago

Could this host configuration entry by used for other purposes than CORS HTTP headers? For example, it is used in SMTPConnection

[...]
self.conf = isso.conf.section("smtp")
[...]
with SMTPConnection(self.conf):
[...]
klass = (smtplib.SMTP_SSL if self.conf.get(
'security') == 'ssl' else smtplib.SMTP)
self.client = klass(host=self.conf.get('host'),
port=self.conf.getint('port'),
timeout=self.conf.getint('timeout'))

Different host, this one is from the [smtp] section: https://github.com/posativ/isso/blob/568debf6737d2f3462bdd9baa5ace45031b8ed95/share/isso.conf#L133-L148

and in the HTTP connection check in make_app.

Only uses the first reachable host for a one-time connectivity check.

I wonder if adding hosts there could have unexpected consequences.

That's... honestly never been a concern to me ;)

Lucas-C commented 2 years ago

Alright 😊

Then go and nuke that useless configuration setting !

Maybe you could also use some bits from your excellent comment above in order to better explain CORS handling in isso docs?

ix5 commented 2 years ago

Maybe you could also use some bits from your excellent comment above in order to better explain CORS handling in isso docs?

Would you mind helping out with that? I don't mean to bully you into spending time on this project you're ready to spare, but me committing most of the code to this project as of late, with practically no one pushing back or looking over the code, is something I'm not really comfortable with.

It isn't sustainable in my eyes, and as you've probably already experienced yourself, interest in OSS contribs comes and goes in waves ;)

Lucas-C commented 2 years ago

I'm OK to review a PR that would do the cleanup, by I cannot promise to take the time to submit it myself anytime soon 😝

ix5 commented 2 years ago

I'm OK to review a PR that would do the cleanup

The change is already merged, see https://github.com/posativ/isso/pull/803

[but] I cannot promise to take the time to submit it myself anytime soon

Don't worry, thank you for your review for the webpack and Jest migration so far, that's also much appreciated! The only thing missing now is an entry into the docs that kind of integrates my previous remarks about CORS.