simonw / datasette

An open source multi-tool for exploring and publishing data
https://datasette.io
Apache License 2.0
9.2k stars 656 forks source link

Make sure CORS works for write APIs #1922

Closed simonw closed 1 year ago

simonw commented 1 year ago

Split from:

simonw commented 1 year ago

I tried running fetch() with a POST from a separate domain and got a browser error because it did a GET against the /db/-/create endpoint and the 405 method not supported response did not include the CORS headers.

simonw commented 1 year ago

Here's why:

https://github.com/simonw/datasette/blob/4ddd77e51254bda3bac990ea662bac2e6b29c5e0/datasette/views/base.py#L71-L79

That's code in BaseView - but it turns out the code that adds CORS headers is in the DataView subclass of that (which the various write API endpoints do not use).

https://github.com/simonw/datasette/blob/4ddd77e51254bda3bac990ea662bac2e6b29c5e0/datasette/views/base.py#L158-L162

simonw commented 1 year ago

I'll test this once it's deployed to https://latest.datasette.io/

simonw commented 1 year ago

Two test failures:

____________________________ test_homepage_options _____________________________
[gw0] linux -- Python 3.11.0 /opt/hostedtoolcache/Python/3.11.0/x64/bin/python

app_client = <datasette.utils.testing.TestClient object at 0x7f4c489269d0>

    def test_homepage_options(app_client):
        response = app_client.get("/", method="OPTIONS")
>       assert response.status == 405
E       assert 200 == 405
E        +  where 200 = <datasette.utils.testing.TestResponse object at 0x7f4c4892f4d0>.status

/home/runner/work/datasette/datasette/tests/test_html.py:58: AssertionError
______________________ test_client_methods[options-/-405] ______________________
[gw1] linux -- Python 3.11.0 /opt/hostedtoolcache/Python/3.11.0/x64/bin/python

datasette = <datasette.app.Datasette object at 0x7fc33c227550>
method = 'options', path = '/', expected_status = 405

    @pytest.mark.asyncio
    @pytest.mark.parametrize(
        "method,path,expected_status",
        [
            ("get", "/", 200),
            ("options", "/", 405),
            ("head", "/", 200),
            ("put", "/", 405),
            ("patch", "/", 405),
            ("delete", "/", 405),
        ],
    )
    async def test_client_methods(datasette, method, path, expected_status):
        client_method = getattr(datasette.client, method)
        response = await client_method(path)
        assert isinstance(response, httpx.Response)
>       assert response.status_code == expected_status
E       assert 200 == 405
E        +  where 200 = <Response [200 OK]>.status_code

/home/runner/work/datasette/datasette/tests/test_internals_datasette_client.py:29: AssertionError
=============================== warnings summary ===============================
tests/test_cli.py::test_inspect_cli_writes_to_file
tests/test_cli.py::test_inspect_cli
  /home/runner/work/datasette/datasette/datasette/cli.py:163: DeprecationWarning: There is no current event loop
    loop = asyncio.get_event_loop()

tests/test_cli_serve_get.py: 2 warnings
tests/test_cli.py: 12 warnings
tests/test_crossdb.py: 1 warning
  /home/runner/work/datasette/datasette/datasette/cli.py:591: DeprecationWarning: There is no current event loop
    asyncio.get_event_loop().run_until_complete(ds.invoke_startup())

tests/test_cli_serve_get.py: 2 warnings
tests/test_cli.py: 12 warnings
tests/test_crossdb.py: 1 warning
  /home/runner/work/datasette/datasette/datasette/cli.py:594: DeprecationWarning: There is no current event loop
    asyncio.get_event_loop().run_until_complete(check_databases(ds))

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
=========================== short test summary info ============================
FAILED tests/test_html.py::test_homepage_options - assert 200 == 405
 +  where 200 = <datasette.utils.testing.TestResponse object at 0x7f4c4892f4d0>.status
FAILED tests/test_internals_datasette_client.py::test_client_methods[options-/-405] - assert 200 == 405
 +  where 200 = <Response [200 OK]>.status_code
====== 2 failed, 1195 passed, 1 skipped, 32 warnings in 191.06s (0:03:11) ======
Error: Process completed with exit code 1.

On reading https://developer.mozilla.org/en-US/docs/Web/HTTP/Methods/OPTIONS I feel like I should be a bit more thoughtful about how I treat OPTIONS - maybe it should work for every URL on the site, but return a 204 No Content header?

Comparing a few different sites:

~ % curl -X OPTIONS https://www.google.com/ -i
HTTP/2 405 
allow: GET, HEAD
date: Wed, 30 Nov 2022 18:18:15 GMT
content-type: text/html; charset=UTF-8
server: gws
content-length: 1592
x-xss-protection: 0
x-frame-options: SAMEORIGIN
alt-svc: h3=":443"; ma=2592000,h3-29=":443"; ma=2592000,h3-Q050=":443"; ma=2592000,h3-Q046=":443"; ma=2592000,h3-Q043=":443"; ma=2592000,quic=":443"; ma=2592000; v="46,43"

<!DOCTYPE html>
<html lang=en>
  <meta charset=utf-8>
  <meta name=viewport content="initial-scale=1, minimum-scale=1, width=device-width">
  <title>Error 405 (Method Not Allowed)!!1</title>
  <style>
    *{margin:0;padding:0}html,code{font:15px/22px arial,sans-serif}html{background:#fff;color:#222;padding:15px}body{margin:7% auto 0;max-width:390px;min-height:180px;padding:30px 0 15px}* > body{background:url(//www.google.com/images/errors/robot.png) 100% 5px no-repeat;padding-right:205px}p{margin:11px 0 22px;overflow:hidden}ins{color:#777;text-decoration:none}a img{border:0}@media screen and (max-width:772px){body{background:none;margin-top:0;max-width:none;padding-right:0}}#logo{background:url(//www.google.com/images/branding/googlelogo/1x/googlelogo_color_150x54dp.png) no-repeat;margin-left:-5px}@media only screen and (min-resolution:192dpi){#logo{background:url(//www.google.com/images/branding/googlelogo/2x/googlelogo_color_150x54dp.png) no-repeat 0% 0%/100% 100%;-moz-border-image:url(//www.google.com/images/branding/googlelogo/2x/googlelogo_color_150x54dp.png) 0}}@media only screen and (-webkit-min-device-pixel-ratio:2){#logo{background:url(//www.google.com/images/branding/googlelogo/2x/googlelogo_color_150x54dp.png) no-repeat;-webkit-background-size:100% 100%}}#logo{display:inline-block;height:54px;width:150px}
  </style>
  <a href=//www.google.com/><span id=logo aria-label=Google></span></a>
  <p><b>405.</b> <ins>That’s an error.</ins>
  <p>The request method <code>OPTIONS</code> is inappropriate for the URL <code>/</code>.  <ins>That’s all we know.</ins>
~ % curl -X OPTIONS https://www.mozilla.org/ -i
HTTP/2 405 
content-type: text/html; charset=utf-8
content-length: 0
server: meinheld/1.0.2
date: Wed, 30 Nov 2022 18:18:38 GMT
allow: GET, HEAD
x-frame-options: DENY
content-security-policy: child-src 'self' *.mozilla.net *.mozilla.org *.mozilla.com www.googletagmanager.com www.google-analytics.com www.youtube-nocookie.com trackertest.org www.surveygizmo.com accounts.firefox.com accounts.firefox.com.cn www.youtube.com; connect-src 'self' *.mozilla.net *.mozilla.org *.mozilla.com www.googletagmanager.com www.google-analytics.com region1.google-analytics.com logs.convertexperiments.com 1003350.metrics.convertexperiments.com 1003343.metrics.convertexperiments.com sentry.prod.mozaws.net o1069899.sentry.io o1069899.ingest.sentry.io https://accounts.firefox.com/ stage.cjms.nonprod.cloudops.mozgcp.net cjms.services.mozilla.com; frame-src 'self' *.mozilla.net *.mozilla.org *.mozilla.com www.googletagmanager.com www.google-analytics.com www.youtube-nocookie.com trackertest.org www.surveygizmo.com accounts.firefox.com accounts.firefox.com.cn www.youtube.com; script-src 'self' *.mozilla.net *.mozilla.org *.mozilla.com 'unsafe-inline' 'unsafe-eval' www.googletagmanager.com www.google-analytics.com tagmanager.google.com www.youtube.com s.ytimg.com cdn-3.convertexperiments.com app.convert.com data.track.convertexperiments.com 1003350.track.convertexperiments.com 1003343.track.convertexperiments.com; img-src 'self' *.mozilla.net *.mozilla.org *.mozilla.com data: mozilla.org www.googletagmanager.com www.google-analytics.com adservice.google.com adservice.google.de adservice.google.dk creativecommons.org cdn-3.convertexperiments.com logs.convertexperiments.com images.ctfassets.net ad.doubleclick.net; style-src 'self' *.mozilla.net *.mozilla.org *.mozilla.com 'unsafe-inline' app.convert.com; default-src 'self' *.mozilla.net *.mozilla.org *.mozilla.com; font-src 'self'
cache-control: max-age=600
expires: Wed, 30 Nov 2022 18:28:38 GMT
x-backend-server: bedrock-prod-web-b95bc569d-grd25.iowa-a
strict-transport-security: max-age=31536000
x-content-type-options: nosniff
x-xss-protection: 1; mode=block
referrer-policy: strict-origin-when-cross-origin
via: 1.1 google, 1.1 6c90b631453c435bd0022caa657b67e8.cloudfront.net (CloudFront)
x-cache: Error from cloudfront
x-amz-cf-pop: SFO5-P2
x-amz-cf-id: A6-9mLztaE2tz840CbV9bXYiBMZRKEamDj6jGGEl1U7sg8egWfsDqg==

~ % curl -X OPTIONS https://example.com -i     
HTTP/2 200 
allow: OPTIONS, GET, HEAD, POST
cache-control: max-age=604800
content-type: text/html; charset=UTF-8
date: Wed, 30 Nov 2022 18:18:59 GMT
expires: Wed, 07 Dec 2022 18:18:59 GMT
server: EOS (vny/0451)
content-length: 0
simonw commented 1 year ago

Weird, GitHub reply with a 404!

~ % curl -X OPTIONS https://github.com/ -i
HTTP/2 404 
server: GitHub.com
date: Wed, 30 Nov 2022 18:19:39 GMT
content-type: text/html; charset=utf-8
content-length: 0
strict-transport-security: max-age=31536000; includeSubdomains; preload
x-frame-options: deny
x-content-type-options: nosniff
x-xss-protection: 0
referrer-policy: origin-when-cross-origin, strict-origin-when-cross-origin
content-security-policy: default-src 'none'; base-uri 'self'; block-all-mixed-content; child-src github.com/assets-cdn/worker/ gist.github.com/assets-cdn/worker/; connect-src 'self' uploads.github.com objects-origin.githubusercontent.com www.githubstatus.com collector.github.com raw.githubusercontent.com api.github.com github-cloud.s3.amazonaws.com github-production-repository-file-5c1aeb.s3.amazonaws.com github-production-upload-manifest-file-7fdce7.s3.amazonaws.com github-production-user-asset-6210df.s3.amazonaws.com cdn.optimizely.com logx.optimizely.com/v1/events; font-src github.githubassets.com; form-action 'self' github.com gist.github.com objects-origin.githubusercontent.com; frame-ancestors 'none'; frame-src viewscreen.githubusercontent.com notebooks.githubusercontent.com; img-src 'self' data: github.githubassets.com media.githubusercontent.com camo.githubusercontent.com identicons.github.com avatars.githubusercontent.com github-cloud.s3.amazonaws.com objects.githubusercontent.com objects-origin.githubusercontent.com secured-user-images.githubusercontent.com/ opengraph.githubassets.com github-production-user-asset-6210df.s3.amazonaws.com customer-stories-feed.github.com spotlights-feed.github.com; manifest-src 'self'; media-src github.com user-images.githubusercontent.com/ secured-user-images.githubusercontent.com/; script-src github.githubassets.com; style-src 'unsafe-inline' github.githubassets.com; worker-src github.com/assets-cdn/worker/ gist.github.com/assets-cdn/worker/
vary: Accept-Encoding, Accept, X-Requested-With
x-github-request-id: DD6B:5ACA:102E8A6:1164A99:63879EBB
simonw commented 1 year ago

Started a conversation about how OPTIONS should work on Mastodon: https://fedi.simonwillison.net/@simon/109434148676475291

simonw commented 1 year ago

@simon IMO, it should always be a 2XX series response, typically with no content & an extra Allow header with a list of HTTP verbs it responds to.

https://mastodon.social/@daniellindsley/109434186252099323

simonw commented 1 year ago

Here's what Django Rest Framework does: https://github.com/encode/django-rest-framework/blob/1ae812ea209392ad76cc5d2f35f9f7fb337f63e4/rest_framework/views.py#L514-L521

    def options(self, request, *args, **kwargs):
        """
        Handler method for HTTP 'OPTIONS' request.
        """
        if self.metadata_class is None:
            return self.http_method_not_allowed(request, *args, **kwargs)
        data = self.metadata_class().determine_metadata(request, self)
        return Response(data, status=status.HTTP_200_OK)

That default determine_metadata method looks like this: https://github.com/encode/django-rest-framework/blob/1ae812ea209392ad76cc5d2f35f9f7fb337f63e4/rest_framework/metadata.py#L61-L71

    def determine_metadata(self, request, view):
        metadata = OrderedDict()
        metadata['name'] = view.get_view_name()
        metadata['description'] = view.get_view_description()
        metadata['renders'] = [renderer.media_type for renderer in view.renderer_classes]
        metadata['parses'] = [parser.media_type for parser in view.parser_classes]
        if hasattr(view, 'get_serializer'):
            actions = self.determine_actions(request, view)
            if actions:
                metadata['actions'] = actions
        return metadata
simonw commented 1 year ago

Still getting a CORS error:

image

My hunch is this is because I'm not sending Access-Control-Allow-Methods: GET,HEAD,POST.

simonw commented 1 year ago

That notebook:

viewof token = Inputs.text({
  label: "Your API token"
})
viewof createResponse = Inputs.button("Create table", {
  value: null,
  reduce: async () => {
    const response = await fetch(
      "https://latest.datasette.io/ephemeral/-/create",
      {
        method: "POST",
        mode: "cors",
        headers: {
          Authorization: `Bearer {$token}`,
          "Content-Type": "application/json"
        },
        body: JSON.stringify({
          table: "my_new_table",
          row: {
            task: "Demonstrate a JSON creation from another domain"
          }
        })
      }
    );
    return await response.json();
  }
})

Based on this tip: https://talk.observablehq.com/t/best-pattern-for-click-here-to-submit-your-results-to-an-api-backend/7353/3

simonw commented 1 year ago

I just shipped this:

Access-Control-Allow-Methods: GET, POST, HEAD, OPTIONS

I'll try this out on latest.datasette.io - but I need to research more to check if this is a safe thing to do or not.

simonw commented 1 year ago

Still getting a CORS header.

I tried it in Chrome, which outputs helpful messages to the console:

image
simonw commented 1 year ago

That worked for the preflight request - got this now:

image

So it looks like error responses (in this case for permission denied) are missing CORS headers.