w3c / beacon

Beacon
https://w3c.github.io/beacon/
Other
46 stars 22 forks source link

switch to whitelist for corsMode #34

Closed igrigorik closed 7 years ago

igrigorik commented 7 years ago

Followup to https://github.com/w3c/beacon/pull/33#issuecomment-229896895.

@annevk @toddreifsteck ptal.

annevk commented 7 years ago

Didn't you want to match XMLHttpRequest and fetch() by also allowing ArrayBuffer and friends? I wonder if they also allow some variant of Blob as long as the MIME type matches one of the allowed ones...

toddreifsteck commented 7 years ago

Yes, ArrayBuffer should be part of the whitelist. Blob will be processed by the no-cors algorithm within Fetch... The only downside to doing it this way is that it will not be redirectable like other input types.

igrigorik commented 7 years ago

@toddreifsteck @annevk good catch, added ArrayBuffer to the "no-cors" clause. Anything else?

annevk commented 7 years ago

You didn't cover the "and friends" part. I guess I should take some time to review this part of Fetch again, it's apparently a little flaky.

annevk commented 7 years ago

Yeah, it seems like Fetch has a bug. It sets the Content-Type header after the header check. 😟

annevk commented 7 years ago

Oh no, it's fine. It uses an append operation that's guarded.

annevk commented 7 years ago

So yeah, fetch() allows a Blob as long as its MIME type is safelisted.

igrigorik commented 7 years ago

@annevk which other friends are there?

Re, "MIME type is safelisted" are you referring to reading the response? Or, is there an equivalent check (which I can't find) for when request is made?

annevk commented 7 years ago

@igrigorik Uint8Array and such?

No, I'm referring to the value of the Content-Type header that the MIME type ends up in. And whether that header/value pair is safelisted or not.

igrigorik commented 7 years ago

@annevk ah, in which case BufferSource should do the trick, right? I don't think we'd want to enumerate all the various typed array view types.

Re, safelisted mimetype: https://fetch.spec.whatwg.org/#cors-safelisted-request-header - that?

Let <var>corsMode</var> be set to "<code>no-cors</code>" if
object's type is one of <code>BufferSource</code>,
<code>FormData</code>, <code>URLSearchParams</code>, 
<code>USVString</code>, or <code>Blob</code> with a 
<a href="https://fetch.spec.whatwg.org/#cors-safelisted-request-header">CORS-safelisted <var>mimeType</var></a>
value. Otherwise, let <var>corsMode</var> be set to "cors".

Getting closer?

annevk commented 7 years ago

What would you think about not switching corsMode and just always use "no-cors" and error on erroneous MIME types?

Basically you extract the bytes and MIME type per the same algorithm as in Fetch.

Then you create a Fetch "header" whose "name" is Content-Type and value is (the extracted) MIME type. Then you check whether that is a "CORS-safelisted request header". If it's not, throw. Otherwise, proceed.

igrigorik commented 7 years ago

That would certainly simplify the spec but it would also disallow cors-mode requests via sendBeacon.. If an origin is willing to do the CORS dance, why disallow sendBeacon from supporting that use case?

(updated pull to use BufferSource)

annevk commented 7 years ago

Because the way you go about it is fairly magical and not adding much value. If you want sendBeacon() to just be form, I'd rather have it be equally simple.

annevk commented 7 years ago

(Even if you wanted to keep "cors", which I no longer think we should do, you can still switch to my approach but rather than fail just switch the mode.)

toddreifsteck commented 7 years ago

I believe #33 resolves the block vs. allow list issue. The discussion of how to cleanly spec sendBeacon on top of fetch seems like a future discussion that we should take up in another thread.

annevk commented 7 years ago

Well, #33 was resolved with late objections. If you want to move this discussion elsewhere again, be my guest, but it's getting rather frustrating.

And this is not a discussion of how to layer things on top of Fetch, this is a discussion about functionality and utility.

igrigorik commented 7 years ago

Because the way you go about it is fairly magical and not adding much value. If you want sendBeacon() to just be form, I'd rather have it be equally simple.

I don't believe "be like form" is or was our explicit goal. From the discussion on the call earlier this week, the main risk here is that we already have sendBeacon out in the wild and changing it to disallow Blob's runs the risk that we'll break existing deployments. Also, neither Chrome nor Edge has existing telemetry to track which content types are being sent.. so, we'd be making a blind decision. Does FF have something that could help us make the call?

In absence of data to answer the above, I think we can still make progress here as you suggested, but allow cors workflow after checking the content type. Based on your previous comments:

Is that consistent with what you're thinking?

annevk commented 7 years ago

Yes, but I would prefer not falling back to "cors" for such an obscure case.

igrigorik commented 7 years ago

@annevk does FF have any telemetry on existing content-type usage for sendBeacon?

In absence of real-world usage data my proposal is we adopt the above language that allows "cors" to help normalize behavior between various implementations. As a non-blocking item: add instrumentation to gather telemetry on usage and then revisit the decision to disallow "cors" once we have some numbers.

annevk commented 7 years ago

I don’t think we do. Chrome does have CORS telemetry, but might be too generic.

igrigorik commented 7 years ago

@annevk thanks. Are you OK with the proposal in my earlier comment, until we get some real-world telemetry?

annevk commented 7 years ago

Yeah, modulo some minor wording issues that seems good.

igrigorik commented 7 years ago

@annevk @toddreifsteck updated, how does that look?

annevk commented 7 years ago

I guess we can land this modulo nits, but we should open an issue to change the language (if needed) based on how https://bugs.chromium.org/p/chromium/issues/detail?id=490015 gets resolved.

igrigorik commented 7 years ago

@annevk updated, thanks for the help on this one.

@toddreifsteck can you please review and merge if it all looks good from your end?

tyoshino commented 7 years ago

The change looks good.

Also, neither Chrome nor Edge has existing telemetry to track which content types are being sent.. so, we'd be making a blind decision. Does FF have something that could help us make the call?

I'm adding UMA for http://crbug.com/490015 at https://codereview.chromium.org/2184973002/

ArrayBuffer

Chrome's sendBeacon has been sending "Content-Type: application/octet-stream" since it was shipped in the middle of 2014 which is not simple (I just noticed it yesterday).

igrigorik commented 7 years ago

@tyoshino awesome, curious to see the data!

igrigorik commented 7 years ago

Based on data for only a few days, but we're seeing that ~5% of sendBeacon user pages are using a Blob with non-simple content type for the body to send. -- tyoshino

That's more than I would have guessed.. I think what we have in this pull is the right solution; I don't think we can drop 'cors' case.

plehegar commented 7 years ago

LGTM

bzbarsky commented 7 years ago

@plehegar This didn't update https://w3c.github.io/beacon/#privacy to match the normative spec, right?

annevk commented 7 years ago

Filed #43 on that section.

plehegar commented 7 years ago

@annevk did the right thing to file a new issue.

bzbarsky commented 7 years ago

I wasn't sure whether I was just misreading things is all.