w3c / web-share-target

Web API proposal for receiving shared data
https://w3c.github.io/web-share-target/
Other
191 stars 20 forks source link

Define and use reserved percent-encode set #32

Closed mgiuca closed 6 years ago

mgiuca commented 6 years ago

This fixes the set of characters that are encoded when substituting into a placeholder, to include all the "reserved" and illegal characters from RFC 2396 (matching the behaviour of encodeURIComponent).

There is no appropriate set to use in the URL Standard. whatwg/url#369 discusses adding this (also needed for properly specifying registerProtocolHandler), but for now, I'm just putting it in the Web Share Target spec.

Closes #9.


Preview | Diff

mgiuca commented 6 years ago

Hey Marcos,

I have a bunch of upcoming changes to Web Share Target. Do you want me to run them by you, or should I just get some people at Google to review? I don't want to bother you.

Also @annevk --- this makes a start towards whatwg/url#369. I know this should be defined in URL Standard but I'd like to get WST working properly first. Then I'll work towards migrating it upstream.

marcoscaceres commented 6 years ago

Do you want me to run them by you, or should I just get some people at Google to review? I don't want to bother you.

Sure! It's not a bother at all - but I'm not completely up to speed on the spec. I'm always happy to do an initial review of things if you need another pair of eyes.

mgiuca commented 6 years ago

OK I'll wait and see if @annevk has any thoughts about this. Note that I think we could spend awhile debating the exact set of characters to put into this set. Since I have other changes I'd like to make, I'd prefer to land this first and then discuss the character set later.

mgiuca commented 6 years ago

@ericwilligers PTAL

ericwilligers commented 6 years ago

Need to omit ! * ' ( ) as these are in the "unreserved set".

mgiuca commented 6 years ago

@ericwilligers Thanks.

Need to omit ! * ' ( ) as these are in the "unreserved set".

For clarity: in the RFC 2396 "unreserved set".

The problem (as discussed offline, but just for anyone else's benefit) is that I made the list based on the RFC 3986 unreserved set, but I was intending to use the RFC 2396 set.

The reason to use the older RFC 2396 set is that encodeURIComponent uses that set, as does Chrome's generic "escape a string for a URL" function (implying that use of that set is pretty common). We don't need to use that function in Chrome, if we decide to use a different set here, but the path of least resistance seems to be using that (smaller) set. So I will remove those 5 characters.

annevk commented 6 years ago

If you actually want the URL parser to change to use this I think it would make more sense to a depth-first approach since that may or may not be successful. If you want to do this regardless of whether that can change I suppose this makes sense...

mgiuca commented 6 years ago

@annevk I think you are saying that if I don't want a special exception here forever, I should make the change in URL Standard first, before doing it here.

That's fair, but since this is a draft (which will go through many rounds of review), I think it's fine to have it here now. The possible directions we go on after this are:

  1. The reserved set (possibly with changes) gets added to URL Standard, and then we refer to it from here, or
  2. We find a way to use an existing standard (perhaps we directly call ECMAScript's encodeURIComponent) so we don't need to define a separate set, or
  3. We decide to just leave the set in this spec as an isolated thing. (Though that is not ideal, and it should at least share the list of codepoints with registerProtocolHandler).

All three of those can happen, so I'll just land this for now.