WICG / compression-dictionary-transport


Escape character and ? for URL matching #42

Closed: horo-t closed 9 months ago

horo-t commented 11 months ago

In Chromium, we use the MatchPattern() method for URL matching.

The MatchPattern() method supports both ? and *: ? matches 0 or 1 character, and * matches 0 or more characters. The backslash character (\) can also be used as an escape character for * and ?.

The current proposal's Dictionary URL matching doesn't support \. Also it doesn't support ?.

I think ? is useful. But ? also appears in URLs as the delimiter before the query string, so a literal ? needs a way to be escaped. So I think we should support both ? and \.
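To make the semantics in this comment concrete, here is a minimal Python sketch of a matcher in which ? matches 0 or 1 character, * matches 0 or more characters, and \ escapes the following character. It is not Chromium's MatchPattern() implementation; the function name match_pattern and the handling of edge cases (a backslash before any character makes that character literal, and a trailing backslash is treated as a literal backslash) are assumptions made for illustration.

```python
from functools import lru_cache


def match_pattern(text: str, pattern: str) -> bool:
    """Hypothetical matcher: '?' = 0 or 1 char, '*' = 0 or more, '\\' escapes."""

    @lru_cache(maxsize=None)
    def match(ti: int, pi: int) -> bool:
        # Pattern exhausted: succeed only if the text is exhausted too.
        if pi == len(pattern):
            return ti == len(text)
        ch = pattern[pi]
        if ch == "*":
            # Either '*' matches nothing, or it consumes one more character.
            return match(ti, pi + 1) or (ti < len(text) and match(ti + 1, pi))
        if ch == "?":
            # Either '?' matches nothing, or it consumes exactly one character.
            return match(ti, pi + 1) or (ti < len(text) and match(ti + 1, pi + 1))
        if ch == "\\" and pi + 1 < len(pattern):
            # Assumption: the escaped character must appear literally.
            return (
                ti < len(text)
                and text[ti] == pattern[pi + 1]
                and match(ti + 1, pi + 2)
            )
        # Ordinary character (or trailing backslash): literal comparison.
        return ti < len(text) and text[ti] == ch and match(ti + 1, pi + 1)

    return match(0, 0)
```

Under these assumptions, the pattern /search?q=* matches both /searchq=1 and /searchXq=1, because the unescaped ? acts as a wildcard. Only the escaped form /search\?q=* requires a literal ? before the query string, which is the ambiguity raised above.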

pmeenan commented 11 months ago

AFAIK, MatchPattern() isn't spec'd anywhere and the benefit would be mostly for the Chromium implementation. The * wildcard with no escaping is easy enough to describe and implement and should handle all of the use cases that we have come up with. Not including escaping makes it easier for the humans who will likely be specifying the paths as well.

It's possible there is a use case we haven't come across yet where a single-character wildcard is necessary.

I'd be more inclined to support filesystem path-like wildcards if there was an existing RFC or precedent in other standards for doing it, since we're already doing some level of path-relative expansion, but I haven't been able to find any.
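For comparison, the escape-free * wildcard described in this comment can be sketched in a few lines of Python. This is an illustrative sketch, not the proposal's normative matching algorithm; the function name wildcard_match is hypothetical. Every character other than * is compared literally, so ? needs no escaping.

```python
def wildcard_match(url: str, pattern: str) -> bool:
    """Hypothetical matcher where '*' matches 0 or more characters, no escaping."""
    parts = pattern.split("*")
    if len(parts) == 1:
        # No wildcard at all: the pattern must equal the URL exactly.
        return url == pattern

    first, *middle, last = parts
    # The text before the first '*' must be a prefix of the URL.
    if not url.startswith(first):
        return False

    pos = len(first)
    # Each piece between wildcards must occur, in order, after the previous one.
    for piece in middle:
        idx = url.find(piece, pos)
        if idx == -1:
            return False
        pos = idx + len(piece)

    # The text after the last '*' must be a suffix that does not overlap
    # the part of the URL already consumed.
    return len(url) - pos >= len(last) and url.endswith(last)
```

With this matcher, /search?q=* matches /search?q=1 but not /searchXq=1, since ? is an ordinary character. The trade-off is that there is no way to express a single-character wildcard or a literal *.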

horo-t commented 11 months ago

Do you think using the pattern of URLPattern API could be another option?

The URLPattern API supports regular expressions, but regular expressions are too powerful. The new proposal for a static routing API for Service Worker (explainer) uses URLPattern, and the current proposal prohibits regexp type tokens. I think we should also prohibit regexp type tokens for compression dictionary transport.

+CC: @yoshisatoyanagisawa @wanderview @sisidovski

yoshisatoyanagisawa commented 11 months ago

To prevent regexp, we followed how URLPattern is used for the Tabbed mode home tab scope. (crbug.com/1381374)

I am not sure whether URLPattern's wildcard is aligned with POSIX.2 2.13 Pattern Matching Notation, which might be used for pathname expansion in Unix shells. https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_06

pmeenan commented 11 months ago

I filed an issue with URLPattern to see if it would make sense to split out a good chunk of the spec into an RFC. As it stands right now, trying to pull the existing URLPattern spec language into the compression dictionary ID would be way more complicated than it was worth, but if there was an RFC-standardized way to specify the patterns it would be trivial.

I don't know that we need most of the functionality that it provides for this use case but the flexibility won't hurt either (as long as clients implementing the pattern matching support URLPattern already).

domenic commented 10 months ago

Oh, I just filed #48 about using URLPattern. Please do that! As noted in https://github.com/WICG/urlpattern/issues/180, the format of the spec is not an issue.