webspecs / url

The URL specification
https://specs.webplatform.org/url/webspecs/develop/

URL comparison methods #20

Closed marcoscaceres closed 9 years ago

marcoscaceres commented 9 years ago

It would be really nice if this API included common comparison algorithms for URL.

In particular, including a way to check if:

marcoscaceres commented 9 years ago

In particular, I need "subsumes" for Web Manifest's "navigation scope". The problem I'm trying to solve in manifests is described in more detail in HTML Bug 27653.

Ideally, I would like to just hook into the URL spec to do the comparison for me (even if it doesn't end up reflected in the API, though that would be sad because it seems useful).

Service Workers may also benefit from having this algorithm, as they need something similar when they check whether a fetch request is in scope of some URL space that a service worker controls. Ideally, both manifest and SWs would use the same algorithm to do this comparison.

rubys commented 9 years ago

Related: https://www.w3.org/Bugs/Public/show_bug.cgi?id=27640

I agree that working on an API seems useful. Do you know if there is any browser interest in adding support for this? Three methods that return boolean values should be easy enough to add. I'm thinking static methods on the URL object that accept two strings as parameters would be a better approach than instance methods on URLUtils. Does this seem reasonable?

marcoscaceres commented 9 years ago

On Friday, December 19, 2014, Sam Ruby notifications@github.com wrote:

Related: https://www.w3.org/Bugs/Public/show_bug.cgi?id=27640

I agree that working on an API seems useful. Do you know if there is any browser interest in adding support for this?

Yes. I work for Mozilla's DOM team and we would be interested.

I can ask other browser folks to chime in if needed.

Three methods that return boolean values should be easy enough to add. I'm thinking static methods on the URL object that accept two strings as parameters would be a better approach than instance methods on URLUtils. Does this seem reasonable?

Sounds good - I was thinking the same. However, they could accept either a URL object or string. URL object might be easier tho; so you don't need to deal with relative URLs etc.


rubys commented 9 years ago

I work for Mozilla's DOM team and we would be interested.

Cool! Would I be pushing my luck to suggest that you follow this advice :smirk: ?

URL object might be easier tho

I'm concerned about implicitly encouraging people to violate this advice.

Finally, I want to capture that you have made a concrete proposal for method names. Thanks! This seems to have generated a productive discussion. Whether in the form of a pull request or simply comments, I encourage the results to be brought back here.

marcoscaceres commented 9 years ago

Cool! Would I be pushing my luck to suggest that you follow this advice :smirk: ?

Not at all, I can have a go at it.

I'm concerned about implicitly encouraging people to violate this advice.

Good point. Besides, saying it takes a string (rather than an object or string) will just coerce a URL object into a string, which is fine. Gives the same results :+1:

Finally, I want to capture that you have made a concrete proposal for method names.

Sounds like a plan.
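
To illustrate the coercion point in this comment: a URL object stringifies to its serialized href, so a method that only accepts strings sees the same input either way. A minimal sketch, assuming a hypothetical helper `sameAfterCoercion` and an example.com URL, neither of which comes from the thread:

```javascript
// Hypothetical helper for illustration only; not part of any spec or proposal.
function sameAfterCoercion(urlString) {
  const asObject = new URL(urlString);
  // String(urlObject) calls URL.prototype.toString(), which returns the href.
  const coerced = String(asObject);
  // Re-parsing the coerced string yields the same serialization.
  return coerced === asObject.href && new URL(coerced).href === asObject.href;
}

console.log(sameAfterCoercion("https://example.com/a/b?x=1#frag")); // true
```

A string-only signature therefore does not lose anything for callers who already hold a URL object.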

marcoscaceres commented 9 years ago

So, instead of "subsumes", it might be easier to talk about some URL being "within scope" of some other URL. Testing Chrome's service worker implementation, if URL B's .pathname starts with URL A's .pathname, it is "within scope":

So:

// The scope algorithm could be:
URL.inScope = function (a, b) {
  if (a === undefined || b === undefined) {
    return false;
  }
  // Either constructor can throw a TypeError if its input fails to parse.
  var A = new URL(String(a), window.location);
  var B = new URL(String(b), window.location);
  // Same origin, and B's path begins with A's path (plain string prefix).
  return A.origin === B.origin && B.pathname.startsWith(A.pathname);
};

// Given `a` as defining the scope
var a = "/t";

// These are all in scope of `a`
["/test-test/t",
 "/test",
 "/test-test",
 "/test-test/test"].every(i => URL.inScope(a, i)); // true

// These are all out of scope. The relative entries resolve against
// window.location, so this assumes the current page's path does not
// itself start with "/t".
["z",
 undefined,
 "z-test",
 "xyz123"].every(i => !URL.inScope(a, i)); // true

marcoscaceres commented 9 years ago

I'll just note that there is a bit of weirdness that a scope of "/f" means that anything that starts with "/f" is in scope... that might catch people out who are expecting scope to work like a directory on a file system. It's easy to fix, of course, by just adding "/" on the end of whatever is being scoped.
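
The effect of the trailing "/" described above can be sketched as follows. This assumes a hypothetical standalone `inScope` helper that takes an explicit `base` argument in place of `window.location`, plus an example.com origin; both are illustrative and not from the thread:

```javascript
// Hypothetical helper: same-origin check plus string-prefix match on pathname.
// `base` is an assumed parameter standing in for the page's location.
function inScope(scope, target, base) {
  const a = new URL(String(scope), base);
  const b = new URL(String(target), base);
  return a.origin === b.origin && b.pathname.startsWith(a.pathname);
}

const base = "https://example.com/";
console.log(inScope("/f", "/foo/bar", base));  // true: "/foo/bar" starts with "/f"
console.log(inScope("/f/", "/foo/bar", base)); // false: the trailing "/" rules out "/foo"
console.log(inScope("/f/", "/f/bar", base));   // true: still matches inside "/f/"
```

One side effect of this fix is that "/f" itself (without the trailing slash) is no longer in scope of "/f/".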

rubys commented 9 years ago

Path is defined as an array of strings: https://url.spec.whatwg.org/#concept-url-path

marcoscaceres commented 9 years ago

Ok, so the question then becomes whether we want to do segment comparisons (i.e., "starts with the same segments") rather than string comparison. That would address the issue above, but would mean that Chrome's, and possibly Moz's, implementation would be wrong.
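
For comparison, a strict "starts with the same segments" check might look like the sketch below. The helper names `segments` and `inScopeBySegments`, the `base` parameter, and the treatment of a trailing empty segment are all assumptions made for illustration; the thread does not settle any of them:

```javascript
// Hypothetical helper: split a pathname into segments, e.g. "/a/b" -> ["a", "b"].
function segments(pathname) {
  return pathname.split("/").slice(1);
}

// Hypothetical segment-wise scope check; `base` stands in for the page's location.
function inScopeBySegments(scope, target, base) {
  const a = new URL(String(scope), base);
  const b = new URL(String(target), base);
  if (a.origin !== b.origin) return false;
  const scopeSegs = segments(a.pathname);
  const targetSegs = segments(b.pathname);
  // Assumption: a trailing empty segment (from a trailing "/") is ignored.
  if (scopeSegs[scopeSegs.length - 1] === "") scopeSegs.pop();
  // Every scope segment must equal the target segment at the same position.
  return scopeSegs.every((seg, i) => targetSegs[i] === seg);
}

const base = "https://example.com/";
console.log(inScopeBySegments("/f", "/foo", base));   // false: "foo" is not the segment "f"
console.log(inScopeBySegments("/f", "/f/bar", base)); // true: first segment matches
```

Under this rule a scope of "/f" behaves like a directory, which is exactly where it diverges from the string-prefix behavior observed in Chrome.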

rubys commented 9 years ago

The spec should be defined in terms of segment comparisons; but that doesn't necessarily make implementations that do string comparisons wrong. See the third paragraph here: https://url.spec.whatwg.org/#conformance

marcoscaceres commented 9 years ago

Sure, but what I'm still unsure about is if path "foo" is in scope of "f".

It would not be if we did segment comparison, unless it's specified as: B begins with the same segments as A, and the segment of B at the position of A's last segment either string-matches A's last segment OR starts with the characters of A's last segment.

If we do purely segment comparison (B starts with the same segments as A), it would make the implementations incorrect.

To be sure: I don't know which of the two above is correct. That's what I'm hoping we can reach consensus on and add to the spec.
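
The first option (exact match on the leading segments, prefix match on the last scope segment) can be sketched as below. The names `segments` and `lastSegmentPrefixMatch` are hypothetical, and the inputs are assumed to be pathnames that start with "/":

```javascript
// Hypothetical helper: split a pathname into segments, e.g. "/a/b" -> ["a", "b"].
function segments(pathname) {
  return pathname.split("/").slice(1);
}

// Hypothetical hybrid rule: all scope segments but the last must match exactly;
// the last scope segment only needs to be a prefix of the target's segment.
function lastSegmentPrefixMatch(scopePath, targetPath) {
  const scopeSegs = segments(scopePath);
  const targetSegs = segments(targetPath);
  if (targetSegs.length < scopeSegs.length) return false;
  const last = scopeSegs.length - 1;
  return scopeSegs.every((seg, i) =>
    i < last ? targetSegs[i] === seg : targetSegs[i].startsWith(seg)
  );
}

console.log(lastSegmentPrefixMatch("/f", "/foo"));       // true
console.log(lastSegmentPrefixMatch("/a/f", "/ab/foo"));  // false
```

Because a segment never contains "/", this hybrid rule should give the same answers as a plain string-prefix test on the pathname, so the practical choice is between that behavior and the strict segment rule.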

jakearchibald commented 9 years ago

/foo/ is in scope of /f. As long as it matches the start of the pathname, it matches.

marcoscaceres commented 9 years ago

Thanks @jakearchibald! :heart:

rubys commented 9 years ago

In which case, the definition should make use of https://url.spec.whatwg.org/#dom-urlutils-pathname

kenchris commented 9 years ago

inScope -> withinScopeOf?