qt4cg / qtspecs

QT4 specifications
https://qt4cg.org/

Function for URI relativization #269

Open namedgraph opened 1 year ago

namedgraph commented 1 year ago

Signature:

relativize($uri as xs:anyURI, $base as xs:anyURI) as xs:anyURI

Example: URI::relativize in Java

dnovatchev commented 1 year ago

One possible pure XPath 3.0 implementation:

let $pUrl := "file:/folder1/file2.txt",
    $pBase := "file:/folder1/folder2/file1.txt",
    $urlSegments := tokenize($pUrl, '/'),
    $baseSegments := tokenize($pBase, '/'),
    $idiff := (for $ind in 1 to max((count($urlSegments), count($baseSegments)))
                return $ind[$urlSegments[$ind] ne $baseSegments[$ind]]
             ) [1]
  return
   string-join(
                ($baseSegments[position() lt $idiff] ! '..',
                 $urlSegments[position() ge $idiff] )
               , '/')

And this produces the wanted, correct result:

../../file2.txt

yamahito commented 1 year ago

@dnovatchev Shouldn't the desired result be ../file2.txt?

namedgraph commented 1 year ago

Indeed something's off here.

System.out.println(URI.create("file:/folder1/folder2/file1.txt").resolve("../../file2.txt").toString());

Output:

file:/file2.txt
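
The same check can be run against both candidate answers. This is a minimal sketch that uses only `java.net.URI` from the JDK; the class name `ResolveCheck` and its helper method are made up for illustration.

```java
import java.net.URI;

public class ResolveCheck {
    // Resolve a relative reference against a base URI and return the result as a string.
    public static String resolve(String base, String relative) {
        return URI.create(base).resolve(relative).toString();
    }

    public static void main(String[] args) {
        String base = "file:/folder1/folder2/file1.txt";
        // One ".." climbs out of folder2 and lands in folder1.
        System.out.println(resolve(base, "../file2.txt"));    // file:/folder1/file2.txt
        // Two ".." climb out of folder1 as well.
        System.out.println(resolve(base, "../../file2.txt")); // file:/file2.txt
    }
}
```

Only `../file2.txt` resolves back to `file:/folder1/file2.txt`.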

dnovatchev commented 1 year ago

Sorry, we need a set of examples of the pairs of URLs and the wanted results.

I am not 100% sure, but I think the following code may provide the wanted fix:

let $pUrl := "file:/folder1/file2.txt",
    $pBase := "file:/folder1/folder2/file1.txt",
    $urlSegments := tokenize($pUrl, '/'),
    $baseSegments := tokenize($pBase, '/'),
    $idiff := (for $ind in 1 to max((count($urlSegments), count($baseSegments)))
                return $ind[$urlSegments[$ind] ne $baseSegments[$ind]]
             ) [1]
  return
  ($idiff,
   string-join(
                ((1 to count($baseSegments) - count($urlSegments)) ! '..',
                 $urlSegments[position() ge $idiff])
               , '/')
             )

It produces the wanted result:

../file2.txt

If we change the inputs to:

let $pUrl := "file:/folder1/file2.txt",
    $pBase := "file:/folder1/folder2/folder3/file1.txt",

then again the wanted result is produced:

../../file2.txt

For me the confusion stems from the name of the function and I am not sure what "relativize" actually means.

As for why file:/ is used instead of file:// - again don't ask me. This is how the original problem in SO was formulated.

I would be glad to work on solving a correctly-formulated problem, if anyone could do this.

namedgraph commented 1 year ago

@dnovatchev XPath F&O refers to RFC 3986 for URIs and RFC 3987 for IRIs. There's a section on Relative Resolution.

dnovatchev commented 1 year ago

> @dnovatchev XPath F&O refers to RFC 3986 for URIs and RFC 3987 for IRIs. There's a section on Relative Resolution.

@namedgraph This is not how we define functions in the F & O Specification.

The reader must be presented with a concise and fully understandable description of the function, its goal and semantics and we want to give them at least some examples that would demonstrate what the function is expected to do.

We should not send them to some other spec, which they may, or more likely will, not read.

From my limited understanding of this situation, the SO OP probably did not mean the same thing that you want here, and I honestly do not understand what problem is being addressed, or what the goal and semantics of the proposed function are.

If this issue is presented for voting in its current form, I will definitely vote against including it in the F & O functions, for the reasons above.

namedgraph commented 1 year ago

@dnovatchev resolve-uri operates on URIs in a similar way, so I'm assuming this function could be defined along the same lines.

dnovatchev commented 1 year ago

> @dnovatchev resolve-uri operates on URIs in a similar way, so I'm assuming this function could be defined along the same lines.

@namedgraph Then please, define it.

michaelhkay commented 1 year ago

Let's try to be constructive.

As far as I can see, the Java specification of URI.relativize() is a perfectly workable baseline:

public URI relativize(URI uri)

Relativizes the given URI against this URI. The relativization of the given URI against this URI is computed as follows:

If either this URI or the given URI are opaque, or if the scheme and authority components of the two URIs are not identical, or if the path of this URI is not a prefix of the path of the given URI, then the given URI is returned.

Otherwise a new relative hierarchical URI is constructed with query and fragment components taken from the given URI and with a path component computed by removing this URI's path from the beginning of the given URI's path.

We can have discussions about the detail (for example, should the URIs first be normalized?), but there's clearly a spec here that could be developed for a function that would be useful to some people.

My first criticism of the Java spec is that it doesn't say what "relativizing" a URI actually is for. My understanding is that the result of relativizing a URI H against a base URI B is to deliver a relative URI reference R such that resolve-uri(R, B) returns H, or at least something equivalent to H.

User feedback on the Java function suggests it's not quite flexible enough, for example one might expect that relativizing http://example.com/xx/index.html against http://example.com/xx/content.html would produce index.html, but that doesn't appear to be the case.
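
The prefix rule can be observed directly. The following is a small sketch rather than a specification; the class name `RelativizeDemo` and its wrapper method are hypothetical, and the only library call is `java.net.URI.relativize`.

```java
import java.net.URI;

public class RelativizeDemo {
    // Thin wrapper: relativize the target URI against the base URI using the JDK method.
    public static String relativize(String base, String target) {
        return URI.create(base).relativize(URI.create(target)).toString();
    }

    public static void main(String[] args) {
        // The base path is a prefix of the target path, so a relative reference comes back.
        System.out.println(relativize("http://example.com/xx/",
                                      "http://example.com/xx/index.html"));
        // Sibling documents: the base path is not a prefix, so the target is returned unchanged.
        System.out.println(relativize("http://example.com/xx/content.html",
                                      "http://example.com/xx/index.html"));
    }
}
```

The first call prints `index.html`; the second prints the full `http://example.com/xx/index.html`.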

Note that for functions like this (resolve-uri is another example) it's best not to give a reference implementation, because it's very likely to be wrong in edge cases. It's much better to state the intent, to provide constraints on the result (postconditions) and to give a good range of examples.

Another implementation one could look to for inspiration is at https://github.com/thephpleague/uri-manipulations/blob/master/src/Modifiers/Relativize.php . There doesn't appear to be any attempt at a specification of what this code actually does.

dnovatchev commented 1 year ago

> Another implementation one could look to for inspiration is at https://github.com/thephpleague/uri-manipulations/blob/master/src/Modifiers/Relativize.php . There doesn't appear to be any attempt at a specification of what this code actually does.

@michaelhkay Thank you for trying to explain this. It is helpful.

Maybe we need to establish a new namespace then: "http://www.w3.org/2005/xpath-functions/what-they-do/surprise"

I would greatly enjoy knowing what other people think about this.

namedgraph commented 1 year ago

Does this help?

Relativization, finally, is the inverse of resolution: For any two normalized URIs u and v, u.relativize(u.resolve(v)).equals(v) and u.resolve(u.relativize(v)).equals(v) . This operation is often useful when constructing a document containing URIs that must be made relative to the base URI of the document wherever possible. For example, relativizing the URI http://java.sun.com/j2se/1.3/docs/guide/index.html against the base URI http://java.sun.com/j2se/1.3 yields the relative URI docs/guide/index.html.

ChristianGruen commented 1 year ago

> Maybe we need to establish a new namespace then: "http://www.w3.org/2005/xpath-functions/what-they-do/surprise"
>
> I would greatly enjoy knowing what other people think about this.

I believe a function for relativizing URIs would be helpful.

Someone certainly needs to invest time to formalize and finalize the gory or glorious details. That applies to all other proposals in this repository as well, though, and we should have a separate look at a) the general usefulness of a proposal and b) the state of progress.

dnovatchev commented 1 year ago

> Does this help?
>
> Relativization, finally, is the inverse of resolution: For any two normalized URIs u and v, u.relativize(u.resolve(v)).equals(v) and u.resolve(u.relativize(v)).equals(v) . This operation is often useful when constructing a document containing URIs that must be made relative to the base URI of the document wherever possible. For example, relativizing the URI http://java.sun.com/j2se/1.3/docs/guide/index.html against the base URI http://java.sun.com/j2se/1.3 yields the relative URI docs/guide/index.html.

We could discuss this once we have the full proposal specified, as with any other proposal being discussed in this CG.

namedgraph commented 1 year ago

OK, I realize now that with Java's URI::relativize you wouldn't be able to create paths that start with .. because one URI has to be a prefix of the other:

> if the path of this URI is not a prefix of the path of the given URI, then the given URI is returned.

michaelhkay commented 1 year ago

This equivalence

> For any two normalized URIs u and v,
>
> u.relativize(u.resolve(v)).equals(v) and
> u.resolve(u.relativize(v)).equals(v).

is a useful property, but it doesn't fully specify the function. For example, it can be achieved trivially by having B.relativize(U) return U.

A more useful specification might be that given absolute URIs R and B, fn:relativize(R, B) returns the shortest possible string S such that fn:resolve-uri(S, B) returns R. But I suspect that formulation still needs further work.
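
To make the postcondition concrete, here is a rough Java sketch of a path-based relativization that can emit `..` segments and is checked by resolving the result against the base. The class `RelativizeSketch` and its method are hypothetical, the argument order follows the signature proposed at the top of this issue, and the code does not try to prove that the result is the shortest possible string. Edge cases such as a first path segment containing a colon are not handled.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

public class RelativizeSketch {
    // Hypothetical helper: returns a relative reference R such that base.resolve(R)
    // is expected to equal target, or target itself when no relative form applies.
    public static URI relativize(URI target, URI base) {
        if (target.isOpaque() || base.isOpaque()
                || !Objects.equals(target.getScheme(), base.getScheme())
                || !Objects.equals(target.getRawAuthority(), base.getRawAuthority())) {
            return target;
        }
        // Split into segments; the last element is the "file" part (possibly empty).
        String[] t = target.normalize().getRawPath().split("/", -1);
        String[] b = base.normalize().getRawPath().split("/", -1);
        int tDirs = t.length - 1;
        int bDirs = b.length - 1;

        // Count the leading directory segments shared by both paths.
        int common = 0;
        while (common < tDirs && common < bDirs && t[common].equals(b[common])) {
            common++;
        }

        List<String> parts = new ArrayList<>();
        for (int i = common; i < bDirs; i++) {
            parts.add("..");                    // climb out of each unshared base directory
        }
        for (int i = common; i < t.length; i++) {
            parts.add(t[i]);                    // descend into the target's remaining segments
        }
        String ref = String.join("/", parts);
        if (ref.isEmpty()) {
            ref = "./";                         // an empty reference would resolve to the base itself
        }
        if (target.getRawQuery() != null) {
            ref += "?" + target.getRawQuery();
        }
        if (target.getRawFragment() != null) {
            ref += "#" + target.getRawFragment();
        }
        return URI.create(ref);
    }

    public static void main(String[] args) {
        URI base = URI.create("file:/folder1/folder2/file1.txt");
        URI target = URI.create("file:/folder1/file2.txt");
        URI rel = relativize(target, base);
        System.out.println(rel);                               // ../file2.txt
        System.out.println(base.resolve(rel).equals(target));  // true
    }
}
```

Checking `base.resolve(rel).equals(target)` for each example pair is one way to turn the stated postcondition into a test, independent of how the relative reference was computed.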

michaelhkay commented 1 year ago

On .NET the method is Uri.MakeRelativeUri, and the specification says:

If the hostname and scheme of this URI instance and uri are the same, then this method returns a relative Uri that, when appended to the current URI instance, yields uri. If the hostname or scheme is different, then this method returns a Uri that represents the uri parameter.