alps-io / spec

ALPS Specification documents.

Proposal: consider Internationalized Resource Identifier (IRI) #92

Closed: filip26 closed this issue 3 years ago

filip26 commented 3 years ago

IRIs are supported by all modern browsers. It would be great to reflect the current state and consider replacing URIs with IRIs in the next version of the specification.

mamund commented 3 years ago

what's the advantage of IRIs over URIs/URLs?

also, would this be a breaking change or just an additional support?

i think it is important to note in the spec where strings can be dereferenced (URL) and where strings are for identification only (URI). this greatly eases parsing at runtime for machines. how does IRI fit into this?

maybe a couple examples?

filip26 commented 3 years ago

Internationalized Resource Identifiers (IRI) are complementary to URIs:

"A mapping from IRIs to URIs is defined, which means that IRIs can be used
instead of URIs, where appropriate, to identify resources.
The approach of defining a new protocol element was chosen
instead of extending or changing the definition of URIs." [1]

and

"IRIs are meant to replace URIs in identifying resources for protocols, 
formats, and software components that use a UCS-based character 
repertoire." [2]

Allowing IRIs instead of URIs does not prevent existing ALPS users from continuing to use URIs, but it lets other users use 'URIs with non-ASCII characters'.

Please note that a scheme is not required wherever the RFC's IRI-reference is used, because IRI-reference also covers relative references. [3]
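To make the IRI-to-URI mapping quoted above more concrete, here is a minimal Python sketch using only the standard library. It is an illustration under stated assumptions, not the RFC's full algorithm: the function name `iri_to_uri` is hypothetical, the host is converted with IDNA (the RFC also allows percent-encoding it), Unicode normalization is skipped, and IPv6 literals are not handled.

```python
from urllib.parse import quote, urlsplit, urlunsplit

# Characters left untouched when percent-encoding. "%" is kept so that
# sequences that are already percent-encoded are not encoded twice.
_SAFE = "/:@!$&'()*+,;=%"


def iri_to_uri(iri: str) -> str:
    """Map an IRI to an ASCII-only URI (simplified, hypothetical helper)."""
    parts = urlsplit(iri)
    netloc = parts.netloc
    if parts.hostname:
        # Assumption: convert the host with IDNA instead of percent-encoding.
        userinfo, at, _ = parts.netloc.rpartition("@")
        host = parts.hostname.encode("idna").decode("ascii")
        port = f":{parts.port}" if parts.port is not None else ""
        netloc = quote(userinfo, safe=_SAFE) + at + host + port
    # Non-ASCII characters elsewhere become percent-encoded UTF-8 octets.
    path = quote(parts.path, safe=_SAFE)
    query = quote(parts.query, safe=_SAFE + "?")
    fragment = quote(parts.fragment, safe=_SAFE + "?")
    return urlunsplit((parts.scheme, netloc, path, query, fragment))


if __name__ == "__main__":
    for iri in ("https://example.com/caf\u00e9", "urn:isbn:0451450523"):
        print(iri, "->", iri_to_uri(iri))
```

A plain ASCII URI passes through unchanged, which is why existing documents would keep working if IRIs were allowed.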

References

  1. RFC-3987 - Abstract
  2. RFC-3987 - Relationship between IRIs and URIs
  3. RFC-3987 - ABNF for IRI References and IRIs

filip26 commented 3 years ago

Regarding URLs and dereferencing: I would leave it to the client to decide what to process and what not, following the rule 'ignore what you do not understand / what you cannot process'.

all examples below should be valid:

<descriptor href="http://ヒキワリ.ナットウ.ニホン"/>
<descriptor href="mailto:user@example.com"/>
<descriptor href="ftp://user@example.com/file.tgz"/>
<descriptor href="urn:isbn:0451450523"/>
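As a rough illustration of the 'ignore what you cannot process' rule, a client could look at each `href` scheme and only attempt to fetch the ones it supports. The sketch below assumes a hypothetical client that can fetch `http`, `https`, and `ftp`, and that treats scheme-less values as relative references; the `<alps>` wrapper element and the names `FETCHABLE_SCHEMES`, `is_fetchable`, and `classify` are made up for the example.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit

# Hypothetical: the schemes this particular client knows how to fetch.
FETCHABLE_SCHEMES = {"http", "https", "ftp"}

# The four descriptors from the comment, wrapped in a root element so the
# fragment parses as a single XML document.
ALPS_FRAGMENT = """
<alps>
  <descriptor href="http://ヒキワリ.ナットウ.ニホン"/>
  <descriptor href="mailto:user@example.com"/>
  <descriptor href="ftp://user@example.com/file.tgz"/>
  <descriptor href="urn:isbn:0451450523"/>
</alps>
"""


def is_fetchable(href: str) -> bool:
    """Return True if this client would try to dereference the href."""
    scheme = urlsplit(href).scheme.lower()
    # Assumption: no scheme means a relative reference, which the client
    # resolves against the URL of the current document.
    return scheme == "" or scheme in FETCHABLE_SCHEMES


def classify(xml_text: str) -> list[tuple[str, bool]]:
    """Pair every descriptor href with the client's fetch decision."""
    root = ET.fromstring(xml_text.strip())
    return [(d.get("href"), is_fetchable(d.get("href")))
            for d in root.iter("descriptor")]


if __name__ == "__main__":
    for href, ok in classify(ALPS_FRAGMENT):
        print("fetch " if ok else "ignore", href)
```

All four values stay valid in the document; the client simply skips the ones it cannot process.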

mamund commented 3 years ago

@filip26 @koriym

sounds fine. would either of you like to write this up as a pull request for the spec? i'm not quite up to speed on the IRI details and would be happy if someone else was able to write the mod for the spec. even a draft that I can use as a guide in the editing would be helpful.

thanks.

filip26 commented 3 years ago

here is an example of how JSON-LD defines IRI usage. Replace 'JSON-LD node' with 'ALPS element'.

3.2. IRIs

mamund commented 3 years ago

@filip26

i note this from the JSON-LD document on IRIs...

IRIs can often be confused with URLs (Uniform Resource Locators), the primary distinction is that a URL locates a resource on the web, an IRI identifies a resource. While it is a good practice for resource identifiers to be dereferenceable, sometimes this is not practical.

This makes me think that IRIs are meant as identifiers and not as locators, right? esp. since IRIs support non-locator schemes ('urn' for one), it seems they are not appropriate for all cases.

what do you think?

filip26 commented 3 years ago

yes, IRIs are identifiers meant to replace URIs. URLs are a subset of URIs and URIs are a subset of IRIs.

IRIs/URIs are meant to denote things, and are therefore great for expressing semantic information.

e.g.

<descriptor id="fname" href="https://schema.org/familyName"/>

but they could be confusing when you just want to pass a location from which to get more information

e.g.

<descriptor type="safe" href="/inherited-alps-document#descriptor"/>

I'm thinking that this could be solvable by making a distinction between semantic definition (IRI/URI) and 'a link to somewhere' (URL).

e.g.

<descriptor id="fname" def="https://schema.org/familyName"/> 

<descriptor type="safe" href="something.alps"/>

where def: IRI and href: URL
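A small Python sketch of how a consumer might treat the two attributes differently, assuming the proposed `def` attribute were adopted: `def` is kept as an opaque identifier and never fetched, while `href` is resolved against the URL of the document it appears in. The base URL and the `read_descriptor` helper are hypothetical.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

# Hypothetical URL of the ALPS document that contains the descriptors.
BASE_URL = "https://example.org/profiles/contact.alps"


def read_descriptor(xml_text: str, base_url: str = BASE_URL) -> dict:
    """Read one descriptor, keeping 'def' opaque and resolving 'href'."""
    el = ET.fromstring(xml_text)
    href = el.get("href")
    return {
        "id": el.get("id"),
        # 'def' is an identifier (IRI): compare it as a string, never fetch it.
        "def": el.get("def"),
        # 'href' is a locator (URL): resolve it so it can be dereferenced.
        "href": urljoin(base_url, href) if href else None,
    }


if __name__ == "__main__":
    print(read_descriptor(
        '<descriptor id="fname" def="https://schema.org/familyName"/>'))
    print(read_descriptor('<descriptor type="safe" href="something.alps"/>'))
```

With this split, a parser knows from the attribute name alone whether a network request is ever expected.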

mamund commented 3 years ago

yes. this makes total sense to me: URLs for 'href' and IRIs for 'def'.

this also loops back to @koriym's comments about 'def'

i need to work up a portion of text for 'def' and that's when the IRIs will come in.

thanks.

filip26 commented 3 years ago

an example: Web of Things defines IDs as URIs, and its examples use a URN to identify a thing.

Identifier of the Thing in form of a URI [RFC3986] (e.g., stable URI, temporary and mutable URI, URI with local IP address, URN, etc.).

Thing Description

{
   "id": "urn:dev:ops:32473-WoTLamp-1234",
   "title": "MyLampThing"
}