karlcow / webarch

How to explain in simple terms Web Architecture and HTTP for UX, marketing, front-end developers, etc.

What is a URI Persistence Policy? #5

Open karlcow opened 13 years ago

karlcow commented 13 years ago

Stolen from the BBC's Nature site team, whose development manifesto contains:

Persistence: only mint a new URI if one doesn’t already exist; once minted, never delete it

A URI persistence policy is as much a promise to the users of your Web site as a commitment that shapes how you design your infrastructure. It creates a design constraint that makes you think about the value of each URI you create and how you will manage its future. Think about the ruin your Web site will eventually become.

An example of a URI persistence policy can also be found on the W3C Web site.

olivierthereaux commented 13 years ago

One important part of a URI persistence (persistency) policy should be about the use of the 410 Gone HTTP status code.

Assuming that whoever is in charge of the URI space (IA, developer, team) has access to 410 Gone at all, there does not seem to be any agreement, at least in the web discussions I have witnessed, on whether the 410 status should be used and, if so, for how long.

This raises the question of how 410 responses (and redirects, etc.) are managed.

As an anecdote, my personal site has a set of redirection directives in its configuration file that have been active for almost 10 years, arguably longer than any cache has stored, or any old page has linked to, the resources at their "old" URIs. I can keep the redirects available indefinitely because the configuration file is easy to preserve (a flat file, as opposed to other, more transient, storage).
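A flat file of redirects and 410s of the kind described above might be handled along these lines. This is only a sketch: the file format (one rule per line: old path, status, optional new path), the example paths, and the function names `parse_rules` and `resolve` are assumptions made up for illustration, not the actual configuration of any site.

```python
# Hypothetical flat-file format: "<old-path> <status> [<new-path>]"
# The paths below are invented examples.
RULES_TEXT = """\
# old path        status  new location
/2004/old-post    301     /blog/old-post
/tmp/draft        410
"""


def parse_rules(text):
    """Parse the flat file into {old_path: (status, target_or_None)}."""
    rules = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split()
        old_path, status = parts[0], int(parts[1])
        target = parts[2] if len(parts) > 2 else None
        if status in (301, 308) and target is None:
            raise ValueError(f"redirect for {old_path} needs a target")
        rules[old_path] = (status, target)
    return rules


def resolve(path, rules, live_paths):
    """Return (status, location) for a requested path."""
    if path in live_paths:
        return 200, None   # the resource still lives at this URI
    if path in rules:
        return rules[path]  # 301 with a target, or 410 with None
    return 404, None        # no record that this URI was ever minted


rules = parse_rules(RULES_TEXT)
print(resolve("/2004/old-post", rules, {"/blog/old-post"}))
```

Keeping the rules in a plain text file means they can travel with the site from one server setup to the next, which is what makes holding on to them indefinitely cheap.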

olivierthereaux commented 13 years ago

Another major part of the URI persistence policy concerns redirects. Questions similar to the ones in the comment above apply.

A side question: once a resource has been given a URI and that URI advertised, other resources or applications on the web may be referencing that URI. Should there be a differentiation between links to a given URI from the open web (thus, in theory crawlable) and links from intranets and other walled gardens? Should a link from the non-open web be given the same importance as links from the open web?