Open karlcow opened 13 years ago
One important part of a URI persistence policy should be about the use of the 410 Gone HTTP status code.
Assuming that the individual(s) (IA, developer, team) in charge of the URI space are able to send a 410 Gone response at all, there does not seem to be any agreement, at least according to the web discussions I have witnessed, on whether the 410 status should be used, and if so, for how long.
This raises the question of how 410 responses (and redirects, etc.) are managed.
As an anecdote, my personal site has a set of redirection directives in its configuration file that have been active for almost 10 years, arguably longer than any cache or old page is likely to have stored or linked to the resources at their "old" URIs. I can keep the redirects available indefinitely because the configuration file is easy to preserve (a flat file, as opposed to other, more transient, storage).
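To make the idea of a flat-file policy concrete, here is a minimal sketch in Python. The file format (`<old-path> <status> [<new-path>]`), the example paths, and the function names `parse_policy` and `resolve` are all assumptions made for illustration; they are not taken from any particular server's configuration syntax.

```python
# Hypothetical flat-file persistence policy: one rule per line,
# "<old-path> <status> [<new-path>]". Paths below are made up.
POLICY_TEXT = """\
# moved permanently
/2003/old-post 301 /blog/old-post
# deliberately removed
/2004/draft 410
"""


def parse_policy(text):
    """Parse the flat file into {old_path: (status, target_or_None)}."""
    rules = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split()
        status = int(parts[1])
        target = parts[2] if len(parts) > 2 else None
        rules[parts[0]] = (status, target)
    return rules


def resolve(rules, path):
    """Return (status, headers) for a retired path, or None if no rule applies."""
    if path not in rules:
        return None  # no persistence rule: let the site serve the path normally
    status, target = rules[path]
    if status == 410:
        return 410, {}  # Gone: the resource was removed on purpose
    return status, {"Location": target}  # redirect to the new URI


rules = parse_policy(POLICY_TEXT)
print(resolve(rules, "/2003/old-post"))
print(resolve(rules, "/2004/draft"))
```

Because the rules live in a plain text file, they can be versioned and carried across server migrations, which is what makes keeping them for a decade or more practical.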
Another major part of the URI persistence policy concerns redirects. Questions similar to those in the comment above apply: should redirects be kept in place, and if so, for how long?
A side question: once a resource has been given a URI and that URI has been advertised, other resources or applications on the web may reference it. Should links to a given URI from the open web (thus, in theory, crawlable) be treated differently from links from intranets and other walled gardens? Should a link from the non-open web be given the same importance as a link from the open web?
Stolen from the BBC's Nature Site Team, their development manifesto contains
A URI persistence policy is a statement of trust toward the users of your Web site as much as a commitment that shapes the design of your infrastructure. It creates a design constraint that helps you think about the value of each URI you create and how you will manage its future. Think about the ruin your Web site will eventually become.
An example of a URI persistence policy can also be found on the W3C Web site.