ietf-tools / bibxml-service

Django-based Web service implementing IETF BibXML APIs
https://bib.ietf.org
BSD 3-Clause "New" or "Revised" License

Bibliographic data persistence in context of BibXML service use by authors & external tooling #83

Closed: strogonoff closed this issue 2 years ago

strogonoff commented 2 years ago

This question arose from https://github.com/ietf-ribose/relaton-data-w3c/issues/2#issuecomment-1014002413.

With the old xml2rfc tools website, once someone places a file, it’s there forever. Now that the situation is more fluid and the available bibliographic data is more extensive, is it acceptable for an entry to vanish? (Either permanently in some circumstances, or temporarily, e.g. due to indexing conditions.)

Obviously it would be undesirable, the question is more whether it’ll outright break workflows/future tooling, or they won’t be expected to rely on a once-cited bibliographic item to remain available indefinitely under the same ID.

(This does not apply to legacy xml2rfc-style paths, which must return a response in any case.)

It’s not a pressing question, but it can have implications for the architecture.

cc @ronaldtse

ronaldtse commented 2 years ago

> With the old xml2rfc tools website, once someone places a file, it’s there forever.

That's not a guarantee though. Similarly, we can say the opposite -- someone deletes a file, it's gone forever. The xml2rfc tools website does not promise permanent persistence.

There should not be any obligation for certain data to always be available at a certain path. The legacy paths are only intended to facilitate the transition. After the transition, it becomes a question of the data management process (or the "register management process", in the words of ISO 19135).

> Obviously it would be undesirable, the question is more whether it’ll outright break workflows/future tooling, or they won’t be expected to rely on a once-cited bibliographic item to remain available indefinitely under the same ID.

I feel that this expectation rests on a false premise. There is no guarantee when Google removes a search result, or when a library removes a book. Even in a PURL system like DOI, which is an international standard, this occurs -- the identifier and the content can both change or disappear. A PURL system is also subject to data management practices.

In the same line of thought, tools should not expect an external file to remain present forever.
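
To make that concrete, here is a minimal sketch of how a consuming tool might cope with an entry that has disappeared, by falling back to a copy it saved earlier. Everything in it is hypothetical and for illustration only: `resolve_reference`, `ReferenceUnavailable`, the `fetch` callable and the `cache` mapping are assumed names, not part of bibxml-service or xml2rfc, and it assumes the fetcher signals a missing entry by raising `LookupError`.

```python
from typing import Callable, Dict


class ReferenceUnavailable(Exception):
    """Raised when an item is neither retrievable remotely nor cached locally."""


def resolve_reference(
    ref_id: str,
    fetch: Callable[[str], str],
    cache: Dict[str, str],
) -> str:
    """Return the XML for ref_id, preferring the remote source.

    `fetch` is a hypothetical callable that returns the XML string for an
    ID, or raises LookupError if the service no longer has that entry.
    `cache` is any local mapping of ID -> previously retrieved XML.
    """
    try:
        xml = fetch(ref_id)
    except LookupError:
        # The entry vanished (or is temporarily unindexed):
        # fall back to the copy we saved when it was last available.
        if ref_id in cache:
            return cache[ref_id]
        raise ReferenceUnavailable(ref_id)
    # Remember the latest successful response for future fallbacks.
    cache[ref_id] = xml
    return xml
```

With something like this, a document that cited an item once can still be rebuilt after the item vanishes upstream, without the service having to promise permanence.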

Also seeking @rjsparks's views.

rjsparks commented 2 years ago

@ronaldtse - You are touching on a hot topic for several major IETF contributors: "Cool URIs don't change" - https://www.w3.org/Provider/Style/URI.

Having a reference that once existed cease to exist should be an exceptional case - it should be of the nature "that was really broken, was spam, or had some other reason to actively remove it" rather than "that's old and it became inconvenient to keep it alive".

For references outside the RFC and Internet-Draft sources - if another SDO or organization changes its naming format and doesn't provide backwards compatibility, I don't think it becomes a requirement for us to continue to serve their old format for them. @TonyLHansen - check me on this.

The legacy things that will break (using the legacy URL endpoints) are the source files for older documents that used the include mechanism that ultimately relied on those URLs. Even as tooling is upgraded to use the API directly, there will be source files in the ecosystem that explicitly use the legacy URLs for a very long time.

I have an activity to complete with Tony that will probably uncover a set of hand-built bibxml-ids entries. I will need to either add those manually to the backstore or create a separate backstore for them, so that the derived set from the datatracker is well understood. In the long run, this means we may have two sources to pull from for bibxml-ids on a rebuild. To be clear, this is conversation and speculation, not a requirement for the current project at this time. But please be sure it will be easy for us to add such a thing to the source in the future.
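
As a rough illustration of what "two sources to pull from on a rebuild" could look like, here is a minimal sketch. It is speculative in the same way as the paragraph above: `rebuild_index` and the source names are hypothetical, the entries are plain ID-to-XML mappings rather than the service's real data model, and the rule that the first listed source wins on a conflicting ID is an assumption, not a decided policy.

```python
from typing import Dict, Iterable, Tuple


def rebuild_index(
    sources: Iterable[Tuple[str, Dict[str, str]]],
) -> Dict[str, Tuple[str, str]]:
    """Merge several named sources into one index of ID -> (source name, XML).

    Sources are given in priority order. Assumption: when two sources
    carry the same ID, the earlier one wins and later ones only fill gaps.
    Recording the source name keeps each entry's origin traceable.
    """
    index: Dict[str, Tuple[str, str]] = {}
    for source_name, entries in sources:
        for ref_id, xml in entries.items():
            # setdefault keeps an existing (higher-priority) entry untouched.
            index.setdefault(ref_id, (source_name, xml))
    return index


# Hypothetical data: one set derived from the datatracker, one hand-built.
derived = {"I-D.example-a": "<reference anchor='derived-a'/>"}
hand_built = {
    "I-D.example-a": "<reference anchor='manual-a'/>",
    "I-D.example-b": "<reference anchor='manual-b'/>",
}

index = rebuild_index([("datatracker", derived), ("hand-built", hand_built)])
```

Keeping the two sets as separate inputs, rather than writing hand-built entries into the derived store, is what lets the derived set stay well understood.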

strogonoff commented 2 years ago

A couple of notes regarding Robert’s points: