thoth-pub / thoth

Metadata management and dissemination system for Open Access books
https://thoth.pub
Apache License 2.0

Obtain KBART endorsement and test subscription workflow #284

Open rhigman opened 2 years ago

rhigman commented 2 years ago

Following the creation of a first-pass KBART output under #88, this card tracks the following recommendations which were originally part of #88:

Recommendation 49. COPIM must review the resources made available on KBART and seek endorsement of its files.

Recommendation 50. Once KBART files are available, COPIM should test with partner institutions that it is available as a package to ‘subscribe’ to in library discovery systems.

See #88 for further discussion/background on the implementation of the KBART output, including details of feedback from platform representatives.

rhigman commented 2 years ago

The KBART (NISO) Standing Committee have now responded to my additional queries as detailed in https://github.com/thoth-pub/thoth/issues/88#issuecomment-911647865. I asked the following:

  1. whether it would actually be possible for Thoth as a whole to have its files KBART-endorsed, given that we currently only do basic data validity checks and can't guard against user-entry inaccuracies from individual publishers (e.g. typos in titles);
  2. whether it would be acceptable to use DOIs in the title_id field, given that we are trying to be generic across a number of different publishers, who might structure their title URL links in different ways.

They responded:

  1. Because responsibility for the metadata ultimately lies with content providers, we will continue to endorse content providers rather than metadata management systems. We are happy to advise on specific questions like those you have raised without formally considering endorsement. As we revise the recommended practice, we will consider whether Thoth would be an appropriate addition to the registry (which is separate from the list of endorsed content providers) or if it might be better suited to a separate list, perhaps a new one dedicated to tools for creating or converting metadata according to KBART criteria.

  2. The title_id field should “[g]ive the proprietary identifier for the content title, if you use a title identifier to create links to content.” If a content provider does not include any kind of title-level identifier within the title URLs, we would advise them to rethink their link structure rather than create a workaround to accommodate our metadata fields. One problem with defaulting to a full DOI URL as a title_id is that multiple content providers can host the item (e.g., the original publisher as well as an aggregator), but the doi.org link will resolve only to one of those. So we do expect to continue requiring a platform-specific URL as the title_url and the title-specific component of that URL also listed as the title_id. If the DOI is used, it should be as a part of the URL path, and only that portion should be given in title_id rather than the full doi.org URL.

From discussions with Jisc, OCLC and EBSCO KnowledgeBase, none of them seem to have an issue with accepting the current format, which uses DOIs in the title_id field. (The version sent to the committee included the full DOI URL with the https://doi.org/ prefix, but the current version omits this prefix.) The forthcoming new OBP website will also use DOIs as the title-level identifier within its title URLs, so at that point its Thoth files will become compliant with the NISO recommended practice as quoted by the committee. Presumably this will also be the case for any other publishers which start using the same white-label website. So we could consider this an acceptable workaround, or we could start requesting title identifiers as a specific field within Thoth (or we could plead the case some more).
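For illustration, the difference between the version sent to the committee (full DOI URL) and the current output (bare DOI) could be sketched as below. This is only a minimal sketch, not Thoth's actual implementation: the function name `doi_to_title_id`, the list of accepted resolver prefixes, and the sample DOI are all hypothetical.

```python
# Hypothetical helper: reduce a DOI URL to the bare DOI for use as title_id.
# The set of resolver prefixes handled here is an assumption for illustration.
DOI_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
)


def doi_to_title_id(doi: str) -> str:
    """Return the DOI without any resolver prefix; bare DOIs pass through."""
    value = doi.strip()
    lowered = value.lower()
    for prefix in DOI_PREFIXES:
        if lowered.startswith(prefix):
            # Slice the original string so the case of the DOI itself is kept.
            return value[len(prefix):]
    return value


# Made-up DOI, used only as sample input.
print(doi_to_title_id("https://doi.org/10.1234/example.0001"))
```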

rhigman commented 2 years ago

KBART are currently revising their recommended practice, and may be able to add Thoth to their Registry (separate from their list of endorsed content providers) in some capacity after this.

Separately, they would be happy to progress endorsement of OBP as an individual content provider if we re-submit a sample file compliant with their specifications. We can (only) do this once the new OBP website is live.

To recap, the sticking point is that we continue to output DOIs in the title_id field, rather than outputting the product's individual link 'slug' (title-level identifier). This is because we don't store the link slug as standalone data in Thoth (we might be able to programmatically extract it from the full title URL, but this could be unreliable). Coincidentally, the new OBP website's link slugs will simply be the product's DOI. KBART does not specifically mandate use of DOIs as product link slugs.

Any other publisher using Thoth whose link slugs match product DOIs should theoretically also be able to pursue endorsement using a Thoth-generated KBART sample file, as long as their user-entered data is clean and accurate.
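Whether a given publisher's link slugs match its product DOIs could be checked mechanically before submitting a sample file. The following is a rough sketch under stated assumptions, not part of Thoth: the function name `slug_matches_doi` and the example URLs and DOI are hypothetical, and it only tests whether the title URL's path ends with the bare DOI, as the committee's advice on using a DOI within the URL path suggests.

```python
from urllib.parse import unquote, urlparse


def slug_matches_doi(title_url: str, doi: str) -> bool:
    """Return True if the path of title_url ends with the bare DOI.

    Hypothetical check: `doi` is expected without a resolver prefix,
    e.g. "10.1234/example.0001". Comparison is case-insensitive.
    """
    bare = doi.strip().lower()
    if not bare:
        return False
    # Decode percent-escapes (e.g. %2F) and drop leading/trailing slashes.
    path = unquote(urlparse(title_url).path).strip("/").lower()
    # Match the whole path, or a trailing segment boundary followed by the DOI.
    return path == bare or path.endswith("/" + bare)


# Made-up URLs and DOI, used only as sample input.
print(slug_matches_doi("https://books.example.org/books/10.1234/example.0001",
                       "10.1234/example.0001"))
print(slug_matches_doi("https://books.example.org/product/123",
                       "10.1234/example.0001"))
```

A check like this only confirms the structural match between title_url and title_id; it says nothing about whether the user-entered data itself is clean and accurate.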

rhigman commented 3 months ago

Further update from KBART on 2023-06-09:

Discussion of potential changes to our endorsement practices and the registry are still slated for later in the process of the Phase III revision, but I am inclined to say the most likely outcome is that we will continue to endorse only content providers, no matter how their metadata is processed behind the scenes. It is in the realm of possibility that the registry could include and potentially highlight other types of interested parties, but that too would come after finalizing the new revision of the recommended practice.

At time of writing, Phase III does not appear to have been completed (no updates on https://www.niso.org/standards-committees/kbart).

Individual OBP endorsement is progressing further now that the new website has been implemented (with the match between DOI and link slugs). Note that this website is the basis of the Thoth white-label publisher website, so any Thoth publisher using the white-label website should theoretically also be able to receive KBART endorsement in due course. Publishers with different website structures are likely to find that the Thoth KBART isn't formatted appropriately for endorsement.

Outstanding issues with OBP endorsement (section numbers refer to Phase II recommended practice) (see also #88, where these issues were originally outlined, along with others which have now been resolved):

tosteiner commented 3 months ago