sul-dlss / SearchWorks

SearchWorks (Stanford University Libraries)
http://searchworks.stanford.edu

Don't display "Digital collection" link when collection is not released. #3840

Open dnoneill opened 9 months ago

dnoneill commented 9 months ago

Broken links on biface digital collections

https://searchworks.stanford.edu/?search_field=search&q=biface: the links under the "Digital collection" label are broken.

jcoyne commented 9 months ago

These links appear only in the list view, specifically the "Digital collection" link you see at https://searchworks.stanford.edu/?q=biface&search_field=search&view=list

jcoyne commented 9 months ago

The problem here is that the Collection has not been released to searchworks. https://argo.stanford.edu/view/dg570gb2904

andrewjbtw commented 9 months ago

I think this is a more general problem than this one collection. That collection link probably shows up because the MODS record for the items has this:

<relatedItem type="host">
  <titleInfo>
    <title>Stanford University Archaeology Collections</title>
  </titleInfo>
  <location>
    <url>https://purl.stanford.edu/dg570gb2904</url>
  </location>
  <typeOfResource collection="yes"/>
</relatedItem>

I think SW probably generates the "digital collection" link when it receives MODS where a collection is a related resource of type "host", regardless of whether the collection has a SW record. In this case, the purl exists but the collection is not "released" so there's no collection page in SW, so the link doesn't work.

Note that the context widget that would say "this item belongs to a collection" is not appearing on the item pages like https://searchworks.stanford.edu/view/qb122dq4313 . So SW in the individual item page context is detecting when not to link to a non-existent page.
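
To make the suspected behavior concrete, here is a minimal sketch of how a collection link could be derived from the MODS alone. It is not SearchWorks' actual indexing code: the `host_collections` helper is hypothetical, the fragment is assumed to carry no XML namespace, and REXML is used only because it ships with Ruby. The point is that nothing in this data says whether the collection has been released, so a link built from it can point at a page that does not exist.

```ruby
require 'rexml/document'

# MODS fragment from this thread, closed so that it is well-formed.
MODS_FRAGMENT = <<~XML
  <relatedItem type="host">
    <titleInfo>
      <title>Stanford University Archaeology Collections</title>
    </titleInfo>
    <location>
      <url>https://purl.stanford.edu/dg570gb2904</url>
    </location>
    <typeOfResource collection="yes"/>
  </relatedItem>
XML

# Hypothetical helper: returns one hash per relatedItem that is a host
# collection (type="host" plus <typeOfResource collection="yes"/>).
def host_collections(xml)
  doc = REXML::Document.new(xml)
  hosts = REXML::XPath.match(doc, '//relatedItem').select do |item|
    type_of_resource = item.elements['typeOfResource']
    item.attributes['type'] == 'host' &&
      type_of_resource &&
      type_of_resource.attributes['collection'] == 'yes'
  end

  hosts.map do |item|
    url = (item.elements['location/url']&.text).to_s.strip
    title = (item.elements['titleInfo/title']&.text).to_s.strip
    # The druid is the last path segment of the purl URL.
    { druid: url.split('/').last, title: title }
  end
end

COLLECTIONS = host_collections(MODS_FRAGMENT)
```

The result carries a druid and a title but no release status, which would have to come from somewhere else.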

andrewjbtw commented 9 months ago

An additional piece of context: we had a similar problem with broken digital collection links for items with MARC records. In that case, the collection link was being generated off of the data in the 856 and not from the data in the MODS. The approach we took was to not insert collection druids in the 856 if the collection is not released: https://github.com/sul-dlss/dor-services-app/issues/4297

I'm hesitant to recommend the analogous approach with the public XML MODS, which would be to not publish the collection as a related resource if the collection is not released. Unlike the 856, which is only used in the SW context, the public XML can be used in other contexts (and on the purl page itself), so removing the collection info there could have unintended effects.

dnoneill commented 9 months ago

I looked at the record. It looks like the URL is malformed because the Solr document has the URL https://purl.stanford.edu/dg570gb2904. It doesn't look like there is anything at that URL but a placeholder?

"physical"=>["1 biface"], 
"url_suppl"=>["https://purl.stanford.edu/dg570gb2904"], 
"url_fulltext"=>["https://purl.stanford.edu/jc957dh9590"], 
"collection"=>["dg570gb2904"], 
"collection_with_title"=>["dg570gb2904-|-Stanford University Archaeology Collections"], 

andrewjbtw commented 9 months ago

That's the correct purl URL for the collection. There's a long backlog of description for SDR materials, and no further description has been provided for this object. But every record that is made public has a Purl, whether or not it has extensive description.

The way the SDR was designed, collections are represented as "collection objects" that consist only of metadata. The minimal metadata for a collection is a title and an identifier, which is about all this collection object has (plus the use and reproduction statement). The collection object is then used to link together items in that collection.

When both a collection object and the items in the collection are released to SearchWorks, you get both SW item pages and SW collection pages. But it's possible to release individual items to SW without releasing the collection object. In that case, you should only get SW item pages and no SW collection page (whether or not the collection Purl page exists). That's what happened here and I suspect the collection object was not released to SW because it has so little description.
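
A rough sketch of what "item pages but no collection link" could look like at display time, using the `collection_with_title` field from the Solr document quoted earlier in this thread. The `displayable_collections` helper and the idea of passing in a set of released collection druids are assumptions for illustration; how SearchWorks would actually learn which collections are released (for example, by checking whether a collection document exists in the index) is not shown here.

```ruby
require 'set'

# Hypothetical helper: keeps only the collections whose druid is in
# released_druids. Each collection_with_title value has the form
# "<druid>-|-<title>", as in the Solr document quoted in this thread.
def displayable_collections(solr_doc, released_druids)
  Array(solr_doc['collection_with_title'])
    .map { |value| value.split('-|-', 2) }
    .select { |druid, _title| released_druids.include?(druid) }
    .map { |druid, title| { druid: druid, title: title } }
end

# Fields copied from the Solr document in this thread.
ITEM_DOC = {
  'collection' => ['dg570gb2904'],
  'collection_with_title' => ['dg570gb2904-|-Stanford University Archaeology Collections']
}.freeze

# Collection not released: nothing to link to.
UNRELEASED_RESULT = displayable_collections(ITEM_DOC, Set.new)

# Collection released: the link can be rendered.
RELEASED_RESULT = displayable_collections(ITEM_DOC, Set['dg570gb2904'])
```

With an empty set the item yields no collection entries, so the list view would have nothing to render under "Digital collection"; once the druid is in the set, the same document yields the druid and title needed for the link.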

hudajkhan commented 8 months ago

Suggestion to include more devs/PO in discussion. May also need to confirm the order of events being supported: when collection is released, when item is released, when these are indexed, and are we covering the possibilities. (Some of these different order combinations were tested on preview).

saseestone commented 8 months ago

During sprint planning, I slacked with @andrewjbtw on this. Here's a summary of that conversation:

He agrees that the requirement is to NOT display any collection information if the collection record is unreleased. He does not see value in indexing but not displaying, or in displaying without hyperlinking. There are examples of collection records that would be problematic if indexed and displayed (e.g. "Acquisitions Serials").