ycba-cia / blacklight-collections2

RBM image download captions #330

Closed flapka closed 2 years ago

flapka commented 3 years ago

Following a discussion with Imaging and RBM colleagues, we'd like to propose an adjustment to the form of (long) captions provided when users download images from our Blacklight interface. As with the art collections, the caption will be based on a concatenation of fields in MARC/Blacklight, but not necessarily the same fields.

Here are three examples of desired outcomes:

Helmingham herbal and bestiary : https://collections.britishart.yale.edu/catalog/orbis:9452785
Caption: Helmingham herbal and bestiary. Helmingham, Suffolk, circa 1500. Yale Center for British Art, Paul Mellon Collection.

London map : https://collections.britishart.yale.edu/catalog/orbis:3589098
Caption: Agas, Ralph, 1545-1621. London and Westminster in the reign of Queen Elizabeth anno dom. 1563. London : Published ... by J. Wallis No. 16, Ludgate Street, October 30th, 1789. Yale Center for British Art, Paul Mellon Collection.

Wild flowers worth notice : https://collections.britishart.yale.edu/catalog/orbis:9294296
Caption: Lankester, Phebe, 1825-1900. Wild flowers worth notice : being a selection from the British flora of some of our native plants, which are most attractive from their beauty, uses, or associations. London : R. Hardwicke, 1861. Yale Center for British Art, Paul Mellon Fund.

If there is agreement, I will follow up with an exact specification of the concatenation rules (with Blacklight fields).

flapka commented 2 years ago

To implement the desired outcomes described above, concatenate as follows:

  1. author_ss
    a. if the final character is not already a period, add a period
  2. title_short_ss
    a. terminal punctuation conditions:
      i. “ :” – replace with “.”
      ii. “ /” – replace with “.”
      iii. “.” – leave as is
  3. edition_ss
    a. if the final character is not already a period, add a period
  4. publisher_ss
    a. if the final character is not already a period, add a period
  5. credit_line_ss
    a. if the final character is not already a period, add a period
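
The rules above can be sketched roughly as follows. This is only an illustration in Python, not the Blacklight code: it assumes each field arrives as a single string (the real Solr `_ss` fields may be multi-valued), that empty fields are simply skipped, and that a title ending in anything other than the three listed cases just gets a period appended. The function names are hypothetical.

```python
# Field order follows the numbered list above.
CAPTION_FIELDS = [
    "author_ss",
    "title_short_ss",
    "edition_ss",
    "publisher_ss",
    "credit_line_ss",
]


def end_with_period(value: str) -> str:
    """Add a final period unless the value already ends with one."""
    value = value.strip()
    return value if value.endswith(".") else value + "."


def normalize_title(value: str) -> str:
    """Replace a trailing ' :' or ' /' with a period; keep an existing period."""
    value = value.strip()
    for suffix in (" :", " /"):
        if value.endswith(suffix):
            return value[: -len(suffix)] + "."
    # Assumption: any other ending is treated like the non-title fields.
    return end_with_period(value)


def build_caption(doc: dict) -> str:
    """Join the populated caption fields, in order, separated by spaces."""
    parts = []
    for field in CAPTION_FIELDS:
        value = (doc.get(field) or "").strip()
        if not value:
            continue  # assumption: missing or empty fields are skipped
        if field == "title_short_ss":
            parts.append(normalize_title(value))
        else:
            parts.append(end_with_period(value))
    return " ".join(parts)


# Illustrative values modeled on the Lankester example; not read from the index.
example = {
    "author_ss": "Lankester, Phebe, 1825-1900.",
    "title_short_ss": "Wild flowers worth notice :",
    "publisher_ss": "London : R. Hardwicke, 1861.",
    "credit_line_ss": "Yale Center for British Art, Paul Mellon Fund",
}
print(build_caption(example))
```

Matching the two-character suffixes " :" and " /" (space included) is what keeps a stray space from being left in front of the inserted period.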
flapka commented 2 years ago

Reviewing records while working on the above reminds me of the following desired change to the marc-preblacklight-ycba XSLT:

In the "publishDate" template, we should add a third xsl:for-each:

        <xsl:for-each select="marc:datafield[@tag='245']/marc:subfield[@code='f']">
            <xsl:element name="publishDate_ss">
                <xsl:value-of select="normalize-space(.)"/>
            </xsl:element>
        </xsl:for-each>

This is needed to handle MARC records for archival collections, in which the date appears at the end of the 245 field instead of in a 260 or 264 field.
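
For anyone who wants to check what that for-each would pick up from a given record without running the stylesheet, here is a small Python sketch of the same selection. It is an approximation, not part of the XSLT: the function name is hypothetical, the sample record is a hand-written minimal wrapper around a 245 field, and whitespace collapsing stands in for `normalize-space()`.

```python
import xml.etree.ElementTree as ET
from typing import List

# Standard MARCXML namespace.
MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}


def publish_dates_from_245f(record_xml: str) -> List[str]:
    """Return the whitespace-normalized text of every 245 $f in a record."""
    root = ET.fromstring(record_xml)
    path = ".//marc:datafield[@tag='245']/marc:subfield[@code='f']"
    dates = []
    for subfield in root.findall(path, MARC_NS):
        text = "".join(subfield.itertext())
        dates.append(" ".join(text.split()))  # rough normalize-space() equivalent
    return dates


# Hand-written sample for illustration only.
SAMPLE = """<marc:record xmlns:marc="http://www.loc.gov/MARC21/slim">
  <marc:datafield tag="245" ind1="1" ind2="0">
    <marc:subfield code="a">Yale Center for British Art printed materials,</marc:subfield>
    <marc:subfield code="f">  1965-2018.  </marc:subfield>
  </marc:datafield>
</marc:record>"""

print(publish_dates_from_245f(SAMPLE))
```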

Here's an example record for testing: https://collections.britishart.yale.edu/catalog/orbis:14048315

yulgit1 commented 2 years ago

@flapka for https://collections.britishart.yale.edu/catalog/orbis:9294296 with title_short_ss the result is:

Lankester, Phebe, 1825-1900. Wild flowers worth notice . London : R. Hardwicke, 1861. Yale Center for British Art, Paul Mellon Fund.

Rather than what is stated in the example above:

Lankester, Phebe, 1825-1900. Wild flowers worth notice : being a selection from the British flora of some of our native plants, which are most attractive from their beauty, uses, or associations. London : R. Hardwicke, 1861. Yale Center for British Art, Paul Mellon Fund.

Is title_short_ss the correct field to use here?

flapka commented 2 years ago

@yulgit1 Yes, good catch. The issue here is that we have Blacklight fields that correspond to MARC 245a (title proper) and fields that correspond to MARC 245a+b+c (title proper + other title information + statement of responsibility), but no field that corresponds to MARC 245a+b -- which is what we need to fulfill the outcome preferred in the Lankester example.

@edgartdata We have a handful of title fields, and I can't recall how all of them are defined:

Would one of these be appropriate for the title proper + other title information mapping? Perhaps title_primary? Where does title_primary display in Blacklight?

yulgit1 commented 2 years ago

In case you don't already know: with .json at the end of the URL you can see the object JSON. https://collections.britishart.yale.edu/catalog/orbis:9294296.json

@flapka If my eyes are good that means title, title_primary, and title_full are the same. And these fields aren't manipulated in camel_collections post index processing, and title_primary isn't used in blacklight. So should I go ahead and make title_primary 245 a+b and use that for caption concat?

yulgit1 commented 2 years ago

@flapka - I went ahead and just tried 245 a+b. For this object, 14048315, the title happens to end with a comma. Should that get replaced with a period?

<marc:datafield tag="245" ind1="1" ind2="0">
<marc:subfield code="a">Yale Center for British Art printed materials,</marc:subfield>
<marc:subfield code="f">1965-2018.</marc:subfield>
</marc:datafield>
flapka commented 2 years ago

@yulgit1 If we can simply combine 245a + 245b for this purpose without defining a new Solr field, that'd be fine. To keep it simple, the concatenated result should always end in a period, replacing other punctuation if present (comma, colon, semi-colon, or forward slash ('/')).

For https://collections.britishart.yale.edu/catalog/orbis:14048315 (an archival collection), the concatenation would yield:

Yale Center for British Art. Institutional Archives. Yale Center for British Art printed materials. 1965-2018.
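
A minimal Python sketch of the "always end in a period" rule described above, for illustration only. The function name is hypothetical, and the handling of an empty string (returned unchanged) is an assumption the thread does not cover.

```python
import re

# One trailing comma, colon, semicolon, or forward slash, plus any space before it.
_TRAILING_PUNCT = re.compile(r"\s*[,:;/]$")


def end_in_period(text: str) -> str:
    """Make the text end in a period, replacing a trailing , : ; or /."""
    text = text.strip()
    if not text:
        return text  # assumption: nothing to punctuate
    text = _TRAILING_PUNCT.sub("", text)
    return text if text.endswith(".") else text + "."


print(end_in_period("Yale Center for British Art printed materials,"))
print(end_in_period("Wild flowers worth notice :"))
print(end_in_period("1965-2018."))
```

Only the final punctuation mark is touched, so internal colons and commas, as in "London : R. Hardwicke, 1861.", are left as catalogued.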

yulgit1 commented 2 years ago

@flapka There is no 245 b field to concatenate. So I used titles_primary_ss for the 245 a+b.

But for the archival collection https://collections.britishart.yale.edu/catalog/orbis:14048315, there is no image to download, hence no caption. Is that just an inapplicable example?

flapka commented 2 years ago

@yulgit1 Sorry, I was crossing two separate issues in my response:

  1. MARC records for archives won't have 245 $b, but the $a + $b concatenation will be useful for image download captions for non-archival material.
  2. orbis:14048315 of course doesn't have images, sorry! Please use this example instead -- after re-indexing overnight (i.e. on Tuesday): https://collections.britishart.yale.edu/catalog/orbis:3195945

For 14048315, the mapping should yield this download caption: Cabaret Theatre Club (London, England). Collection of Cabaret Theatre Club and Cave of the Golden Calf printed ephemera. 1912-1914. Yale Center for British Art, Friends of British Art Fund

yulgit1 commented 2 years ago

@flapka So what I now have for 14048315 is missing the 1912-1914. That part is in $f, not $b. Did you mean a concat of $a+$f, not $a+$b?

<marc:datafield tag="245" ind1="1" ind2="0">
<marc:subfield code="a">
Collection of Cabaret Theatre Club and Cave of the Golden Calf printed ephemera,
</marc:subfield>
<marc:subfield code="f">1912-1914.</marc:subfield>
</marc:datafield>
flapka commented 2 years ago

@yulgit1 Ah, your suggestion shows me that we probably want to define title_primary as 245$a + 245$b + 245$f. I think that would give the desired outcomes.

For my own benefit as much as anything, I'm summarizing the types of titles, and where RBM wants them to display:

  1. title_short = MARC 245a -- Display with:
    • search results masonry view
    • caption at header of IIIF viewer
  2. title_primary = MARC 245a+b+f -- Display with:
    • image download caption
  3. title and title_full = MARC 245 most subfields -- Display with:
    • search results list view
    • full object record
    • title that appears with resource description in the IIIF viewer
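
To make the title_primary definition concrete, here is a rough Python sketch that joins 245 $a, $b and $f in record order. It is an illustration, not the indexing XSLT: the function name is hypothetical, and it keeps punctuation exactly as catalogued, since any further normalization between subfields is not spelled out here.

```python
import xml.etree.ElementTree as ET

# Standard MARCXML namespace.
MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}
PRIMARY_CODES = ("a", "b", "f")


def title_primary(record_xml: str) -> str:
    """Join 245 $a, $b and $f (in record order) into one string."""
    root = ET.fromstring(record_xml)
    field = root.find(".//marc:datafield[@tag='245']", MARC_NS)
    if field is None:
        return ""
    parts = []
    for subfield in field.findall("marc:subfield", MARC_NS):
        if subfield.get("code") in PRIMARY_CODES:
            text = " ".join("".join(subfield.itertext()).split())
            if text:
                parts.append(text)
    return " ".join(parts)


# Hand-written sample record for illustration only.
SAMPLE = """<marc:record xmlns:marc="http://www.loc.gov/MARC21/slim">
  <marc:datafield tag="245" ind1="1" ind2="0">
    <marc:subfield code="a">
      Collection of Cabaret Theatre Club and Cave of the Golden Calf printed ephemera,
    </marc:subfield>
    <marc:subfield code="f">1912-1914.</marc:subfield>
  </marc:datafield>
</marc:record>"""

print(title_primary(SAMPLE))
```

Subfields outside $a, $b and $f, such as a $c statement of responsibility, are ignored, which is the difference from title_full.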

All of the above display preferences are already realized, with one exception: the full object record displays title_primary where we want the full title instead. See this record for example: https://collections.britishart.yale.edu/catalog/orbis:1273862

yulgit1 commented 2 years ago

OK, titles_primary_ss changed from: 245$a + 245$b to 245$a + 245$b + 245$f

indexing overnight

yulgit1 commented 2 years ago

This should be resolved (publishDate for 245$f and new concat of marc captions)

https://git.yale.edu/ermadmix/ycba_xslts/commit/37d860c531400b383db86bce1b33be4494d478eb
https://git.yale.edu/ermadmix/ycba_xslts/commit/1a46ff913811d3d354b1a57e72c3a1ed951859f7
https://github.com/ycba-cia/blacklight-collections2/commit/cec366759a66c007f46466486b856a7f236da1a9

flapka commented 2 years ago

Great. To my eyes, the download captions look precisely as desired. I will confirm with RBM colleagues.

One additional fix: In object records from Voyager (for RBM, Ref, and IA alike), we want the title field to display title_full. Currently it displays title_primary I believe. (We want to include the bits that come after the '/')

Example: https://collections.britishart.yale.edu/catalog/orbis:3282140

yulgit1 commented 2 years ago

Sorry, this is now fixed: the full title displays in object records.

https://collections.britishart.yale.edu/catalog/orbis:3282140

flapka commented 2 years ago

Perfect. Thanks!