uwlib-cams / MARC2RDA

mapping between MARC21 and RDA-RDF
Creative Commons Zero v1.0 Universal

245 title statement #115

Open CECSpecialistI opened 2 years ago

CECSpecialistI commented 2 years ago

https://github.com/uwlib-cams/MARC2RDA/blob/main/Working%20Documents/2XX.csv

CECSpecialistI commented 2 years ago

Began in-meeting group review on 2022-08-10

lake44me commented 2 years ago

FWIW: I tracked down what I was looking for, the latest version of DACS (Describing Archives: A Content Standard, maintained by the Society of American Archivists). Its crosswalks appendix maps, among other things, DACS elements to MARC21. It's now hosted on GitHub.

DACS | EAD 2002 | EAD3 (draft) | MARC
2.4 Date | <unitdate> | <unitdate> or <unitdatestructured> | 264 _0 $c

No mapping to 245 $f or $g was found.

This is not to say that there aren't guidelines or practices for MARC mapping of archive descriptive elements elsewhere (perhaps with a less geographically limited focus) that govern where or where not to put the date ranges defined for those fields in MARC.

DACS section 2.4, cited above, does appear to say that you would always provide inclusive dates even if you also want to provide bulk dates. Possibly the availability of two distinct subfields in 245 would be the advantage of using those subfields over 264 $c. It may also be a consideration for whether we can somehow make a distinction when we have both, and are mapping to RDA entities etc.

Archivists who create these records could give better perspective than I can; I just go by what I've seen and limited exposure in the past. It may be that an older practice was to use the 245 subfields, and that changed at some point to using the 264. Here's a link to one of our older records for an archive. If you click on "Staff view" you can see in the MARC that the 245 has $f with a date range and that neither 260 nor 264 is present in the record. Our newer records seem to be using the 264 $c. OCLC BFAS for 264 even has "Archival instructions" to "Enter the date of production of an archival collection as a year or range of years. If you are cataloging a single manuscript, you may include the month and day in that order following the year, if appropriate."

Since the presence of $f and (or) $g indicates that the manifestation is of a collection work (and thus an aggregating work), it seems to me that "has date of manifestation" doesn't feel appropriate. The date(s) relate to the included members that are aggregated, not to the collection manifestation itself (which may have been "created" or collected far later than the creation dates of its contents). But for mapping, it may not be worthwhile to try to pursue an alternative relation for 245 $f and/or $g if the same dates are now to be entered in 264, which we could not easily distinguish from other dates of creation.

Would the possibility of having both $g and $f mapped need to be accounted for? Would having two "has date of manifestation" statements for one manifestation violate anything?

lake44me commented 2 years ago

Uh oh Crystal - it looks like I accidentally clicked on something and closed the issue. Can we reopen it?

lake44me commented 2 years ago

I think I fixed that. Sorry.

CECSpecialistI commented 2 years ago

No worries!

From: Laura Akerman @.> Sent: Wednesday, August 17, 2022 2:30 PM To: uwlib-cams/MARC2RDA @.> Cc: Crystal E. Clements @.>; Assign @.> Subject: Re: [uwlib-cams/MARC2RDA] 245 title statement (Issue #115)


szapoun commented 2 years ago

Mapping of 245$h to RDA 336, 337, 338: here are some old documents from the LoC (doc file) and the Mississippi State University (pdf file).

To further help, I have captured all the values for GMD (general material designation) and SMD (specific material designation) from an AACR printed copy. GMD_SMD.zip

GordonDunsire commented 2 years ago

Here are my notes for subfields we did not discuss/resolve last week:

$f and $g should map to date of manifestation (rdam:P30278). For a produced (manuscript) manifestation that is not a reproduction (scribe's copy), the date of work, expression, and manifestation are the same (instantaneous). For a manuscript reproduction, the date of work or expression cannot be reliably determined. Use of this subfield seems to be intended for archival materials; a letter is an original manuscript manifestation; a set of documents is a collection manifestation. The MARC 21 manual examples are not all collection manifestations. The materials described are assumed to be curated in archival 'fonds' or similar, which are equivalent to a hierarchy of collection/sub-collection/individual descriptions.

$k should not map to nature of content (rdaw:P10222). The example of "typescript" is not content, and the element is soft-deprecated. Map to note on manifestation?

$n and $p should not map to numbering of part. It assumes that $a is mapped to title of work but is not keyed to such a mapping. It assumes that the manifestation truly reflects the whole-part structure of the work/expression, and is blind to aggregates. It fails when there is more than one 'part'. These should map to an addition to title proper or title of manifestation ($a + $n + $p): "Advanced calculus. Student handbook" "Internationale Strassenkarte. Europe 1:2.5 Mio." "Dissertation abstracts. A, The humanities and social sciences"

$s may be part of title proper, etc. Usage?

lake44me commented 2 years ago

Note that $n and $p are repeatable - and I think they could appear in either order (part first, then number, or other way around) though I can't find an example where a part name comes first. They should be added after $a in the order they appear in the 245. Otherwise, it could get confusing.

OCLC example:

245 | 0 | 0 | Comptes rendus hebdomadaires des séances de l'Académie des sciences. ǂn A, ǂp Sciences mathématiques. ǂn B, ǂp Sciences physiques
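The ordering rule described in this comment can be sketched as follows. This is a minimal illustration, assuming the 245 field has already been parsed into an ordered list of (subfield code, value) pairs; the function name `join_title_subfields` is hypothetical and is not part of the transform.

```python
def join_title_subfields(subfields):
    """Concatenate $a, $n and $p values in the order they appear in the 245.

    `subfields` is an ordered list of (code, value) pairs; this input shape
    is an assumption made for the sketch.
    """
    parts = [value.strip() for code, value in subfields if code in ("a", "n", "p")]
    return " ".join(part for part in parts if part)


# The OCLC example quoted above, with repeated $n and $p.
field_245 = [
    ("a", "Comptes rendus hebdomadaires des séances de l'Académie des sciences."),
    ("n", "A,"),
    ("p", "Sciences mathématiques."),
    ("n", "B,"),
    ("p", "Sciences physiques"),
]
print(join_title_subfields(field_245))
```

Because the subfields are never sorted or grouped by code, a $p that precedes a $n would stay in front of it in the output.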

gerontakos commented 1 year ago

Is it true, as our 245 spreadsheet asserts, that we never use the 245 for titles of W-E-I? We only use the 245 for Manifestation titles? (I'm starting to code it and am wondering about the weirdness of, for example, description sets of Works that do not include a title.)

GordonDunsire commented 1 year ago

We are only certain (relatively) that 245 pertains to the manifestation that is described.

While it is true that, say, a title proper in Croatian "Medo Winnie zvani Pooh" indicates a translation (of "Winnie-the-Pooh") and might be a title of expression, the inference cannot be easily parsed out of the data.

It is a moot point, raised in several jane-athons using RIMMF3, that a title of manifestation should be treated as a variant title of work. This can be broadened to title of work, but it is not safe to map to preferred title of work. Care is needed in determining which work is being assigned the title. We have tested a couple of cases of popular single-expression manifestations; the result is a long list of translated and minor variant title(s) of work that may, or may not, be useful. In the case of an aggregate, it is the aggregating work by default. A special case is a parallel aggregate with "parallel" titles, each of which is a title of manifestation and a title of work of the aggregated work. It is not useful generally to mint an aggregating work for a parallel aggregate and assign the same(?) title as the aggregated work.

A work without a title must have an access point or identifier (which can be a stringified IRI) to conform with well-formed RDA metadata. Typically, the base of an access point for work is an authorized access point for an agent who creates the work, plus a normalized preferred title of work or title of work derived from the manifestation ... back to the 245 :-)

gerontakos commented 1 year ago

Questions that arose while coding the 245:

(1) Were "parallel" titles taken into consideration in the spreadsheet? Do we just retain the equal signs? [That's how it is at present.]

(2) Spreadsheet says what to do when ldr/18 = a or i or c, but what about when it is blank, n or u? Coding accounted for #, n, u. Do note that the transform will fail when the leader is not valid.

(3) When LDR/18=c, do we only output the values of subfields $a, $n, $p and $s as the value of hasTitleOfManifestation, or do we output the entire field, including, for example, $b and $c? [We assumed we should output just the selected subfields, not all subfields. Terminal colon and slash were removed from the value of $a.]

(4) What do we do with something like the following, where repeated $n and $p are associated with $b? [We stated $a must be present, but if a preceding sibling of $n or $p or $s was $b, then suppress from output.]

245 00$aAnnual report of the Minister of Supply and Service Canada under the Corporations and Labour Unions Returns Act.$nPart II,$pLabour unions =$bRapport annuel du ministre des Approvisionnements et services Canada présenté sous l'empire et des syndicates ouvriers.$nPartie II,$pSyndicats ouvriers.

(5) When there is a $a without either $n $p or $s, do we still output $a as the value of hasTitleOfManifestation? [We chose yes and output even if there is only $a as well as outputting $a value to the value of hasTitleProper.]

(6) Why is there a row for the treatment of 245-0-[0-9]-h and 245-1-[0-9]-h? [We handled both rows with the same code; they appear to be equivalent.]

(7) The following note appears twice in the 245 spreadsheet: "See also TAG 007 , 336, 337 and 338." What does it mean?
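For question (2), one way to make the leader check explicit is sketched below. It assumes the leader is available as a 24-character string, and groups the LDR/18 (descriptive cataloging form) codes as MARC 21 defines them: blank/# (non-ISBD), a (AACR 2), c (ISBD punctuation omitted), i (ISBD punctuation included), n (non-ISBD punctuation omitted), u (unknown). The function name and the returned labels are hypothetical, and the grouping is only one possible reading of the spreadsheet.

```python
def punctuation_policy(leader):
    """Classify a MARC leader by its LDR/18 value.

    Returns one of three labels invented for this sketch; raises ValueError
    for an invalid leader instead of letting a later step fail silently.
    """
    if len(leader) != 24:
        raise ValueError("a MARC 21 leader must be exactly 24 characters")
    code = leader[18]
    if code in ("a", "i"):
        return "isbd-punctuation-present"
    if code == "c":
        return "isbd-punctuation-omitted"
    if code in (" ", "#", "n", "u"):
        return "non-isbd-or-unknown"
    raise ValueError(f"unexpected LDR/18 value: {code!r}")


# A placeholder leader with 'i' at position 18 (zero-based).
sample_leader = "0" * 18 + "i" + "0" * 5
print(punctuation_policy(sample_leader))
```

Raising on an invalid leader is a design choice for the sketch; the transform could equally fall back to the "unknown" branch.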

GordonDunsire commented 1 year ago

Tentative answers to some of these questions:

(1) RDA does not recognize parallel titles, and treats them as separate titles. The equals sign precedes a parallel title coded as $b, but also precedes a parallel statement of responsibility ($c) and other parallel elements; if $b is preceded by an equals sign, it is safe to assume that $b records a parallel title, which should be mapped as a distinct title of manifestation. Otherwise, I don't see any way of parsing out other parallel data. The equal sign is not part of the title, and should not be retained.

(3) We agreed that only those subfields contain title data; $b contains 'other title information' which does not exclusively mean 'title information' or 'other title', and $c is statement of responsibility and not title.

(4) $n/$p may be repeated for two reasons. The first is that it is parallel title data, indicated by an equals sign that precedes $b; the second is that there are two levels of part/section hierarchy, but there will be no preceding equals sign. In the first case, there are two values of title of manifestation: $a + $n + $p; $b + $n + $p. In the second case, there is only one title: $a + $n + $p + $n + $p. The example is the first case and parses as:

title of manifestation: "Annual report of the Minister of Supply and Service Canada under the Corporations and Labour Unions Returns Act. Part II, Labour unions"
title of manifestation: "Rapport annuel du ministre des Approvisionnements et services Canada présenté sous l'empire et des syndicates ouvriers. Partie II, Syndicats ouvriers"

(5) Yes.
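The two cases in answer (4) can be sketched as below. This is an illustration only, assuming the field is an ordered list of (code, value) pairs with the ISBD punctuation still attached to the end of each value; `titles_of_manifestation` and `clean` are hypothetical names. A $b that is not preceded by an equals sign is treated as other title information, and any $n/$p after it are left out, which follows the suppression rule mentioned in question (4).

```python
def clean(text):
    """Drop a trailing equals sign and a single end-stop (but not an ellipsis)."""
    text = text.strip()
    if text.endswith("="):
        text = text[:-1].rstrip()
    if text.endswith(".") and not text.endswith("..."):
        text = text[:-1].rstrip()
    return text


def titles_of_manifestation(subfields):
    """Return one title per $a group, plus one per $b that follows an '='."""
    groups, current, collecting, previous = [], [], True, ""
    for code, value in subfields:
        value = value.strip()
        if code == "c":                      # statement of responsibility: stop
            break
        if code == "a":
            current.append(value)
            collecting = True
        elif code == "b":
            if previous.endswith("="):       # parallel title: start a new group
                groups.append(current)
                current = [value]
                collecting = True
            else:                            # other title information: skip it
                collecting = False
        elif code in ("n", "p") and collecting:
            current.append(value)
        previous = value
    groups.append(current)
    return [clean(" ".join(parts)) for parts in groups if parts]


example = [
    ("a", "Annual report of the Minister of Supply and Service Canada under the Corporations and Labour Unions Returns Act."),
    ("n", "Part II,"),
    ("p", "Labour unions ="),
    ("b", "Rapport annuel du ministre des Approvisionnements et services Canada présenté sous l'empire et des syndicates ouvriers."),
    ("n", "Partie II,"),
    ("p", "Syndicats ouvriers."),
]
for title in titles_of_manifestation(example):
    print(title)
```

The sketch does not split several parallel titles recorded inside a single $b; that would need a further split on space-equals-space.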

tmqdeborah commented 1 year ago

I see from a previous comment (https://github.com/uwlib-cams/MARC2RDA/issues/115#issuecomment-1225320949) that there was a question about whether 245$s (Version) might be part of title proper, but I don’t see a reply. Then I see that the Google spreadsheet for 245 is mapping $s as part of a value of title of manifestation.

Searching my copy of 5.6 million LC MARC records, I found 39 records where a 245 field contained subfield $s. The sample results are attached here: 245$s.txt

Based on that sample, and the rather confusing definition of that subfield element in the MARC manual (“Name, code, or description of a copy of the described materials that was generated at different times or for different audiences.”), and the fact that there is an RDA element for Expression: designation of version, in the rare case where a $s is present in a 245, shouldn’t its value be mapped to Expression: designation of version?

cspayne commented 1 month ago

@GordonDunsire @tmqdeborah @CECSpecialistI

Can I request a review of the test output for 245 when you have time? I updated the code so that when a and n, p, and/or s are present, 245 maps to title of manifestation. If only a is present, then it maps to title proper. I did not make changes to when $b contains '=', which causes the text on each side of the = sign to map to "has title of manifestation". Is this accurate? There appear to be details in the code that Theo wrote that are not in the mapping sheet, so I am unsure what needs to be updated and what is correct.

tmqdeborah commented 1 month ago

Quoted from @cspayne: "Can I request a review of the test output for 245 when you have time? I updated the code so that when a and n, p, and/or s are present, 245 maps to title of manifestation. If only a is present, then it maps to title proper."

A Title proper is simply a Title of manifestation that is preferred by a community. In the case of MARC records, it is the Title of manifestation that is preferred by the library that created/used a MARC record. Subfields $n and $p must be retained as part of the title because it was the cataloger's decision that the title of a part manifestation or work needed both the title of the larger manifestation or work (in $a) and the specific number or title of the part manifestation or work that is being described, in $n and/or $p.

245 $a alone maps as Title proper
245 $a$n$p maps as Title proper
245 $s [if we decide it is indeed a part of a title, i.e., in the title area; similar to a :$b but meant to be retained as part of a title] maps as Title proper

tmqdeborah commented 1 month ago

Quoted from https://github.com/uwlib-cams/MARC2RDA/issues/115#issuecomment-2293910802 "I did not make changes to when $b contains '=', which causes the text on each side of the = sign to map to "has title of manifestation". Is this accurate? There appear to be details in the code that Theo wrote that are not in the mapping sheet, so I am unsure what needs to be updated and what is correct."

When $b contains ' = ' it should only be mapped when ' = $b' is also mapped, before it. For example: 245 00$aAnnual report of the Minister of Supply and Service Canada under the Corporations and Labour Unions Returns Act.$nPart II,$pLabour unions =$bRapport annuel du ministre des Approvisionnements et services Canada présenté sous l'empire et des syndicates ouvriers.$nPartie II,$pSyndicats ouvriers.

If we are mapping $b containing parallel (=$b) or subsequent titles (;$b), then this would map as Title proper: Annual report of the Minister of Supply and Service Canada under the Corporations and Labour Unions Returns Act. Part II, Labour unions = Rapport annuel du ministre des Approvisionnements et services Canada présenté sous l'empire et des syndicates ouvriers. Partie II, Syndicats ouvriers.

Otherwise (if we are not going to include parallel or subsequent titles as part of a collective title) then omit the entire =$b from the title.

AdamSchiff commented 1 month ago

Rapport annuel du ministre des Approvisionnements et services Canada présenté sous l'empire et des syndicates ouvriers. Partie II, Syndicats ouvriers is not the title proper. It is the parallel title proper.

Adam

Adam L. Schiff Principal Cataloger University of Washington Libraries (206) 543-8409 @.***


From: Deborah Fritz @.> Sent: Friday, August 16, 2024 11:44 AM To: uwlib-cams/MARC2RDA @.> Cc: Adam L Schiff @.>; Assign @.> Subject: Re: [uwlib-cams/MARC2RDA] 245 title statement (Issue #115)


GordonDunsire commented 4 weeks ago

@AdamSchiff: the parallel elements are soft-deprecated in RDA, so we have agreed not to use them in the transform. RDA allows more than one title proper. In the earlier part of this discussion I suggested that both subfield $a and = preceding subfield $b could map to title of manifestation and leave it to subsequent cataloguers to choose which one to be title proper for whatever local purpose. However, I recall a verbal discussion where we agreed to assume that the choice had already been made, in the context of ISBD and original RDA, by the creator of the MARC 21 record; that is, subfield $a should map to title proper, and = $b should map to title of manifestation. This has been applied to the example:

<fake:marcfield>F245 03 $a Le Bureau $h [filmstrip] = $b La Oficina = Das Büro.</fake:marcfield>
      <rdamd:P30156>Le Bureau</rdamd:P30156><!--rdamd:P30156 = has title proper-->
      <rdamd:P30134>La Oficina</rdamd:P30134><!--rdamd:P30134 = has title of manifestation-->
      <rdamd:P30134>Das Büro</rdamd:P30134><!--rdamd:P30134 = has title of manifestation-->
      <rdamd:P30002>filmstrip</rdamd:P30002><!--rdamd:P30002 = has media type-->

The transform documentation needs to note this decision.

The complication in the MARC 21/ISBD encoding is that subfield $b (other title information) may contain non-title data. One method of parsing out such data is to identify ISBD punctuation in the value. In the example above, the string is parsed out into parallel data by the = sign and the end-stop is removed. The ISBD punctuation patterns, amended to remove parallel elements, are:

Title proper
Title proper : other title information
Title proper = title [of manifestation]
Title proper = title : other title information
Title proper : other title information = title : other title information
Title proper : other title information = other title information

These patterns can be matched against a normalized concatenation of subfields $a, $b, $n, and $p in order of appearance. As indicated by @tmqdeborah, ISBD title proper is a concatenation of $a, $n, and $p.

Basically, the space-colon-space indicates other title information, and space-equals-space indicates title of manifestation.
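A rough sketch of matching those patterns is given below. It assumes the input is the normalized concatenation described above, with the end-stop already removed. The function name `classify_title_statement` is hypothetical. Telling the last two patterns apart is not spelled out in the discussion, so the rule used here is an assumption: a segment after space-equals-space is read as parallel other title information only when the previous segment was other title information and no space-colon-space follows it.

```python
import re


def classify_title_statement(statement):
    """Label each segment of an ISBD title statement.

    Returns a list of (label, text) pairs. The split keeps the ':' and '='
    delimiters so that each segment can be labelled from the mark before it.
    """
    tokens = re.split(r"\s([:=])\s", statement.strip())
    labelled = [("title proper", tokens[0])]
    previous_label = "title proper"
    for i in range(1, len(tokens), 2):
        delimiter, segment = tokens[i], tokens[i + 1]
        next_delimiter = tokens[i + 2] if i + 2 < len(tokens) else None
        if delimiter == ":":
            label = "other title information"
        elif previous_label == "other title information" and next_delimiter != ":":
            # Assumed reading of the last pattern in the list above.
            label = "other title information"
        else:
            label = "title of manifestation"
        labelled.append((label, segment))
        previous_label = label
    return labelled


for label, text in classify_title_statement("Le Bureau = La Oficina = Das Büro"):
    print(f"{label}: {text}")
```

Titles that contain a spaced colon or equals sign as part of the transcribed text would be mis-split by this approach, so it is only as reliable as the ISBD punctuation in the record.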

The transform is not quite right for subfields $n and $p.

 <fake:marcfield>F245 00 $a Annual report of the Minister of Supply and Service Canada under the Corporations and Labour Unions Returns Act. $n Part II, $p Labour unions = $b Rapport annuel du ministre des Approvisionnements et services Canada présenté sous l'empire et des syndicates ouvriers. $n Partie II, $p Syndicats ouvriers.</fake:marcfield>
      <rdamd:P30134>Annual report of the Minister of Supply and Service Canada under the Corporations and Labour Unions Returns Act, Part II, Labour unions</rdamd:P30134><!--rdamd:P30134 = has title of manifestation-->
      <rdamd:P30134>Rapport annuel du ministre des Approvisionnements et services Canada présenté sous l'empire et des syndicates ouvriers, Partie II, Syndicats ouvriers</rdamd:P30134><!--rdamd:P30134 = has title of manifestation-->
      <rdamd:P30142/><!--rdamd:P30142 = has other title information-->

The first title of manifestation should be title proper, and the other title information triple has been output with no value. There is no other title information in the example, but the triple should not be generated with a blank value.
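The point about not generating a triple with a blank value can be illustrated with a small guard. This is a sketch in Python rather than the transform's own code; `rdf_element` is a hypothetical helper, and the element names are the ones shown in the sample output above.

```python
from xml.sax.saxutils import escape


def rdf_element(qname, value):
    """Return an RDF/XML element string, or None when the value is blank."""
    if value is None or not value.strip():
        return None
    return f"<{qname}>{escape(value.strip())}</{qname}>"


candidates = [
    ("rdamd:P30156", "Le Bureau"),   # has title proper
    ("rdamd:P30142", ""),            # has other title information: nothing to output
]
lines = [element for element in (rdf_element(q, v) for q, v in candidates) if element]
print("\n".join(lines))
```

Filtering at the point of serialization means each mapping rule can stay simple and still never emit an empty element such as the one in the sample.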

GordonDunsire commented 4 weeks ago

@cspayne: I noticed some other issues in the sample output.

<rdamd:P30156>A report to the legislature for the year ..</rdamd:P30156><!--rdamd:P30156 = has title proper-->

Refine the removal of end-stop: if last three characters are stops, do not remove the end-stop:

<rdamd:P30156>A report to the legislature for the year ...</rdamd:P30156><!--rdamd:P30156 = has title proper-->

Three stops is the mark of omission, and what is omitted is the chronological or numerical designation of an issue of a serial. What is being described is the incomplete manifestation of all of the issues published to date. Another case is the numerical designation of a multiple-unit manifestation.
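The refinement can be sketched as a small function. `remove_end_stop` is a hypothetical name, and the sketch does not try to recognize abbreviations that legitimately end in a stop.

```python
def remove_end_stop(value):
    """Remove a single trailing stop, but leave a mark of omission (...) intact."""
    value = value.rstrip()
    if value.endswith("..."):
        return value
    if value.endswith("."):
        return value[:-1].rstrip()
    return value


print(remove_end_stop("A report to the legislature for the year ..."))
print(remove_end_stop("La Oficina = Das Büro."))
```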

<rdawd:P10088>[Seventeen poems]</rdawd:P10088><!--rdawd:P10088 = has title of work-->

Remove surrounding brackets:

<rdawd:P10088>Seventeen poems</rdawd:P10088><!--rdawd:P10088 = has title of work-->

They are not part of the title, but an indicator of the source of the title. This comes from ISBD, but the stipulation is removed in ISBDM. The same applies to all title subfields transformed to RDA, including title proper, title of manifestation, etc. Other example:

<rdamd:P30156>Diary</rdamd:P30156><!--rdamd:P30156 = has title proper-->

The brackets indicate a title assigned by the cataloguer when there is no title assigned by a creator of the manifestation (publisher, etc.) or a creator of the works embodied in the manifestation (authors, etc.). RDA prefers to record this information in a note on manifestation, for example 'Title proper is assigned by the cataloguing agency.'. The transform should do this to preserve the information.

Generate boilerplate note on manifestation for each RDA manifestation title element with a value enclosed in brackets:

<rdamd:P30137>Title proper is assigned by the cataloguing agency.</rdamd:P30137><!--rdamd:P30137 = has note on manifestation-->
<rdamd:P30137>Title of the manifestation is assigned by the cataloguing agency.</rdamd:P30137>
<rdamd:P30137>Variant title of the manifestation is assigned by the cataloguing agency.</rdamd:P30137>

It is not necessary to add a note on expression or work, because title of work, etc. are traditionally assigned by the cataloguing agency by derivation from title proper.
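The bracket handling described above might look like the following sketch. It only covers a value that is wholly enclosed in one pair of brackets; `unbracket_title` and `NOTE_LABELS` are hypothetical names, and the table lists only the two manifestation properties whose identifiers appear in this thread.

```python
# Properties that should trigger a boilerplate note, with the wording used above.
NOTE_LABELS = {
    "rdamd:P30156": "Title proper",
    "rdamd:P30134": "Title of the manifestation",
}


def unbracket_title(prop, value):
    """Strip enclosing square brackets and return (value, note or None).

    A note is produced only for manifestation title properties; work titles
    are unbracketed without a note, following the comment above.
    """
    value = value.strip()
    inner = value[1:-1].strip() if value.startswith("[") and value.endswith("]") else None
    if inner is None or "[" in inner or "]" in inner:
        return value, None                     # not a single enclosing pair
    label = NOTE_LABELS.get(prop)
    note = f"{label} is assigned by the cataloguing agency." if label else None
    return inner, note


print(unbracket_title("rdamd:P30156", "[Diary]"))
print(unbracket_title("rdawd:P10088", "[Seventeen poems]"))
```

The note text would then be output as a value of rdamd:P30137 (has note on manifestation).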

<rdamd:P30142>[announcement]</rdamd:P30142><!--rdamd:P30142 = has other title information-->

This is similar to a manifestation title element so the brackets should be removed. Generate a note on manifestation:

<rdamd:P30137>Other title information is assigned by the cataloguing agency.</rdamd:P30137><!--rdamd:P30137 = has note on manifestation-->
<rdamd:P30137>typescript</rdamd:P30137><!--rdamd:P30137 = has note on manifestation-->

The first word of a generated note should start with an upper-case letter, and the note should end with a stop.

Transform first letter of note using upper() function. Add stop to value if not present:

<rdamd:P30137>Typescript.</rdamd:P30137><!--rdamd:P30137 = has note on manifestation-->
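The note normalization can be sketched as below. `normalize_note` is a hypothetical name, and only the first character is upper-cased so that the rest of the note is left as transcribed.

```python
def normalize_note(text):
    """Upper-case the first letter of a note and make sure it ends with a stop."""
    text = text.strip()
    if not text:
        return text
    text = text[0].upper() + text[1:]
    if not text.endswith("."):
        text += "."
    return text


print(normalize_note("typescript"))
```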
tmqdeborah commented 3 weeks ago

@GordonDunsire Re. https://github.com/uwlib-cams/MARC2RDA/issues/115#issuecomment-2295292711

Don't forget about ' ;$b' which is used for 'subsequent titles', i.e., titles of embodied works in addition to the title of the work given in $a.

245 00 $a Morbid visions $h [sound recording] ; $b Bestial devastation / $c Sepultura.

245 10 $a Printing Presses and Publications Act 1984 (Act 301) & rules ; $b and, Deposit of Library Material Act 1986 (Act 331) : as at 20th October 2001 / $c compiled by Legal Research Board.

245 10 $a Idylle $h [sound recording] ; $b Alt-Wiener Tanz / $c Drdla.

245 12 $a A tour of insprection ; $b Mrs. Bathurst / $c by Rudyard Kipling.

245 14 $a The bagnios of Algiers ; $b and, The great Sultana : two plays of captivity / $c Miguel de Cervantes ; edited and translated by Barbara Fuchs and Aaron J. Ilika.

GordonDunsire commented 3 weeks ago

@tmqdeborah: '; $b' applies to aggregate manifestations, which are out of scope, and I am talking about the specific punctuation '=' as in '= $b'.

'; $b' does, indeed, indicate subsequent titles where there is no collective title (ISBD 1.1.5.2). I don't know why MARC 21 splits the encoding between subfield $a and $b, but when an aggregate is being transformed, the other title information part of subfield $b can be identified by an embedded space-colon-space, as in your 'bagnios of Algiers' and 'Printing Presses ...' examples. I think that subfield $a and the subfield $b value up to space-colon-space, or whatever punctuation appears at the end of the subfield, are concatenated with semicolon-space to form the title proper; for example 'Morbid visions; Bestial devastation' or 'The bagnios of Algiers; and, The great Sultana'.
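The concatenation suggested here can be sketched as follows, assuming the $a and $b values are passed in as strings with their ISBD punctuation still attached. `aggregate_title_proper` is a hypothetical name, and the sketch only trims a trailing slash or semicolon.

```python
import re


def _trim(text):
    """Remove a trailing ' /' or ' ;' left over from ISBD punctuation."""
    return re.sub(r"\s*[/;]\s*$", "", text.strip())


def aggregate_title_proper(sub_a, sub_b):
    """Join $a and the title part of $b with semicolon-space.

    Anything in $b after the first space-colon-space is treated as other
    title information and left out of the title proper.
    """
    first = _trim(sub_a)
    second = _trim(sub_b.split(" : ", 1)[0])
    return f"{first}; {second}"


print(aggregate_title_proper("Morbid visions", "Bestial devastation /"))
print(aggregate_title_proper(
    "The bagnios of Algiers ;",
    "and, The great Sultana : two plays of captivity /",
))
```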

cspayne commented 2 weeks ago

I believe I've gotten the punctuation sorted now. Test data is here.

cspayne commented 2 weeks ago

@GordonDunsire For the "assigned by the cataloguing agency" notes, should these be generated when any square brackets are present in the field? Or only when the whole field is enclosed with square brackets? For example, with F245 00 $k Letter, $f 1901 March 6, $b Dublin, to Henrik Ibsen, Kristiana [Oslo]. Should this result in a note on manifestation <rdamd:P30137>Other title information is assigned by the cataloguing agency.</rdamd:P30137> because of [Oslo]?

When a title of manifestation is a combined value from $a, $n, $p, and/or $s, should this note be produced if any of these subfields contain square brackets?

AdamSchiff commented 2 weeks ago

This kind of title is entirely cataloger-supplied I think, probably following guidelines for manuscript cataloging. So the bracketed [Oslo] is just there to explain that Kristiania was the earlier name for Oslo. I don't think that this title is other title information. You can't have other title information without a title proper. The whole thing is the title proper, and I don't see how you can remove the brackets around Oslo in this instance.

AdamSchiff commented 2 weeks ago

I believe I've gotten the punctuation sorted now. Test data is here.

There are still some errors with the punctuation:

"official organ of the American Association for Cancer Research, Inc"
The period in Inc. should probably be retained, as Inc. typically is recorded with a period (but not always, dang it!). But I think we have to assume that the period has been transcribed as found and we don't add a second period in this situation.

"Felsmechanik"
This appears in $b in MARC (other title information) but it is a parallel title proper and shouldn't it be mapped to has title of manifestation rather than other title information?

"Love from Joy, Part III, 1987-1995, At the bungalow"
This is not correct. The $n and $p are not part of the title proper here, they are for the other title information. In this case we should have:

"Love from Joy"
"letters from a farmer’s wife. Part III, 1987-1995, At the bungalow"
The $n and $p must immediately follow the $a in order for them to be part of the title proper.

"Répertoire des projets / CDT"
This is not a statement of responsibility related to title proper. It is a parallel title proper and parallel statement of responsibility. Répertoire des projets should map to has title of manifestation or title proper. CDT should map to statement of responsibility (http://rdaregistry.info/Elements/m/P30117) or statement of responsibility related to title proper (http://rdaregistry.info/Elements/m/P30105)
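The "$n and $p must immediately follow $a" rule can be sketched as below. The MARC field for the "Love from Joy" record is not shown in this thread, so the subfield list in the example is a hypothetical reconstruction, as is the function name `split_title_and_other_info`. The sketch ignores the parallel-title case where $b follows an equals sign.

```python
import re


def _trim(text):
    """Remove trailing ISBD marks such as ' :', ' =', ' /' or ' ;'."""
    return re.sub(r"\s*[:=/;]\s*$", "", text.strip())


def split_title_and_other_info(subfields):
    """Attach each $n/$p to whichever of $a or $b most recently preceded it."""
    title_parts, other_parts = [], []
    target = None
    for code, value in subfields:
        if code == "c":                  # statement of responsibility: stop
            break
        if code == "a":
            target = title_parts
        elif code == "b":
            target = other_parts
        elif code not in ("n", "p"):     # e.g. $h: not title data in this sketch
            continue
        if target is not None:
            target.append(value.strip())
    title = _trim(" ".join(title_parts))
    other = _trim(" ".join(other_parts)) or None
    return title, other


# Hypothetical reconstruction of the field behind the output discussed above.
field_245 = [
    ("a", "Love from Joy :"),
    ("b", "letters from a farmer’s wife."),
    ("n", "Part III, 1987-1995,"),
    ("p", "At the bungalow"),
]
print(split_title_and_other_info(field_245))
```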
AdamSchiff commented 2 weeks ago

<rdamd:P30002>sound recording</rdamd:P30002>

I don't disagree with this mapping, but might it not be better to convert the AACR2 GMDs (general material designations) to RDA media type vocabulary? 245 $h [sound recording] = 337 $a audio $2 rdamt = rdamt:1001

We could make a list of the GMDs used in AACR and map them to RDA media types.
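Such a list could be held as a simple lookup table, sketched below. Only the single pairing given in this comment is filled in; the remaining GMDs would need to be added from the AACR list, and the names `GMD_TO_MEDIA_TYPE` and `media_type_for_gmd` are hypothetical.

```python
# GMD (lower-cased, without brackets) -> (RDA media type term, compact IRI).
# Only the entry stated in the comment above is included; the rest of the
# table is deliberately left to be filled in from the AACR GMD list.
GMD_TO_MEDIA_TYPE = {
    "sound recording": ("audio", "rdamt:1001"),
}


def media_type_for_gmd(gmd):
    """Look up the RDA media type for a 245 $h value, or None if unmapped."""
    key = gmd.strip().strip("[]").strip().lower()
    return GMD_TO_MEDIA_TYPE.get(key)


print(media_type_for_gmd("[sound recording]"))
print(media_type_for_gmd("[filmstrip]"))  # unmapped in this sketch
```

Returning None for an unmapped GMD lets the caller fall back to outputting the transcribed $h value as a literal, which is what the current mapping does.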

AdamSchiff commented 2 weeks ago

<rdamd:P30156>Le Bureau</rdamd:P30156>
<rdamd:P30134>La Oficina</rdamd:P30134>
<rdamd:P30134>Das Büro</rdamd:P30134>

La Oficina and Das Büro are parallel titles proper. RDA element parallel title proper (http://rdaregistry.info/Elements/m/P30203) says: "The following option is recommended. OPTION Do not use this element; use Manifestation: title proper instead to record a value." So why aren't we mapping the parallel titles to title proper rather than has title of manifestation?
AdamSchiff commented 2 weeks ago

Something got messed up here?:

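<fake:marcfield>F245 00 $a $h [sound recording] ; $b inet and orchestra / $c Claude Debussy.</fake:marcfield>
<rdawd:P10088/>

and here:

<fake:marcfield>F245 00 $a $h [sound recording] ; $b inet and orchestra / $c Claude Debussy.</fake:marcfield>
<rdamd:P30156/>
<rdamd:P30142>inet and orchestra</rdamd:P30142>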
cspayne commented 2 weeks ago

Something got messed up here?:

<fake:marcfield>F245 00 $a $h [sound recording] ; $b inet and orchestra / $c Claude Debussy.</fake:marcfield>
<rdawd:P10088/>

and here:

<fake:marcfield>F245 00 $a $h [sound recording] ; $b inet and orchestra / $c Claude Debussy.</fake:marcfield>
<rdamd:P30156/>
<rdamd:P30142>inet and orchestra</rdamd:P30142>

Yes! Some of these test fields seem to be put together wrong or with varying punctuation in order to test different cases.

CECSpecialistI commented 2 weeks ago

That's a really good question, we should probably be using title proper not title of manifestation? Was there a reason we decided against that? And record it as a value of nomen:nomen string?

GordonDunsire commented 2 weeks ago

@GordonDunsire For the "assigned by the cataloguing agency" notes, should these be generated when any square brackets are present in the field? Or only when the whole field is enclosed with square brackets? For example, with F245 00 $k Letter, $f 1901 March 6, $b Dublin, to Henrik Ibsen, Kristiana [Oslo]. Should this result in a note on manifestation <rdamd:P30137>Other title information is assigned by the cataloguing agency.</rdamd:P30137> because of [Oslo]?

When a title of manifestation is a combined value from $a, $n, $p, and/or $s, should this note be produced if any of these subfields contain square brackets?

@cspayne: I assume subfield $k maps separately to category of manifestation. So if there is no subfield $a, and subfield $k appears first, then I think it is safe to assume that subfield $k is taking the place of $a, and the implications are as @AdamSchiff suggests. So the appropriate concatenation of subfields up to a statement of responsibility in subfield $c should be transformed to title proper, with the corresponding 'assigned by the cataloguing agency' note. I think the brackets should be removed in this situation, because the whole title is enclosed in implied brackets.

For subfield $a with $n, $p, and/or $s, I think the brackets should be removed and the appropriate note should be produced (what harm does it do?). I guess you could refine the note to isolate specific parts, for example 'Name of part/section of a work in other title information is assigned by the cataloguing agency.' for subfield $p, but I don't think this works for all subfield components.
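
The bracket handling suggested here can be sketched in a few lines. This is a minimal Python illustration, not the project's transform code: the function name `strip_supplied_brackets` and the `element_label` parameter are hypothetical, and the note wording simply follows the example quoted above.

```python
import re
from typing import Optional, Tuple


def strip_supplied_brackets(value: str, element_label: str) -> Tuple[str, Optional[str]]:
    """Remove square brackets and return (cleaned value, note or None).

    Assumption: any square bracket in the value marks cataloguer-supplied
    text, so one note is produced for the whole element.
    """
    if "[" not in value and "]" not in value:
        return value, None
    cleaned = re.sub(r"[\[\]]", "", value)
    note = f"{element_label} is assigned by the cataloguing agency."
    return cleaned, note


cleaned, note = strip_supplied_brackets(
    "Dublin, to Henrik Ibsen, Kristiana [Oslo].", "Other title information"
)
print(cleaned)
print(note)
```

A partial bracket such as [Oslo] and a fully bracketed value are treated the same way in this sketch; distinguishing them would need an extra check on whether the brackets enclose the whole value.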

GordonDunsire commented 2 weeks ago

@AdamSchiff, @CECSpecialistI: So why aren't we mapping the parallel titles to title proper rather than has title of manifestation?

I think they should map to multiple titles proper, if they are parallel to subfield $a.

This causes an issue for preferred titles of primary expression and work; I guess the resolution is to use the 'first' title proper.

The only reason to nomen-ize a title is if you want to make a statement about it. So an alternative or additional way of dealing with brackets, or data provenance for appellations in general, is to nomen-ize the title or title proper:

<manifestation> rdamo:P30156 <titleProperNomen> .
<titleProperNomen> rdand:P80068 "Letter, 1901 March 6, Dublin, to Henrik Ibsen, Kristiana Oslo" .
<titleProperNomen> rdand:P80071 "Oslo interpolated by cataloguing agency" . // note on nomen
<titleProperNomen> rdand:P80073 "Cataloguing agency" . // assigned by agent

In general, this is the best approach to appellation data provenance for open world cataloguing. But I think it is overkill for the transform: it provides maximum extraction of 'meaning', but it increases processing time without much benefit. This is why title and name appellations are unlikely to be nomen-ized = brought under authority control.

CECSpecialistI commented 2 weeks ago

Could we use the property title proper and record it as a string value without nomen-izing it?

cspayne commented 2 weeks ago

@GordonDunsire @CECSpecialistI @AdamSchiff I'm working on updating the Google Sheet (and the code) so we can continue reviewing 245 and track the changes we are making. Should we only be looking for ISBD punctuation when LDR 18 = a or i? Or would it be better to always treat certain punctuation (= : / etc.) as ISBD regardless of the LDR 18 value?

GordonDunsire commented 2 weeks ago

@cspayne: I guess some of this punctuation will be intrinsic to the manifestation, and not supplied by the cataloguer via ISBD, etc. So it would be a mistake to treat it as ISBD punctuation, and I think we need to pay attention to the LDR value. Any other opinions?

GordonDunsire commented 2 weeks ago

@CECSpecialistI asks 'Could we use the property title proper and record it as a string value without nomen-izing it?' Yes, that is what I am suggesting.

AdamSchiff commented 2 weeks ago

@cspayne: I guess some of this punctuation will be intrinsic to the manifestation, and not supplied by the cataloguer via ISBD, etc. So it would be a mistake to treat it as ISBD punctuation, and I think we need to pay attention to the LDR value. Any other opinions?

Yes, in RDA we transcribe punctuation as found, so that will include titles with colons and slashes. As a made-up example: A guidebook to Uluru / Ayers Rock / $c by John Smith. In AACR2 we were instructed to close up spaces around characters that were also ISBD punctuation, but in RDA that isn't the case.

lake44me commented 2 weeks ago

@cspayne Please note - before this mapping is ready for transform, the rows that get Reproduction Conditions checked need to have that information added. Let @lake44me and @dchen077 know when the mappings/transform are otherwise reviewed/approved so she can add the condition information.

cspayne commented 1 week ago

@CECSpecialistI @GordonDunsire @AdamSchiff I'm in agreement with Gordon and my suggestion is to do the following:

  1. When ISBD punctuation is present (LDR18 = a or i), concatenate all subfields into one statement and map/transform based on ISBD punctuation. Remove ISBD punctuation (= : ; /) and retain all other punctuation.
  2. When ISBD punctuation is not present (LDR18 != a or i) process each subfield individually and map based on the subfield. Retain all punctuation.
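
The choice between the two processing modes above could be sketched as follows. This is only an illustration in Python, assuming the leader is available as a 24-character string; the names `uses_isbd_punctuation` and `choose_processing` and the returned mode labels are hypothetical and are not taken from the project's code.

```python
# LDR/18 values that, per the proposal above, signal ISBD punctuation.
ISBD_LEADER_VALUES = {"a", "i"}


def uses_isbd_punctuation(leader: str) -> bool:
    """Return True when character position 18 of the leader is 'a' or 'i'."""
    return len(leader) > 18 and leader[18] in ISBD_LEADER_VALUES


def choose_processing(leader: str) -> str:
    """Pick a processing mode for field 245 (labels are illustrative only)."""
    if uses_isbd_punctuation(leader):
        # Step 1: concatenate subfields and map on ISBD punctuation.
        return "by_isbd_punctuation"
    # Step 2: map each subfield individually and retain all punctuation.
    return "by_subfield"


print(choose_processing("00000nam a2200000 a 4500"))
```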

Here is some test output from a template that processes 245 based on ISBD punctuation instead of by subfield.

There are some complications:

  1. As Adam has pointed out, in RDA punctuation is transcribed as found.

Yes, in RDA we transcribe punctuation as found, so that will include titles with colons and slashes. As a made-up example: A guidebook to Uluru / Ayers Rock / $c by John Smith. In AACR2 we were instructed to close up spaces around characters that were also ISBD punctuation, but in RDA that isn't the case.

This may result in errors. In the example Adam provided, this would result in "Uluru" as a title proper and "Ayers Rock" as a statement of responsibility separate from "by John Smith".

  2. As we discussed in last week's meeting, there is no simple, fool-proof way to remove ending periods that were added while retaining those that are a part of the title statement. We need to decide whether it is better to keep them or remove them. If we remove them, we need to decide if we are removing periods at the end of all subfields or only the final ending period in the field.

  3. When ISBD punctuation is provided, are all subfields prior to the first ISBD punctuation mark part of the title proper? For example: F245 03 $a Le Bureau $h [filmstrip] = $b La Oficina = Das Büro. Does this indicate that [filmstrip] is a part of the title proper? Or is this separated out? If so, is that indicated by ISBD punctuation or only by the subfield?

There is no solution that does not require human eyes on these titles at some point, which we have previously noted in the Decisions Index.

AdamSchiff commented 1 week ago

  1. I am strongly opposed to any transformation that would treat "Ayers Rock" as a statement of responsibility. The subfield $c is the marker that indicates the statement of responsibility in this situation.
  2. I would retain periods at the end. Probably also at the end of all subfields, since the last word in a subfield could also be an abbreviation.
  3. The GMD in $h is not part of the title proper. It's not part of any title information. It's a cataloger interpolation to indicate the content/format of a resource and doesn't reflect either what appears on a resource or a cataloger title addition. In the MARC Bibliographic to RDA Mapping in the original Toolkit, 245 $h doesn't map to any RDA property.

cspayne commented 1 week ago

  1. I am strongly opposed to any transformation that would treat "Ayers Rock" as a statement of responsibility. The subfield $c is the marker that indicates the statement of responsibility in this situation.

  2. I would retain periods at the end. Probably also at the end of all subfields, since the last word in a subfield could also be an abbreviation.

  3. The GMD in $h is not part of the title proper. It's not part of any title information. It's a cataloger interpolation to indicate the content/format of a resource and doesn't reflect either what appears on a resource or a cataloger title addition. In the MARC Bibliographic to RDA Mapping in the original Toolkit, 245 $h doesn't map to any RDA property.

In this case, there are only some subfields where we should be programmatically checking for ISBD punctuation to determine which property to map to.

Does this seem like a better solution? If not, are there other suggestions on how to handle ISBD punctuation to account for the majority of the 245 fields we will encounter?

AdamSchiff commented 1 week ago

There is internal punctuation in front of $n and $p and between $n and $p (or do you consider this ending punctuation?). Examples:

245 00 Journal of environmental science and health. ǂn Part A, ǂp Toxic/hazardous substances & environmental engineering.

245 00 Journal of macromolecular science. ǂp Physics.

245 10 Iliad. ǂn Book III / ǂc edited by A.M. Bowie.

I think it might be possible for internal punctuation to be found within $n and $p, if the main title does not have a parallel title, but the part number and/or title do. Not sure if I can find a real life example right now, but this is a made-up one:

245 10 Countries of the world. $p Brazil = Brasil / $c by Adam.

I'm not completely sure if the above is correct, or if "Brasil" could be preceded by a second $p. It could also be recorded as:

245 10 Countries of the world. $p Brazil = $b Countries of the world. $p Brasil / $c by Adam.

GordonDunsire commented 1 week ago

@cspayne:

  1. If processing is based on punctuation, the last occurrence of ISBD-specified punctuation should be used, not the first. This avoids the problem of embedded slashes in the title proper (@AdamSchiff's example) and outputs better values when the source has errors or unusual use of embedded punctuation. The test example:
 <rdf:Description rdf:about="http://fakeIRI2.edu/245-test16man">
      <rdf:type rdf:resource="http://rdaregistry.info/Elements/c/C10007"/>
      <rdamo:P30139 rdf:resource="http://fakeIRI2.edu/245-test16exp"/><!--rdamo:P30139 = has expression manifested-->
      <fake:marcfield>F245 00 $a Project directory / / $c TDC = Répertoire des projets / CDT.</fake:marcfield>
      <!--Project directory / / TDC = Répertoire des projets / CDT.-->
      <rdamd:P30156>Project directory</rdamd:P30156><!--rdamd:P30156 = has title proper-->
      <rdamd:P30105>/ TDC</rdamd:P30105><!--rdamd:P30105 = has statement of responsibility relating to title proper-->
      <rdamd:P30156>Répertoire des projets</rdamd:P30156><!--rdamd:P30156 = has title proper-->
      <rdamd:P30105>CDT.</rdamd:P30105><!--rdamd:P30105 = has statement of responsibility relating to title proper-->
   </rdf:Description>

is better transformed as:

 <rdf:Description rdf:about="http://fakeIRI2.edu/245-test16man">
      <rdf:type rdf:resource="http://rdaregistry.info/Elements/c/C10007"/>
      <rdamo:P30139 rdf:resource="http://fakeIRI2.edu/245-test16exp"/><!--rdamo:P30139 = has expression manifested-->
      <fake:marcfield>F245 00 $a Project directory / / $c TDC = Répertoire des projets / CDT.</fake:marcfield>
      <!--Project directory / / TDC = Répertoire des projets / CDT.-->
      <rdamd:P30156>Project directory /</rdamd:P30156><!--rdamd:P30156 = has title proper-->
      <rdamd:P30105>TDC</rdamd:P30105><!--rdamd:P30105 = has statement of responsibility relating to title proper-->
      <rdamd:P30156>Répertoire des projets</rdamd:P30156><!--rdamd:P30156 = has title proper-->
      <rdamd:P30105>CDT.</rdamd:P30105><!--rdamd:P30105 = has statement of responsibility relating to title proper-->
   </rdf:Description>

The double slash appears to be an error.
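
The "last occurrence" rule can be illustrated with a short Python sketch. It is not the project's template: the function name `split_statement` is hypothetical, and it assumes that subfield codes have already been stripped, that every " = " separates parallel segments, and that only the last " / " in a segment introduces the statement of responsibility.

```python
from typing import List, Optional, Tuple


def split_statement(statement: str) -> List[Tuple[str, Optional[str]]]:
    """Split a concatenated 245 statement into (title, responsibility) pairs.

    Each parallel segment is cut at its LAST ' / ', so slashes embedded
    earlier in the segment stay with the title.
    """
    pairs: List[Tuple[str, Optional[str]]] = []
    for segment in statement.split(" = "):
        title, slash, responsibility = segment.rpartition(" / ")
        if slash:
            pairs.append((title.strip(), responsibility.strip()))
        else:
            # No ' / ' found: the whole segment is treated as a title.
            pairs.append((segment.strip(), None))
    return pairs


for pair in split_statement("Project directory / / TDC = Répertoire des projets / CDT."):
    print(pair)
```

Cutting at the last slash keeps "Ayers Rock" with the title in the made-up Uluru example, but it would still misplace text when a statement of responsibility itself contains " / ", so it reduces errors rather than eliminating them.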

  2. Retaining end-stops (periods) is causing all sorts of problems; cf. de-duplication of nomens with and without end-stops. I still think the default should be to strip the end-stop when it is the last punctuation mark in the output value (after concatenation and other processing). This will result in errors for initialisms and abbreviations if they are the last 'word' of the value.

We can reduce the error rate by pattern-matching '.A.' (a period, a capital letter, and a period) for the last three characters and retaining the end-stop; this works for initialisms such as 'U.S.A.' and 'U.S.S.R.'

We can further reduce the error rate with a look-up table for common abbreviations such as 'Colo.', 'Jr.', 'ms.' etc. The table can be derived from published sources (abbreviations of states, etc.) and occurrences 'known' to be common in MARC 21 cataloguing.
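
A minimal Python sketch of this end-stop handling follows. It is an illustration under stated assumptions, not the project's code: the name `strip_end_stop` is hypothetical, the look-up table holds only the three abbreviations mentioned above, and leaving a trailing mark of omission ('...') untouched is an added assumption.

```python
import re

# Hypothetical look-up table; a real one would be derived from published sources.
KNOWN_ABBREVIATIONS = {"Colo.", "Jr.", "ms."}

# Period, capital letter, period as the last three characters (e.g. 'U.S.A.').
INITIALISM_ENDING = re.compile(r"\.[A-Z]\.$")


def strip_end_stop(value: str) -> str:
    """Strip a final period unless it looks like part of the value."""
    if not value.endswith(".") or value.endswith("..."):
        return value  # assumption: the mark of omission is left alone
    if INITIALISM_ENDING.search(value):
        return value
    words = value.split()
    if words and words[-1] in KNOWN_ABBREVIATIONS:
        return value
    return value[:-1]


for sample in ["Claude Debussy.", "History of the U.S.S.R.", "John Smith, Jr."]:
    print(strip_end_stop(sample))
```

Abbreviations missing from the table, and initialisms written without internal periods, would still lose a final period that belongs to them, which is the residual error rate described above.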

  3. Subfield $h should be extracted before any other processing takes place. I recall we had a discussion on what to map it to, because the old AACR2 general material designation is a mix of RDA media/carrier type (e.g. 'microform') and RDA content type (e.g. 'sound recording') which is an expression element. My view is that it should map to rdamd:P30335 'has category of manifestation', with the enclosing brackets removed. No note is required.
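
Extracting subfield $h first could look like the following Python sketch. It assumes the field is held as a single string in the same layout as the test data (subfield codes introduced by "$") and that the designation is always enclosed in square brackets; the function name `extract_gmd` is hypothetical.

```python
import re
from typing import Optional, Tuple

# '$h', optional spaces, a bracketed designation, then any trailing spaces.
GMD_PATTERN = re.compile(r"\$h\s*\[([^\]]*)\]\s*")


def extract_gmd(field: str) -> Tuple[Optional[str], str]:
    """Return (designation without brackets, field with $h removed).

    The designation is the candidate value for rdamd:P30335
    ('has category of manifestation'); the remainder goes on to the
    title processing.
    """
    match = GMD_PATTERN.search(field)
    if match is None:
        return None, field
    remainder = field[:match.start()] + field[match.end():]
    return match.group(1).strip(), remainder


gmd, rest = extract_gmd("F245 03 $a Le Bureau $h [filmstrip] = $b La Oficina = Das Büro.")
print(gmd)
print(rest)
```

Removing $h before punctuation-based splitting means the " = " that followed the designation ends up directly after the title, so the parallel titles are still recognised.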

Another apparent error in the test data:

<rdf:Description rdf:about="http://fakeIRI2.edu/245-test8man">
      <rdf:type rdf:resource="http://rdaregistry.info/Elements/c/C10007"/>
      <rdamo:P30139 rdf:resource="http://fakeIRI2.edu/245-test8exp"/><!--rdamo:P30139 = has expression manifested-->
      <fake:marcfield>F245 10 $a ETZ : $b ...</fake:marcfield>
      <!--ETZ : ...-->
      <rdamd:P30156>ETZ</rdamd:P30156><!--rdamd:P30156 = has title proper-->
      <rdamd:P30142>...</rdamd:P30142><!--rdamd:P30142 = has other title information-->
   </rdf:Description>

The triple-period in subfield $b is the mark of omission, and should have been included in subfield $a. The transform is operating correctly; the error is in the source data.

cspayne commented 1 week ago

@AdamSchiff @GordonDunsire Thank you for your continued input and review of the test data, it has been immensely helpful. I have done another updated test output based on your notes and have updated the Google Sheet as well.

All other errors pointed out previously should be corrected, unless I have missed it :)

cspayne commented 1 week ago

Wrong link, this is the updated 245 test output

GordonDunsire commented 6 days ago

@cspayne: I think subfield $k should map to 'has category of manifestation'.

Also, I see no reason to retain brackets around supplied values, even when partial. The cataloguer is not saying that the brackets are part of the title, etc., but indicating where they have interpolated external information. This is a form of data provenance which is accommodated by the note, and should not be embedded in the title, etc. If there is still disagreement, we should briefly discuss at a meeting.

Otherwise, the first few records are looking good; I'm still working my way through the rest.

I see that the retained end-stop/period is looking badder and badder ;-)

GordonDunsire commented 6 days ago

This one is a problem:

<rdf:Description rdf:about="http://fakeIRI2.edu/245-test14man">
      <rdf:type rdf:resource="http://rdaregistry.info/Elements/c/C10007"/>
      <rdamo:P30139 rdf:resource="http://fakeIRI2.edu/245-test14exp"/><!--rdamo:P30139 = has expression manifested-->
      <fake:marcfield>F245 00 $a Love from Joy : $b letters from a farmer’s wife. $n Part III, $p 1987-1995, At the bungalow.</fake:marcfield>
      <rdamd:P30156>Love from Joy</rdamd:P30156><!--rdamd:P30156 = has title proper-->
      <rdamd:P30142>letters from a farmer’s wife. Part III, 1987-1995, At the bungalow.</rdamd:P30142><!--rdamd:P30142 = has other title information-->
</rdf:Description>

Subfields $n and $p should be attached to $a as 'title proper'. I think this is indicated by the end-stop of subfield $b, which shows that it is a subtitle; this is correctly mapped to 'other title information', but should not be concatenated with the following subfields $n and $p. Can @AdamSchiff or a.n. other confirm?

GordonDunsire commented 6 days ago

Oops, I see that @AdamSchiff has already said that subfields $n and $p are NOT part of the title proper. Can we confirm this?

SitaKB commented 6 days ago

I also think that $n and $p are not part of title proper. This is a multipart publication. $n and $p have the title and information of the part.

AdamSchiff commented 6 days ago

I think in this instance the $n and $p modify the other title information, not the title proper. The $b is not a modifier of Part III. It is Part III of the $b.

Adam

Adam L. Schiff, Principal Cataloger, University of Washington Libraries, (206) 543-8409

