Closed gerontakos closed 1 year ago
Source MARC data is now placed alongside corresponding RDA data. You can search for 'Fxxx' ('xxx' for MARC field tags, F245, F500, etc.) to get to the part you'd like to review. An index to the fields represented in the 2 datasets:
F020 F043 F245 F264 F306 F336 F337 F338 F340 F380 F382 F490 F500 F502 F504 F561 F880
Example: searching for 'F336' content type
Here are my initial thoughts on dataset 1. Overall: looking good.
More specific comments, in order of appearance in the file:
<fake:rdawP10065>880-01 ʻAbd al-Razzāq, Zaynab, author.</fake:rdawP10065>
Is this a temporary transform artefact?
<rdamd:P30004>(ISBN) 9789777952569 (paperback)</rdamd:P30004><!--rdamd:P30004 = has identifier for manifestation-->
This is a result of applying this option. But the ISBN Registry is a recognized VES, so the preceding option is better. This would result in "9789777952569".
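As a rough illustration of the value suggested above, the following Python sketch pulls the bare ISBN out of the string currently emitted. The function name `isbn_from_identifier` is hypothetical and is not taken from the transform code; the sketch assumes an unhyphenated ISBN-10 or ISBN-13.

```python
import re
from typing import Optional


def isbn_from_identifier(value: str) -> Optional[str]:
    """Return the bare ISBN found in a string such as
    '(ISBN) 9789777952569 (paperback)', or None if none is found.

    Assumption: the ISBN is unhyphenated (13 digits, or 9 digits
    followed by a digit or X for ISBN-10).
    """
    match = re.search(r"\b(\d{13}|\d{9}[\dXx])\b", value)
    return match.group(1) if match else None


print(isbn_from_identifier("(ISBN) 9789777952569 (paperback)"))
```

Qualifiers such as "(paperback)" are simply dropped here; whether they should be recorded elsewhere is a separate mapping decision.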
<rdamd:P30134>Iḥsān ʻAbd al-Qaddūs</rdamd:P30134><!--rdamd:P30134 = has title of manifestation-->
This is not necessary because it can be entailed from the next statement:
<rdamd:P30156>Iḥsān ʻAbd al-Qaddūs</rdamd:P30156><!--rdamd:P30156 = has title proper-->
<rdamd:P30111>$a al-Qāhirah : $b al-Dār al-Miṣrīyah al-Lubnānīyah, $c 2020.</rdamd:P30111><!--rdamd:P30111 = has publication statement-->
It is not necessary to include the M21 encoding; value should be: "al-Qāhirah : al-Dār al-Miṣrīyah al-Lubnānīyah, 2020"
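A minimal Python sketch of the clean-up described above, for illustration only. The helper name `strip_subfield_codes` is hypothetical. Dropping the final full stop is an assumption based on the expected value quoted above, and the pattern would also remove a literal `$` followed by a letter or digit in the data, so it is not safe for every field.

```python
import re


def strip_subfield_codes(field: str) -> str:
    """Remove MARC 21 subfield codes ('$a ', '$b ', ...) from a field string."""
    # Drop each '$' + one-character code + the whitespace that follows it.
    text = re.sub(r"\$[a-z0-9]\s+", "", field)
    # Collapse any runs of whitespace left behind.
    text = " ".join(text.split())
    # Assumption: the terminal full stop is M21 punctuation, not data.
    if text.endswith("."):
        text = text[:-1]
    return text


print(strip_subfield_codes(
    "$a al-Qāhirah : $b al-Dār al-Miṣrīyah al-Lubnānīyah, $c 2020."
))
```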
<rdamd:P30002>n</rdamd:P30002><!--rdamd:P30002 = has media type-->
Notation from id.loc.gov vocabulary: why? The data is already in the M21 record, and the code is meaningless to an end-user. If the display is to be suppressed, how does the system know which value to suppress? Why not publish mappings between the RDA and M21 vocabularies? (Ask RSC Tech WG to do it, or send them a draft).
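To make the idea of a published mapping concrete, here is a small Python sketch of a lookup from M21 media type codes to RDA media type labels. The table and the function name `media_type_label` are illustrative assumptions, not a published mapping; the values should be checked against the official vocabularies before use, and RDA IRIs are deliberately left out.

```python
from typing import Optional

# Illustrative mapping from MARC 21 media type codes (as used in 337 $b)
# to RDA media type labels. Verify against the published vocabularies.
MARC_MEDIA_TYPE_TO_RDA_LABEL = {
    "s": "audio",
    "c": "computer",
    "h": "microform",
    "p": "microscopic",
    "g": "projected",
    "e": "stereographic",
    "n": "unmediated",
    "v": "video",
    "x": "other",
    "z": "unspecified",
}


def media_type_label(code: str) -> Optional[str]:
    """Return the RDA label for a MARC media type code, or None if unknown."""
    return MARC_MEDIA_TYPE_TO_RDA_LABEL.get(code.strip().lower())


print(media_type_label("n"))
```

With a table like this, the transform could emit a label an end-user can read instead of the bare code "n".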
Supplementary: why not record the RDA IRI for the RDA media type in place of, or in addition to, the preferred label:
`
Gordon wrote:
<fake:marcfield>F245 10 $a Developing digital project delivery routines around frequent disruptions : $n numer 8 $c Hamid Abdirad.</fake:marcfield>
<rdawd:P10088>Developing digital project delivery routines around frequent disruptions</rdawd:P10088>
$n should be part of the title:
<rdawd:P10088>Developing digital project delivery routines around frequent disruptions, numer 8</rdawd:P10088>
($p is also part of the title when present, and comes after $n data if present.)
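The rule above (title = $a, then $n, then $p when present) could be sketched in Python as follows. The function names and the `separator` parameter are hypothetical; the default separator follows the example value above, and stripping trailing ISBD punctuation from each subfield is an assumption that may need refining, for example for values ending in an abbreviation.

```python
import re
from typing import List, Tuple


def parse_subfields(field: str) -> List[Tuple[str, str]]:
    """Split a string like '$a Title : $n Part 1 $c Name.' into (code, value) pairs."""
    parts = re.split(r"\$([a-z0-9])\s+", field)
    # parts[0] is whatever precedes the first subfield; codes and values alternate after it.
    return [(parts[i], parts[i + 1].strip()) for i in range(1, len(parts) - 1, 2)]


def title_from_245(field: str, separator: str = ", ") -> str:
    """Join $a, $n and $p in order of appearance; ignore other subfields."""
    pieces = []
    for code, value in parse_subfields(field):
        if code in ("a", "n", "p"):
            # Assumption: trailing ISBD punctuation is not part of the title.
            pieces.append(value.rstrip(" :/,.;="))
    return separator.join(pieces)


print(title_from_245(
    "$a Developing digital project delivery routines around frequent disruptions : "
    "$n numer 8 $c Hamid Abdirad."
))
```

Passing a different `separator` (for example `". "`) changes only the punctuation between the parts.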
I don't disagree with Gordon. As shown, the punctuation is not correct and should be Developing digital project delivery routines around frequent disruptions. $n Numer 8
BUT: where is Numer 8 coming from? I just looked at the OCLC record for this thesis and there is no such data:
100 1  Abdirad, Hamid, ǂe author. ǂ1 http://www.wikidata.org/entity/Q101242292
245 10 Developing digital project delivery routines around frequent disruptions : ǂb how do AEC organizations respond to disruptive information exchange requirements? / ǂc Hamid Abdirad.
264  1 [Seattle] : ǂb [University of Washington Libraries], ǂc [2020]
264  4 ǂc ©2020
It appears that the record has been corrupted somehow in your transformation. "numer 8" is not present at all in the OCLC record that we created. If this is corrupted, then there is a good chance that other data is as well.
Here are my comments on Dataset 2:
Is this correctly encoded in MARC? The non-repeat of "Concertino" is a problem.
I think we should strip off the brackets. They indicate that the data was taken from outside of the manifestation being described, which is provenance that is no longer relevant to retain.
I think 264 2nd ind = 4 should transform to rdamd:P30007 "has copyright date". The copyright symbol, the phonogram symbol, the string "(c)", the string "(p)", the string "copyright", the string "phonogram copyright", the letter "c", or the letter "p" should be stripped from the value:
<rdamd:P30007>2017</rdamd:P30007><!--rdamd:P30007 = has copyright date-->
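The stripping rule listed above could look roughly like this in Python. This is a sketch under stated assumptions, not the transform's actual code: the function name `copyright_date` is hypothetical, a bare "c" or "p" is only removed when a digit follows, and trailing spaces and full stops are dropped.

```python
import re

# Leading markers to strip, longest alternatives first. A bare 'c' or 'p'
# is only treated as a marker when a digit follows (assumption).
_COPYRIGHT_PREFIX = re.compile(
    r"^\s*(?:(?:phonogram\s+copyright|copyright|\(c\)|\(p\)|©|℗|[cp](?=\s*\d))\s*)+",
    re.IGNORECASE,
)


def copyright_date(value: str) -> str:
    """Return the date from a 264 (second indicator 4) $c value, without markers."""
    cleaned = _COPYRIGHT_PREFIX.sub("", value.strip())
    return cleaned.rstrip(" .")


for sample in ["©2017", "c2017", "(p) 2017", "phonogram copyright 2017", "℗2017."]:
    print(sample, "->", copyright_date(sample))
```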
<rdaed:P20215>For trumpet; piano [includes percussion staff]. Total performers: 2.</rdaed:P20215><!--rdaed:P20215 = has medium of performance of musical content-->
I think parentheses should be used in place of brackets. Brackets have a specific, albeit legacy, meaning and we should avoid potential confusion if we can.
5.<fake:marcfield>F264 #2 $a [Milwaukee, Wisconsin] : $b distributed in North and South America exclusively by Hal Leonard.</fake:marcfield>
`
GitHub Markdown information
To use the `#` or `@` character without inserting a link to an issue or user, respectively, these can be escaped with a preceding backslash (`\`). Enclosing in backticks (``) as I've done here works too.
2023-05-03 discussed until #7 in Gordon's comment
Hello all, a dataset has been produced for the group to review. It is in GitHub at MARC2RDA/Working Documents/transformationCode/outputDataForReview. There is the input MARC dataset, from which an RDF dataset was derived and saved in two versions, one without labels and another with labels. The one with labels is probably easier to review; that's dataset-1-withLabels.rdf. You can review it before the upcoming meeting, or we can review it together at our Wednesday morning meeting on April 26. Please remember that only a few fields have been sent to output. We thought it would be good to review some data before we code more fields, to make sure we're on the right track.