uwlib-cams / MARC2RDA

mapping between MARC21 and RDA-RDF
Creative Commons Zero v1.0 Universal

700 #15

Open CECSpecialistI opened 2 years ago

CECSpecialistI commented 2 years ago

Map 7XX spreadsheet

CECSpecialistI commented 2 years ago

I will start working on the 700 field now.

CECSpecialistI commented 2 years ago

@AdamSchiff the mapping from MARC 700** $0 to rdaw:P10002 isn't making sense to me. Wouldn't this always be an identifier for a related work/added entry, rather than for the work being described in the MARC record? I ended up using this property in the transformation notes for other properties such as "part of work", but I don't think it can stand alone in a mapping for this field. Do you see something I don't?

AdamSchiff commented 2 years ago

There is no RDA property for identifier of related work. The only identifier properties under work are P10002, P10079, P10366, and P10400:


$0 of 700** could refer either to an identifier for an agent or a related work.

Adam

Adam L. Schiff Principal Cataloger University of Washington Libraries (206) 543-8409 @.***


From: Crystal Clements @.> Sent: Tuesday, November 30, 2021 4:02 PM To: uwlib-cams/MARC2RDA @.> Cc: Adam L Schiff @.>; Mention @.> Subject: Re: [uwlib-cams/MARC2RDA] 700 (Issue #15)


CECSpecialistI commented 2 years ago

Right, I'm thinking for P10002 to work it has to look like this:

[Work described in MARC data]-->has related work of work (rdaw:P10198 or subproperty)-->[That other work]
[That other work]-->has identifier for work (P10002)-->"value of $0"

CECSpecialistI commented 2 years ago

but then figuring out what's a work and what's an agent is hard too. Will all the works have a $t?

AdamSchiff commented 2 years ago

Hmm, I'm not always sure we are talking about work to work relationships. Consider:

100 1  Shakespeare
240 10 Hamlet. $l French $s (Clements) $0 ....
700 1  $i Translation of: $a Shakespeare. $t Hamlet. $0 ....

In this MARC record, we are describing a manifestation of a French expression of a work and relating it to the work.

Or we could do the same in an authority:

010    ....
024    .....
100 1  Shakespeare. $t Hamlet. $l French $s (Clements)
500 1  $w r $i Translation of: $a Shakespeare. $t Hamlet

Here again we are relating an expression to a work, not a work to a work.

Adam


AdamSchiff commented 2 years ago

All works in a 700 MUST have a $t. Without $t they are agents.
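A minimal sketch of this rule in Python, assuming the 700 field arrives as a single string with `$x` subfield delimiters. The function names are hypothetical, and the check only tests for the presence of $t; it does not tell a work from an expression.

```python
import re


def parse_subfields(field_text):
    """Split a MARC field written with `$x` delimiters into (code, value) pairs.

    Assumption: subfield values never contain a literal `$`.
    """
    pairs = re.findall(r"\$(\w)\s*([^$]*)", field_text)
    return [(code, value.strip()) for code, value in pairs]


def classify_700(field_text):
    """Return 'work_or_expression' when a $t is present, otherwise 'agent'."""
    codes = {code for code, _ in parse_subfields(field_text)}
    return "work_or_expression" if "t" in codes else "agent"


print(classify_700("$i Translation of: $a Shakespeare. $t Hamlet."))  # work_or_expression
print(classify_700("$a Kushner, Tony, $e author"))                    # agent
```

This only encodes the stated convention; records that do not follow it would be misclassified.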


CECSpecialistI commented 2 years ago

I didn't even think of that. So there's really no way to map the 700** $0 to has identifier for work because it could always be an expression as well. We need a value in the $i to distinguish works from expressions?

AdamSchiff commented 2 years ago

Yes, and/or $4.

Adam
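One way the $i test could be sketched in Python, assuming the relationship designator carries a parenthetical qualifier such as "(work)" or "(expression)". The function name is hypothetical. Designators without a qualifier, and $4 values, would need a lookup table that is not shown here.

```python
import re

# Matches a trailing qualifier such as "(work):" at the end of a $i value.
QUALIFIER = re.compile(r"\((work|expression|manifestation|item)\)\s*:?\s*$", re.IGNORECASE)


def entity_type_from_designator(subfield_i):
    """Return the qualifier found in a $i designator, or None if there is none."""
    match = QUALIFIER.search(subfield_i.strip())
    return match.group(1).lower() if match else None


print(entity_type_from_designator("Motion picture adaptation of (work):"))  # work
print(entity_type_from_designator("Translation of:"))                       # None
```

A result of None means the designator alone does not settle the question, not that the field refers to an agent.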


gerontakos commented 2 years ago

The more I think about this…

I’m thinking the $0 is the identifier for an authority record. I do hope IRIs for Works are not entered in the $0, they should be in $1. A Work is an RWO; an authority record can be a Work in itself (it is in fact an RWO), but the identifier represents the record, not the Work the Authority-record-as-a-Work describes. However, in usual practice, authority records are not classed as Works but, in my experience, as something like madsrdf:Authority; I haven’t seen them classed as rdac:Work or bf:Work, so referring to authority-record-as-a-Work is not helpful! Sorry...

If this is true – that $0 can only be text strings (Nomens) or IRIs identifying authority records – I don’t think you can map to a property that has a range of rdac:Work and assign a value that is taken from any $0. I don’t think you can even take the identifier as text (Nomen) and make it the value of P10002 (identifier for work) because it is not the identifier for a related work but for an authority record. The only piece of a 700 that can be an “identifier for the work” would be the actual string in $t itself. That is the appellation for the work, not the value in $0. Right?

In addition, I believe we have to account for the cases where (1) the value of $0 is text, versus (2) the value of $0 is an IRI. In addition to that, I believe the $0 was created in 2007? In 2009, I believe, the subfield was redefined. How do we map $0 values before the redefinition?

Finally, when there is a 700 $t, I’ve seen $0 values that point only to the name and not to the authority record for the work referenced in $t. So the presence of a $t does not necessarily mean a related work or expression is being referenced. How can we tell? Maybe only the $i and $4 can reveal that a work is being referenced? I'm not sure, but $t in itself does not seem sufficient.

AdamSchiff commented 2 years ago

So much to unpack!

Yes, I agree with you that $0 is an identifier for an authority record.

And yes, $0 can hold a string or a URI.

Examples:

700 1# $i Motion picture adaptation of (work): $a Kushner, Tony. $t Angels in America $0 (DLC)no2017028402

700 1# $i Motion picture adaptation of (work): $a Kushner, Tony. $t Angels in America $0 http://id.loc.gov/authorities/names/no2017028402

700 1# $i Motion picture adaptation of (work): $a Kushner, Tony. $t Angels in America $0 (DLC)no2017028402 $0 http://id.loc.gov/authorities/names/no2017028402

All of the above are correct.

It has been done, but it is not correct, to use $0 to identify just the $a portion of a work/expression access point. The PCC best practices (https://www.loc.gov/aba/pcc/taskgroup/linked-data-best-practices-final-report.pdf) explicitly say not to do this. Page 6:

  1. Each MARC field for an access point refers to one object.

  2. URIs are not given for portions of access points. Subfields implying an object other than that specified by the field as a whole (e.g., a name within a name-title access point) should not be given a URI within the same field. While systems may be able to parse out elements of a string internally and associate them with distinct URIs, the MARC format cannot itself convey discrete associations within the same access point. Only URIs corresponding to the full access point should be communicated when MARC data is exported.

In the example below, $0 must refer to the work, not to the composer.

700 1# $a Beethoven, Ludwig van, $d 1770-1827. $t Veränderungen über einen Walzer $0 http://id.loc.gov/authorities/names/n81127885

In the example below, $0 must refer to the entire subject heading string, not to one or more partial components.

650 #0 $a Gardening $x Equipment and supplies $x Marketing $0 http://id.loc.gov/authorities/subjects/sh85053091

"How do we interpret $0 values before the redefinition?"

Before the redefinition, the only valid values were of the kind (DLC)no2017028402, that is, a parenthetical code for the organization assigning the identifier followed by the alphanumeric string of the identifier.
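The two shapes of $0 value could be told apart with something like the following Python sketch. The function name and the returned dictionary keys are assumptions made for illustration, not part of any agreed mapping.

```python
import re

# Legacy form: parenthetical organization code followed by the identifier string.
LEGACY_PATTERN = re.compile(r"^\((?P<org>[^)]+)\)\s*(?P<ident>\S+)$")


def classify_subfield_0(value):
    """Classify a $0 value as an IRI, a legacy control number, or unknown."""
    value = value.strip()
    if re.match(r"^https?://", value, re.IGNORECASE):
        return {"kind": "iri", "iri": value}
    match = LEGACY_PATTERN.match(value)
    if match:
        return {"kind": "control_number",
                "org": match.group("org"),
                "ident": match.group("ident")}
    return {"kind": "unknown", "raw": value}


print(classify_subfield_0("(DLC)no2017028402"))
print(classify_subfield_0("http://id.loc.gov/authorities/names/no2017028402"))
```

Values that match neither shape are returned as "unknown" rather than guessed at.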

"The only piece of a 700 that can be an “identifier for the work” would be the actual string in $t itself. That is the appellation for the work, not the value in $0. Right?"

Hmmm, the $t (or $t, $n, and/or $p) represents only the "preferred title of work". I don't think $0 identifies the preferred title, it identifies the work represented by the complete access point for the work.

P.S. Is it ok to reply via email to posts in Github?

Adam


gerontakos commented 2 years ago

OK that makes sense, but I still have some vague areas.

I wasn't trying to figure out the 700 field (Crystal's assignment) but, rather, to establish some premises for treating $0.

Some premises for going forward with the alignment:

  1. $0 IRIs or text strings can never be used to represent an rdac:Work or rdac:Expression -- unless the referenced authority is typed as an rdac:Work or rdac:Expression. This means we cannot map the value of the $0 to the value of an RDA/LRM/RDF property with a range that expects either an rdac:Work or rdac:Expression. Most authorities we use are typed as follows:
     a. madsrdf:Authority
     b. madsrdf:NameTitle
     c. skos:Concept

  2. Choose one:
     a. $0 identifiers represent the full object described by the given field and all of its subfields, not just an object represented in one of the single subfields.
     b. $0 cannot be assumed to be the object represented by all the subfields or an object represented by a single subfield because actual practice is not consistent enough to make the determination.

  3. 700 $0 (both IRIs and text) will not be useful in the MARC-to-RDA alignment (unless there is $1) when the 700 field represents an authority that describes a related work or expression:
     a. It represents an authority record for a related work -- but we do not know the IRI of the related work and therefore cannot make the statement.
     b. We cannot use the $0 value as a surrogate for the related Work or Expression because it represents an authority record.

  4. A related work or expression represented in a 700 field can be referenced in RDA/LRM/RDF by constructing an access point (an RDF Literal) using the pertinent subfields. So either:
     a. We can create RDA/LRM/RDF triples from the 700 referencing a related work or expression if the target property's range allows a literal value (entering the literal as the direct value of the target property).
     b. We can create RDA/LRM/RDF triples from the 700 referencing a related work or expression if the target property's range expects an rdac:Nomen (creating a node for the rdac:Nomen with the constructed access point the value of rdan:nomenString).
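Premise 4 could be sketched in Python roughly as follows. Several things here are assumptions for illustration only: the set of subfields excluded from the access point, the `ex:relatedWork` placeholder property, the hash-based nomen IRI, and the use of `rdand:P80068` for the nomen string. Triples are plain tuples of strings rather than objects from an RDF library.

```python
import hashlib
import re

# Assumption: subfields that carry relationship, control, or linking data
# rather than text belonging to the access point itself.
NON_ACCESS_POINT_CODES = {"i", "e", "4", "0", "1", "2", "5", "6", "8"}


def parse_subfields(field_text):
    """Split a field written with `$x` delimiters into (code, value) pairs."""
    return [(c, v.strip()) for c, v in re.findall(r"\$(\w)\s*([^$]*)", field_text)]


def access_point(field_text):
    """Join the remaining subfield values into one access point string."""
    return " ".join(v for c, v in parse_subfields(field_text)
                    if c not in NON_ACCESS_POINT_CODES)


def related_entity_triples(subject, prop, field_text, as_nomen=False):
    """Option 4a (literal value) or option 4b (minted nomen node)."""
    # Note: quotation marks inside the access point are not escaped here.
    literal = '"' + access_point(field_text) + '"'
    if not as_nomen:
        return [(subject, prop, literal)]
    digest = hashlib.sha1(literal.encode("utf-8")).hexdigest()[:12]
    nomen = "<transform/nomen/" + digest + ">"  # hypothetical minting scheme
    return [(subject, prop, nomen), (nomen, "rdand:P80068", literal)]


field = ("$i Motion picture adaptation of (work): $a Kushner, Tony. "
         "$t Angels in America $0 (DLC)no2017028402")
for triple in related_entity_triples("<transform/work/1>", "ex:relatedWork",
                                     field, as_nomen=True):
    print(triple)
```

Which option applies depends on the range of the target property, so the caller chooses `as_nomen` per property.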

I think that incorporates most of the considerations. I could attempt to manually transform the examples if that would be helpful.

--Theo

CECSpecialistI commented 2 years ago

Discussion around $1's and $0's beyond the context of the 700 field can be found here: https://github.com/uwlib-cams/MARC2RDA/issues/327

pennylenger commented 1 month ago

@tmqdeborah Hi, Deborah, I am trying to put the mapping into our Google Sheet, and I have a question about $0 and $1.

Is $1 and $0 treated the same in 7XX as in 6XX?

If there is a $1, we use it as object and attach heading as AP or AAP if there is a source.

If there is a $0, we convert it to a rwo uri based on some patterns if available, then use it as object and attach heading as AP or AAP if there is a source.

If there is no $0 or $1, we mint a uri, then use it as object and attach heading as AP or AAP if there is a source.

Do we use any $1 or $0, or do we approve some sources?

tmqdeborah commented 1 month ago

@tmqdeborah Hi, Deborah, I am trying to put the mapping into our Google Sheet. And I have a question about $0 and $1 Is $1 and $0 treated the same in 7XX as in 6XX.

I'm sorry but I cannot help with this. I have not been following the $0 and $1 discussions and decisions.

It would be great if the decisions could be pulled together in a table for each field, but I'm not going to be able to tackle that project. Must get Aggregate markers and Access Points done.

GordonDunsire commented 1 month ago

@pennylenger:

Is $1 and $0 treated the same in 7XX as in 6XX: Yes

If there is a $1, we use it as object and attach heading as AP or AAP if there is a source: Yes (if $1 is from an approved source). Use AAP for the fields that @AdamSchiff (?) identified as requiring an LC heading (LCNAF/LCSH) and the source is an LC source; otherwise use AP.

If there is a $0, we convert it to a rwo uri based on some patterns if available, then use it as object and attach heading as AP or AAP if there is a source: Yes, if source is approved.

If there is no $0 or $1, we mint a uri, then use it as object and attach heading as AP or AAP if there is a source: Yes. Use AAP if source and heading are LC; otherwise use AP.

Do we use any $1 or $0, or do we approve some sources?: We need to approve all sources (it's the wild west out there :-)
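The answers above can be read as a small decision procedure. The Python sketch below is one possible reading, not the project's transform. The approved-source list, the authority-to-RWO prefix rule, the `example.org` minting base, and the fallback to minting when a source is not approved are all assumptions made for illustration.

```python
import hashlib

# Hypothetical list of approved sources; the real list has to be agreed on.
APPROVED_SOURCES = ("http://id.loc.gov/",)

# Assumed authority-to-RWO rewrite pattern, for illustration only.
RWO_PATTERNS = {
    "http://id.loc.gov/authorities/names/": "http://id.loc.gov/rwo/agents/",
}


def is_approved(iri):
    """True when the IRI is present and starts with an approved prefix."""
    return iri is not None and iri.startswith(APPROVED_SOURCES)


def authority_to_rwo(iri):
    """Rewrite an authority IRI to an RWO IRI, or return None if no pattern fits."""
    for prefix, replacement in RWO_PATTERNS.items():
        if iri.startswith(prefix):
            return replacement + iri[len(prefix):]
    return None


def mint_iri(heading):
    """Mint a transform IRI from the heading (hypothetical scheme)."""
    digest = hashlib.sha1(heading.encode("utf-8")).hexdigest()[:12]
    return "http://example.org/transform/" + digest


def choose_entity_iri(subfield_1, subfield_0, heading):
    """Prefer an approved $1, then a convertible approved $0, else mint."""
    if is_approved(subfield_1):
        return subfield_1
    if is_approved(subfield_0):
        rwo = authority_to_rwo(subfield_0)
        if rwo is not None:
            return rwo
    return mint_iri(heading)


def access_point_kind(field_requires_lc_heading, source_is_lc):
    """AAP only when the field requires an LC heading and the source is LC."""
    return "AAP" if field_requires_lc_heading and source_is_lc else "AP"


# "n00000000" is a made-up identifier used only to show the rewrite.
print(choose_entity_iri(None, "http://id.loc.gov/authorities/names/n00000000", "Example heading"))
print(access_point_kind(True, False))
```

Keeping the source list and rewrite patterns as data makes it easier to record each approval decision per field.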

pennylenger commented 1 week ago

As Adam mentioned https://github.com/uwlib-cams/MARC2RDA/discussions/450#discussioncomment-10001198: There won’t be a $2 in access point and subject fields for agents constructed according to RDA (or AACR2 or AACR). The assumption in 1XX, 7XX, and 600-611 with second indicator 0 is that the agents are either in the LC/NACO authority file in the same form or that they are constructed in accordance with RDA/AACR2/AACR. This isn’t always true, but that’s the basic assumption.

And Gordon said above: If there is a $1, we use it as object and attach heading as AP or AAP if there is a source. Use AAP for the fields that @AdamSchiff identified as requiring an LC heading (LCNAF/LCSH) and the source is an LC source; otherwise use AP.

Question 1: for a 100 or 700 with no $2, since it is supposed to be in or constructed according to LCNAF, can we say it is AAP, not AP, because it has a source, even if we are not sure whether it is actually authorized?

Question 2: in subject headings, if there is a $1 for a field with no qualifier subfields, we add the RDA AAP statement to the subfield $1 IRI as subject, with the value as a string object, no nomen, because we already 'know' the source from the IRI. If we think the same way for 100 or 700, then if there is a URI in $1, whether or not there is a $2, we can attach the heading as AAP. Is it the same for 100 and 700? But the heading is constructed according to NAF, while the source of the URI may be another one, like Wikidata or ORCID. For example:

100 1#$aRiera, Jose Joaquin, $d1969- $eauthor $1https://orcid.org/0000-0002-8405-8386

AdamSchiff commented 1 week ago

  1. I think the assumption is that what is in 1XX 0 or 7XX 0 is an authorized access point, even if there's no authority authorizing it. The 1XX and 7XX are supposed to be for controlled access points, not for any kind of access point. A variant access point should not be put in a 1XX or 7XX.
  2. In the example, the $1 does not authorize the access point in $a/$d. The assumption is that this is authorized in the LC/NAF or is constructed according to AACR2 or RDA and associated documentation (like LCRIs for AACR2 and LC-PCC Policy Statements for RDA). The ORCID link presents the name as Jose Riera. It isn't an authority or controlled form of name.

GordonDunsire commented 1 week ago

For 2., to clarify: the value in subfield $1 is assumed to be the IRI of the entity referenced in the field. It does not matter what the result of de-referencing or linking to the IRI is; what matters is that the IRI is for an instance of an RDA or RDA-compatible entity. If that is the case, we don't have to mint a transform IRI.

In the example, I assume that de-referencing returns the following statement in a bunch of other statements:

<https://orcid.org/0000-0002-8405-8386> rdfs:label "Jose Riera" . # or skos:prefLabel or similar

But the transform can extract more information from the record:

<https://orcid.org/0000-0002-8405-8386> rdaad:P50411 "Riera, Jose Joaquin, 1969-" . # has authorized access point for person

Or, because we know (as @AdamSchiff says) that the scheme of the AAP is LCNAF:

<https://orcid.org/0000-0002-8405-8386> rdaao:P50411 <transform/nomen/blah> . # has authorized access point for person (object property, since the value is an IRI)
<transform/nomen/blah> rdand:P80068 "Riera, Jose Joaquin, 1969-" . # has nomen string
<transform/nomen/blah> rdano:P80069 <IRI for LCNAF> . # has scheme of nomen

Subfield $1 is assumed to be an IRI for a rwo, not an appellation of a rwo which should be encoded in subfield $0.

However, if the ORCID entity for Person covers fictitious or legendary persons, non-human agents, etc., we cannot use the subfield $1 IRI and the transform should mint its own. This covers cases such as

100 1#$aMouse, Mickey, $eauthor $1https://orcid.org/0000-0003-2255-2095

This ORCID ID, and three others for the same name 'Mickey Mouse', appear to be deprecated because they return no public information; that is, nothing is returned on de-referencing. The question, then, is does ORCID now confine its scope to real-world persons?

pennylenger commented 1 week ago

@AdamSchiff Thank you, Adam. For 1XX 0 or 7XX 0, do you mean the first indicator is 0? There is no second indicator 0.

AdamSchiff commented 1 week ago

I meant the first indicator.


pennylenger commented 1 week ago

100 First Indicator: Type of personal name entry element
0 - Forename
1 - Surname
3 - Family name

Why is 0 - Forename specified as AAP? How about 1 - Surname and 3 - Family name? Are they not?

pennylenger commented 1 week ago

This ORCID ID, and three others for the same name 'Mickey Mouse', appear to be deprecated because they return no public information; that is, nothing is returned on de-referencing. The question, then, is does ORCID now confine its scope to real-world persons?

I couldn't find the scope documentation, but ORCID is mainly for researchers, so I think they are a subtype of RDA Person. However, anyone can create a record here, and I can also create a record for Mickey Mouse, so I'm not sure.

CECSpecialistI commented 1 week ago

A lot of ORCID ID's don't return any public information due to privacy settings. It doesn't necessarily mean they are deprecated.

AdamSchiff commented 1 week ago

Yes, all of them would be AAPs if in a 100 or 700. In records where 040 $b = eng, the presumption is that the names are in the LC/NACO NAF or are formulated according to the rules in the 008 or 040 $e (generally, AACR2 or RDA). For a German library, the assumption is also that they are AAPs but would be found in the German authority file.

Adam


AdamSchiff commented 1 week ago

Since all new ORCIDs are self-assigned I think (a person has to fill out a form to get one), I do think that ORCIDs are for real persons. They could still be for a pseudonym though probably. I don't really know how the original batch of ORCIDs was generated, whether it was based on a file of names that included fictitious characters or some other way.

Adam


GordonDunsire commented 6 days ago

@CECSpecialistI: ORCID has three entries for 'Sherlock Holmes' and two for 'Donald Duck'. None of them have affiliations and all are set to 'private'. I think it's safe to assume that this situation indicates practical deprecation; that is, if no affiliation and no de-referenced information, then it is not safe to use the IRI in the transform because the referent entity is not an Agent/Person.

I think this should be the case for all 'private' ORCID IRIs because the transform is into linked open data. I guess doing this in-transform involves a look-up and test which will cause unacceptable processing delays, so it might be done as a post-processing operation.
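A post-processing pass of this kind might look like the Python sketch below. It assumes triples are plain tuples, that the caller supplies a `lookup` function returning whatever public data de-referencing yields (empty or None for private records), and a `mint` function for replacement IRIs. All names are hypothetical, and no network call is made here.

```python
def replace_private_orcids(triples, lookup, mint):
    """Swap ORCID subjects that return no public data for minted IRIs.

    Each distinct ORCID IRI is looked up once and the decision is cached.
    """
    decisions = {}
    result = []
    for subject, predicate, obj in triples:
        if subject.startswith("https://orcid.org/"):
            if subject not in decisions:
                decisions[subject] = subject if lookup(subject) else mint(subject)
            subject = decisions[subject]
        result.append((subject, predicate, obj))
    return result


# Stand-in data for illustration; a real lookup would query ORCID.
public_records = {"https://orcid.org/0000-0002-8405-8386": {"name": "Jose Riera"}}
triples = [
    ("https://orcid.org/0000-0002-8405-8386", "rdaad:P50411", "Riera, Jose Joaquin, 1969-"),
    ("https://orcid.org/0000-0003-2255-2095", "rdaad:P50411", "Mouse, Mickey"),
]
cleaned = replace_private_orcids(
    triples,
    lookup=public_records.get,
    mint=lambda iri: "http://example.org/transform/" + iri.rsplit("/", 1)[-1],
)
for triple in cleaned:
    print(triple)
```

Running the check after the transform keeps the look-up cost out of the main conversion, which is the concern raised above.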

@AdamSchiff: I vaguely remember looking at ORCID shortly after it became available, and carried out the 'Mickey Mouse' test; yup, there was an entry. The first iterations of ORCID were generated from batch data loads, I believe, supplied by national and university libraries, and I think this is how the mouse got in. I guess the ORCID library cat has been mouse-hunting ever since, by setting these early entries to private.

ORCID entries for pseudonyms just add to the mountain of 'name authorities' and presumably will be resolved in due course.

Of course, cataloguers should not be using ORCID ids that have no information but a name (i.e. are 'private'), unless the associated institution is 'trusted'. In more practical terms, which of three Mickey Mouse ids do you select, if there is no further information? ORCID is not an 'authority' file in the normal sense; there is little intermediation in the entries.

CECSpecialistI commented 5 days ago

Since ORCiDs can be created by anyone, I assume anyone can create one for any fictitious character as well. I can create one for Mickey Mouse right now and not set it as private, and assert that he graduated with a PhD in electrical engineering from the University of Washington.

CECSpecialistI commented 5 days ago

I just made this one for Donald Duck, the astronaut at NASA who got a PhD from UW in Astrobiology: https://orcid.org/0009-0008-0777-1939

GordonDunsire commented 4 days ago

Presumably the astronaut explores the Donald Duck universe ;-)

CECSpecialistI commented 4 days ago

Ha! In all seriousness though, I don't know if these types of ORCiDs will make it into MARC records, but it's possible for pseudonyms or fictitious characters I suppose (even though they are extreme minorities in the case of ORCiDs and are not a preferred source of these types of entities for any cataloging community I know of). I did learn something through this process, though. What I've done with Donald Duck and what others have done with Sherlock Holmes/Mickey Mouse is really not supposed to be done--ORCiD is supposed to be used by researchers to represent themselves as human beings/creators of written materials. I had to click through a menu verifying that I'm a person. So, is extremely easy potential abuse of a registry grounds for excluding it? In that case, Wikidata may fall into the same category?

CECSpecialistI commented 4 days ago

(I'm going to go in and delete this Donald Duck example after this week)

GordonDunsire commented 2 days ago

@CECSpecialistI: This is a feature (not a bug) of linked open data; it is the AAA principle that 'anyone can say anything about any thing'. That's why data provenance is so important in lod applications. Applications that use metadata from an open source should 'know' or check the agent that created it, and under what circumstances (content standard, etc.) it was created. This, of course, is a significant consideration in all implementation scenarios, as demonstrated by the 'crap' records that we have recently uncovered. WD seems to have appropriate provenance built in, but applications will have to figure out how to identify and use it.

Some factors that applications should consider when 'pulling in' an open data triple (which is the whole point) are trustworthiness of the creator agency, currency or compatibility of content standard used, compatibility of property/class used, and language/script of the object value string.

A goal of the project should be to maximise the trustworthiness of the creator agency to encourage widespread and persistent use of the data triples produced by the transform. Note that minor errors in string values have far less impact on application trustworthiness than these other factors.