NYPL / catalog_of_copyright_entries_project

NYPL Project to transcribe and parse pages from the US Catalog of Copyright Entries
Creative Commons Zero v1.0 Universal

Do we need to deal with Author headings #16

Closed seanredmond closed 6 years ago

seanredmond commented 6 years ago

Sometimes the names under which the entries appear carry some extra information. For instance (example from DCL):

Cuthbert, Margaret,* New York.

Adventure in radio,
edited by M. Cuthbert, with radio scripts by Edna St. Vincent Millay, Arch Obeler, Archibald MacLeish [and others]

© Sept. 17, 1945; A 189950.

The asterisk means that Margaret Cuthbert is the claimant. There is also the information that she is from New York. Is this used for disambiguation?

Another example:

Curtis, Charles P., jr.,* Ipswich, Mass. & Greenslet, Ferris,* Boston.

The practical cogitator, selected and edited by C. P. Curtis, jr. and Ferris Greenslet.

© Oct. 9, 1945; A 190420.
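To make the shape of these headings concrete, here is a rough Python sketch that splits a heading into names, claimant flags, and places. The function name `parse_heading` and the dictionary keys are hypothetical, and the rules (authors joined by " & ", claimant marked by ",*") are assumptions drawn only from the two examples above, not from the full catalog.

```python
def parse_heading(heading):
    """Split an author heading into name / claimant / place parts.

    Assumes (from the two sample headings only) that authors are joined
    by " & " and that ",*" separates a claimant's name from the place.
    """
    heading = heading.strip()
    # Drop the single period that terminates the whole heading. If the last
    # place is itself an abbreviation ("Mass."), its period is lost too.
    if heading.endswith("."):
        heading = heading[:-1]

    authors = []
    for part in heading.split(" & "):
        # partition() returns (part, "", "") when no asterisk marker is present
        name, marker, place = part.strip().partition(",*")
        authors.append({
            "name": name.strip(),
            "claimant": marker == ",*",
            "place": place.strip() or None,
        })
    return authors


for author in parse_heading(
    "Curtis, Charles P., jr.,* Ipswich, Mass. & Greenslet, Ferris,* Boston."
):
    print(author)
```

A heading without an asterisk comes back as a single name with no place, because nothing in these examples shows how a place would be set off from a non-claimant name.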

If we do not include the author headings in the XML, then we still need to mark the authors as claimants in the entry, i.e.

<author claimant="true"><role>edited by</role> <authorName>M. Cuthbert</authorName></author>

However, the principles stated in #9 argue for recording the heading, and we would then need to group the entries under the heading. Something like:

<entryGroup>
    <heading>
        <author><authorName claimant="true">Cuthbert, Margaret</authorName>,* <authorPlace>New York</authorPlace>.</author>
    </heading>

    <copyrightEntry>...</copyrightEntry>
    <copyrightEntry>...</copyrightEntry>
</entryGroup>
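As a sketch of how that grouped structure might be generated, the following uses Python's standard `xml.etree.ElementTree`. The element and attribute names are the ones proposed above; the function `build_entry_group` and its parameters are hypothetical, and the punctuation between elements is kept as tail text so the printed heading can be reconstructed from the XML.

```python
import xml.etree.ElementTree as ET


def build_entry_group(name, place, claimant, entries):
    """Build an <entryGroup> with one author heading and its entries."""
    group = ET.Element("entryGroup")
    heading = ET.SubElement(group, "heading")
    author = ET.SubElement(heading, "author")

    author_name = ET.SubElement(author, "authorName")
    if claimant:
        author_name.set("claimant", "true")
    author_name.text = name
    # Keep the printed punctuation (",* ") as text following the element
    author_name.tail = ",* " if claimant else " "

    author_place = ET.SubElement(author, "authorPlace")
    author_place.text = place
    author_place.tail = "."

    # Entry bodies are placeholders here; real entries would have child elements
    for entry_text in entries:
        ET.SubElement(group, "copyrightEntry").text = entry_text
    return group


group = build_entry_group("Cuthbert, Margaret", "New York", True, ["...", "..."])
print(ET.tostring(group, encoding="unicode"))
```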

If we do this, I think it's still a good idea to mark the claimants within the <copyrightEntry> elements themselves. Since the forms of the names don't match (at least in these examples), I don't know how hard that will be.
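One possible way to gauge the difficulty is to reduce both name forms to a surname plus initials and compare those. The sketch below is only a heuristic based on the three names in the examples above. The helper names (`heading_key`, `entry_key`, `same_person`) are hypothetical, and it assumes headings are inverted ("Surname, Given") while names inside entries are in direct order with the surname last.

```python
def _initials(tokens):
    """First letter of each given-name token, lowercased."""
    return "".join(t[0].lower() for t in tokens if t[0].isalpha())


def heading_key(name):
    """'Curtis, Charles P., jr.' -> ('curtis', 'cp'); suffixes are ignored."""
    parts = [p.strip() for p in name.split(",")]
    given = parts[1] if len(parts) > 1 else ""
    return parts[0].lower(), _initials(given.split())


def entry_key(name):
    """'C. P. Curtis, jr.' -> ('curtis', 'cp'); assumes surname is last."""
    tokens = name.split(",")[0].split()
    if not tokens:
        return "", ""
    return tokens[-1].lower(), _initials(tokens[:-1])


def same_person(heading_name, entry_name):
    """Heuristic match: same surname, and one set of initials prefixes the other."""
    h_surname, h_init = heading_key(heading_name)
    e_surname, e_init = entry_key(entry_name)
    return h_surname == e_surname and (
        h_init.startswith(e_init) or e_init.startswith(h_init)
    )


print(same_person("Cuthbert, Margaret", "M. Cuthbert"))
print(same_person("Curtis, Charles P., jr.", "C. P. Curtis, jr."))
print(same_person("Greenslet, Ferris", "M. Cuthbert"))
```

This would wrongly merge different people who share a surname and initials, and it mishandles multi-word surnames written in direct order, so any match would probably need human review.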