Pittsburgh-NEH-Institute / pr-app

eXist-db app development

[IMPEDED] Publisher markup is incorrect and inconsistent #38

Closed djbpitt closed 2 years ago

djbpitt commented 2 years ago

Three articles list more than one publisher:

<publisher-info xmlns="http://www.tei-c.org/ns/1.0">
    <article>
        <title>Thoughts On Seeing Ghosts</title>
        <xml:id>GH-GNCCO-18410227</xml:id>
        <publicationStmt>
            <publisher>The Odd Fellow</publisher>
            <publisher>Yankee Notions</publisher>
            <date when="1841-02-27"/>
            <idno>GH-GNCCO-18410227</idno>
        </publicationStmt>
    </article>
    <article>
        <title>The New Hammersmith Ghost</title>
        <xml:id>GH-GNCCO-18470904</xml:id>
        <publicationStmt>
            <publisher rendition="original">Sunderland Herald</publisher>
            <publisher rendition="reprint">Douglas Jerrold's Weekly Newspaper</publisher>
            <date when="1847-09-04"/>
            <idno>GH-GNCCO-18470904</idno>
        </publicationStmt>
    </article>
    <article>
        <title>A Substantial Ghost Story</title>
        <xml:id>GH-GNCCO-18750703</xml:id>
        <publicationStmt>
            <publisher>The People’s Advocate</publisher>
            <publisher>Altrincham Guardian</publisher>
            <date when="1875-07-03"/>
            <idno>GH-GNCCO-18750703</idno>
        </publicationStmt>
    </article>
</publisher-info>

I think I understand the one in the middle, which distinguishes the original publication venue from the reprint, although I don't know whether the date refers to the original or the reprint. But the other two have multiple <publisher> elements with no distinguishing attribute values.

Because we construct facets on <publisher> values, articles with two <publisher> elements are accessible when we search for either of the values. This is probably as it should be. But when it comes time to list the articles returned as search results, how should they be formatted? First publisher? Both? Labeled as "original" and "reprint" or without comment?

djbpitt commented 2 years ago

The facet should show only the reprint publisher, and so should the rendered search results.

djbpitt commented 2 years ago

Per the conversation on TEI-L (2022-06-03 and following), publication information for the source document belongs in <sourceDesc>, not in <publicationStmt>:

<publicationStmt> might be one of the most frequently misused teiHeader items, because (unless I am mistaken), it is supposed to contain the publication data of the digital edition, not of the source it transcribes - the data about the source (in this case the newspaper article) should be in the <sourceDesc> where you should in this case list both the newspaper article in its reprint as separate <bib> items with all their possible metadata including date, publisher, language, etc., And you can then add the additional specification that the first is a reprint of the second…. (Maarten Janssen, TEI-L, 2022-06-03)
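
A rough sketch of what that advice could look like for the middle article above, assuming the project keeps using <publisher> for the newspaper name. The xml:id values, the type values, and the <relatedItem> pointer are hypothetical choices for illustration, not the encoding that was actually committed:

```xml
<!-- Sketch only: xml:id values, type values, and the use of <relatedItem>
     are assumptions, not the project's actual encoding. -->
<sourceDesc xmlns="http://www.tei-c.org/ns/1.0">
    <!-- The reprint comes first, since it is the version the facet and rendering should show. -->
    <bibl xml:id="GH-GNCCO-18470904-reprint" type="reprint">
        <title>The New Hammersmith Ghost</title>
        <publisher>Douglas Jerrold's Weekly Newspaper</publisher>
        <!-- Hypothetical pointer saying that this item reprints the one below. -->
        <relatedItem type="reprintOf" target="#GH-GNCCO-18470904-original"/>
    </bibl>
    <bibl xml:id="GH-GNCCO-18470904-original" type="original">
        <title>The New Hammersmith Ghost</title>
        <publisher>Sunderland Herald</publisher>
    </bibl>
    <!-- The date 1847-09-04 is left out here because the thread does not settle
         whether it belongs to the original or to the reprint. -->
</sourceDesc>
```

With something along these lines, the facet and the search-result rendering could both read the publisher from the <bibl> marked as the reprint, and the two articles with unlabeled <publisher> elements would need the same distinction added by hand.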

djbpitt commented 2 years ago

Fixed