PerseusDL / canonical-greekLit

XML Canonical resources for Greek Literature
https://scaife.perseus.org
Creative Commons Attribution Share Alike 4.0 International
(tlg0061) error(s) in cts metadata and file headers #1501

Open lcerrato opened 12 months ago

lcerrato commented 12 months ago

There are several errors in these files, which were created via https://github.com/PerseusDL/canonical-greekLit/pull/1496

<ti:description xml:lang="eng">Selections from Lucian. Smith, Emily James, translators. New York; Harper Brothers, 1892.</ti:description>

The set needs review. This should read: <ti:description xml:lang="eng">Lucian. Selections from Lucian. Smith, Emily James, translator. New York: Harper Brothers, 1892.</ti:description>

lcerrato commented 12 months ago

Further, the files themselves are missing crucial header data.

The title is unnecessarily tagged as "eng" twice: <title xml:lang="eng">

Header credits should include all Perseus staff, as we all review and update these new files:

<principal>Gregory Crane</principal>
<respStmt>
    <resp>Prepared under the supervision of</resp>
    <name>Alison Babeu</name>
    <name>Lisa Cerrato</name>
</respStmt>

"The Hathi Trust" is incorrect. It is "HathiTrust": no space, no "the".

The body of the work appears to use straight quotes rather than curly ones, and there are no <q> tags.
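The checks listed above could be automated before review. The following is only a minimal sketch, assuming the files are TEI P5 documents in the standard TEI namespace; the function name `audit_tei` and the wording of the messages are hypothetical and not part of any Perseus tooling.

```python
import re
import xml.etree.ElementTree as ET

# Assumption: the files use the standard TEI P5 namespace.
TEI_NS = "http://www.tei-c.org/ns/1.0"


def audit_tei(xml_text: str) -> list[str]:
    """Return human-readable problems found in one TEI document (hypothetical helper)."""
    problems = []
    root = ET.fromstring(xml_text)

    # "HathiTrust" is one word; any "Hathi Trust" with whitespace is flagged.
    if re.search(r"Hathi\s+Trust", xml_text):
        problems.append('"Hathi Trust" found; should be "HathiTrust"')

    body = root.find(f".//{{{TEI_NS}}}body")
    if body is not None:
        # itertext() yields character data only, so attribute quotes are ignored.
        body_text = "".join(body.itertext())
        if '"' in body_text:
            problems.append("straight double quotes in body text")
        if body.find(f".//{{{TEI_NS}}}q") is None:
            problems.append("no <q> elements in body")

    return problems
```

Running `audit_tei` over each new file and printing the returned list would give a quick per-file checklist; it does not attempt to repair anything.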

lcerrato commented 10 months ago

The translation also has labels instead of <sp>/<speaker> tags, which I believe are what the Greek uses.
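One possible way to convert labels into speech markup is sketched below. It is only an illustration under a strong assumption: that each `<label>` is a direct child of a `<div>` and is followed by the sibling elements that make up that speaker's lines. The actual files may be structured differently, and the function name `labels_to_speeches` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumption: standard TEI P5 namespace.
TEI_NS = "http://www.tei-c.org/ns/1.0"
ET.register_namespace("", TEI_NS)


def tag(name: str) -> str:
    """Build a namespace-qualified tag name for ElementTree."""
    return f"{{{TEI_NS}}}{name}"


def labels_to_speeches(div: ET.Element) -> None:
    """Wrap each <label> and the siblings after it in <sp><speaker>...</speaker>...</sp>.

    Elements that precede the first <label> (for example <head>) stay where they are.
    Whitespace after a <label> is dropped.
    """
    new_children = []
    current_sp = None
    for child in list(div):
        if child.tag == tag("label"):
            # Start a new speech; the label text becomes the speaker name.
            current_sp = ET.Element(tag("sp"))
            speaker = ET.SubElement(current_sp, tag("speaker"))
            speaker.text = "".join(child.itertext())
            new_children.append(current_sp)
        elif current_sp is not None:
            current_sp.append(child)
        else:
            new_children.append(child)

    # Replace the old children with the regrouped ones.
    for child in list(div):
        div.remove(child)
    div.extend(new_children)
```

The function edits the `<div>` in place, so the result should be compared against the Greek edition's `<sp>` structure before anything is committed.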

AlisonBabeu commented 10 months ago

I'm not entirely sure how you fix that last part, and I'm not on the Pseudo-Lucian yet, but I think I will merge the tlg0062 pull request and then start a separate one to fix all of the issues with Smith in terms of CTS metadata and headers.

lcerrato commented 1 month ago

Note that there are stray characters and bad punctuation, as well as missing curly quotes, <q> tags, etc.