anacastrosalgado / TEI

Morais TEI header (1st edition) #4

Open anacastrosalgado opened 2 years ago

anacastrosalgado commented 2 years ago

The MORAIS dictionary encoding starts with the <teiHeader> element, which stores the metadata of the encoded text in structured form: detailed bibliographic data for both the printed source(s) and the electronic file, described so that the resource is easier to find through search engines. Following the TEI Lex-0 specification (version 0.9.1), the MORAIS TEI header consists of 1) a file description (<fileDesc>), which is mandatory and gives the complete bibliographic description of the machine-readable resource and of the analogue original source; 2) an encoding description (<encodingDesc>), where we record the main principles and decisions taken during the encoding and describe the taxonomy of domain labels; 3) a profile description (<profileDesc>), where we specify the object and working languages. It is important to note that the first edition of MORAIS has two volumes (A – K and L – Z); each volume therefore has its own <biblStruct> element in the source description (<sourceDesc>) of the file description.

<?xml version="1.0" encoding="UTF-8"?> 
<?xml-model href="https://raw.githubusercontent.com/DARIAH-ERIC/lexicalresources/master/Schemas/TEILex0/out/TEILex0.rng" 
    type="application/xml" schematypens="http://relaxng.org/ns/structure/1.0"?>
<?xml-model href="http://www.tei-c.org/release/xml/tei/custom/schema/relaxng/tei_all.rng" 
    type="application/xml" schematypens="http://purl.oclc.org/dsdl/schematron"?>
<TEI xmlns="http://www.tei-c.org/ns/1.0">
   <teiHeader>
      <fileDesc>
         <titleStmt>
            <title xml:lang="en" type="full">Morais Dictionary (1st ed., 1789): XML encoding</title>
         </titleStmt>
         <editionStmt>  
            <edition>MorDigital Project (PTDC/LLT-LIN/6841/2020)</edition>
            <respStmt>
               <resp>Principal researcher</resp>
               <persName>Rute Costa</persName>
               <orgName>NOVA CLUNL, Centro de Linguística da Universidade NOVA de Lisboa</orgName>
            </respStmt>
            <respStmt>
               <resp>Team</resp>
               <persName>Rute Costa</persName>
               <orgName>NOVA CLUNL, Centro de Linguística da Universidade NOVA de Lisboa</orgName>
               <persName>Sara Carvalho</persName>
               <orgName n="1">NOVA CLUNL, Centro de Linguística da Universidade NOVA de Lisboa</orgName>
               <orgName n="2">CLLC, Centro de Línguas, Literaturas e Culturas da Universidade de Aveiro</orgName>
               <persName>Ana Salgado</persName>
               <orgName n="1">NOVA CLUNL, Centro de Linguística da Universidade NOVA de Lisboa</orgName>
               <orgName n="2">Academia das Ciências de Lisboa, Instituto de Lexicologia e Lexicografia da Língua Portuguesa</orgName>
               <persName>Bruno Almeida</persName>
               <orgName>NOVA CLUNL, Centro de Linguística da Universidade NOVA de Lisboa</orgName>
               <persName>Margarida Ramos</persName>
               <orgName>NOVA CLUNL, Centro de Linguística da Universidade NOVA de Lisboa</orgName>
               <persName>Raquel Silva</persName>
               <orgName n="1">NOVA CLUNL, Centro de Linguística da Universidade NOVA de Lisboa</orgName>
               <orgName n="2">VOH.CoLAB, Value for Health CoLAB</orgName>
               <persName>Alexandre Carreira</persName>
               <orgName>NOVA CLUNL, Centro de Linguística da Universidade NOVA de Lisboa</orgName>
               <persName>Joana Oliveira</persName>
               <orgName>NOVA CLUNL, Centro de Linguística da Universidade NOVA de Lisboa</orgName>
               <persName>Fahad Khan</persName>
               <orgName>Istituto Di Linguistica Computazionale ‘A. Zampolli’</orgName>
               <persName>Laurent Romary</persName>
               <orgName>Inria-ALMAnaCH Lab</orgName>
               <persName>Mohamed Khemakhem</persName>
               <orgName>Inria-ALMAnaCH Lab + Université Grenoble Alpes</orgName>
               <persName>Toma Tasovac</persName>
               <orgName n="1">DARIAH-EU, Digital Research Infrastructure for the Arts and Humanities</orgName>
               <orgName n="2">BCDH, Belgrade Center for Digital Humanities</orgName>
            </respStmt>
            <respStmt>
               <resp>Consultants</resp>
               <persName>Maria Filomena Gonçalves</persName>
               <persName>Jorge Gracia</persName>
            </respStmt>
            <respStmt>
               <resp>OCR tasks done by</resp>
               <persName>Alexandre Carreira</persName>
               <persName>Margarida Ramos</persName>
               <persName>Joana Oliveira</persName>
            </respStmt>
            <respStmt>
               <resp>XML encoding by</resp>
               <persName>Ana Salgado</persName>
               <persName>Bruno Almeida</persName>
               <persName>Toma Tasovac</persName>
            </respStmt>
         </editionStmt>
         <publicationStmt>
            <authority role="sponsor">FCT – Fundação para a Ciência e Tecnologia</authority>
            <pubPlace>Lisboa</pubPlace>
            <date>2021-2023</date>
            <availability>
               <licence target="https://creativecommons.org/licenses/by/4.0/">
                  <p>Creative Commons Attribution 4.0 International (CC BY 4.0)</p>
               </licence>
            </availability>
         </publicationStmt>
         <!-- [...] -->
         <sourceDesc>
            <biblStruct>
               <monogr>
                  <title level="m" type="main">Diccionario da lingua portugueza composto pelo padre
                     D. Rafael Bluteau, reformado, e accrescentado por Antonio de Moraes Silva,
                     natural do Rio de Janeiro</title>
                  <title level="m" type="sub">A – K</title>
                  <author>
                     <persName>
                        <forename>António de</forename>
                        <surname>Morais Silva</surname>
                     </persName>
                  </author>
                  <imprint>
                     <publisher>Officina de Simão Thaddeo Ferreira</publisher>
                     <pubPlace>Lisboa</pubPlace>
                     <date>1789</date>
                     <note>Com Licença da Real Meza da Comissão Geral, sobre o Exame, e Censura dos
                        Livros.</note>
                     <!-- Please confirm distributor: describes the store where the dictionary was for sale. -->
                     <distributor>Vende-ſe na loja de Borel Borel, e Companhia, quaſi defronte da
                        Igreja nova de Noſſa Senhora dos Martyres, na eſquina.</distributor>
                  </imprint>
                  <extent>Tomo primeiro</extent>
                  <extent>752 pp.</extent>
               </monogr>
            </biblStruct>
            <biblStruct>
               <monogr>
                  <title level="m" type="main">Diccionario da lingua portugueza composto pelo padre
                     D. Rafael Bluteau, reformado, e accrescentado por Antonio de Moraes Silva,
                     natural do Rio de Janeiro</title>
                  <title level="m" type="sub">L – Z</title>
                  <author>
                     <persName>
                        <forename>António de</forename>
                        <surname>Morais Silva</surname>
                     </persName>
                  </author>
                  <imprint>
                     <publisher>Officina de Simão Thaddeo Ferreira</publisher>
                     <pubPlace>Lisboa</pubPlace>
                     <date>1789</date>
                  </imprint>
                  <extent>Tomo segundo</extent>
                  <extent>541 pp.</extent>
               </monogr>
            </biblStruct>
         </sourceDesc>
      </fileDesc>
      <encodingDesc>
         <projectDesc>
            <!-- In progress. -->
            <p>We already had access to OCR’ed versions of the dictionary editions at the beginning
               of the project. These files needed to be post-corrected. For this, we decided to use
               ABBYY FineReader.</p>
         </projectDesc>
         <editorialDecl>
            <!-- In progress. -->
            <p>Original spelling and typography are retained.</p>
            <p>All errors found in the original OCR output were checked.</p>
         </editorialDecl>
         <!-- Hierarchical usage labels: includes only Medicine domain label -->
         <classDecl>
            <taxonomy xml:id="domain">
               <category xml:id="domain.medical_and_health_sciences">
                  <catDesc xml:lang="en">Medical and Health Sciences</catDesc>
                  <catDesc xml:lang="pt">Ciências Médicas e da Saúde</catDesc>
                  <category xml:id="domain.medical_and_health_sciences.medicine">
                     <!-- In MORAIS, Med./(t.)Medico -->
                     <catDesc xml:lang="en">Medicine</catDesc>
                     <catDesc xml:lang="pt">Medicina</catDesc>
                  </category>
               </category>
            </taxonomy>
         </classDecl>
      </encodingDesc>
      <profileDesc>
         <langUsage>
            <language role="objectLanguage" ident="pt">Portuguese</language>
            <language role="workingLanguage" ident="en">English</language>
         </langUsage>
      </profileDesc>
   </teiHeader>
   <text>
      <body>
         <!-- Different sections start here -->
         <div type="section" n="1">
            <p>Foi taxado eſte Livro em papel a dous mil reis. Meza 8 de Junho de 1789.</p>
            <p><hi rend="italic">Com tres rubricas.</hi></p>
         </div>
         <!-- [...] -->
      </body>
   </text>
</TEI>

laurentromary commented 2 years ago

"XML encoding" is a little bit down to earth => "digital edition"?

anacastrosalgado commented 2 years ago

Dear @laurentromary, if I understand correctly, your suggestion is to change the title line to:

Morais Dictionary (1st ed., 1789): digital edition

"XML encoding" is indeed very reductive.
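
A minimal sketch of how the revised <titleStmt> might look, assuming that only the wording after the colon changes and that the `xml:lang` and `type` attributes stay exactly as in the header above:

```xml
<!-- Sketch only: same element and attributes as the posted header,
     with "XML encoding" replaced by the suggested "digital edition". -->
<titleStmt>
   <title xml:lang="en" type="full">Morais Dictionary (1st ed., 1789): digital edition</title>
</titleStmt>
```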