SunoikisisDC / SunoikisisDC-2020-2021

Sunoikisis Digital Classics 2020–2021 syllabuses

Discussion of Dell’Oro 2020 & Vierros 2018 #13

Closed gabrielbodard closed 3 years ago

gabrielbodard commented 3 years ago

With both of these, please think about the aims and research questions behind the tools and methods discussed, rather than the technology and implementation.

chiaradimaio commented 3 years ago

The article by Francesca Dell'Oro describes the project called A World of Possibilities [WoPoss], aimed at tracking the evolution of modal meanings in the Latin language.

The author explains how this analysis is carried out in three steps:

1) Modal markers expressing 'necessity', 'possibility' and 'volition' in Latin are first collected from a diachronic corpus that ranges from the 3rd century BCE to the 7th century CE and includes literary and documentary texts from different Latin-speaking regions of the ancient world. The selected texts are checked and confirmed to be philologically reliable and reusable under a Creative Commons licence. Then all text files are converted to plain text, but important structural information is kept by means of a so-called pseudo-markup.

2) The tool INCEpTION (a multi-modular annotation platform) is customized and adapted to the needs of this project: recording the modal marker, its scope and the relation between them. The WoPoss team then proceeds with manual annotation, which is particularly useful in cases of ambiguity, since the description of ambiguous passages could allow future users to notice semantic shifts.

3) The annotated files are exported in XMI and transformed to conform to the TEI standard. This stage adds several layers of information: the most ancient meaning of each modal marker; the transformation of the pseudo-markup into the corresponding TEI elements; and metadata for each text concerning chronology, genre, transmission and authorship.
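To make the pseudo-markup step a little more concrete, here is a minimal sketch in Python of how structural markers in a plain-text file might be turned into TEI elements. The article's actual pseudo-markup syntax is not reproduced in this thread, so the `#div TYPE N` convention, the function name `pseudo_to_tei` and the sample text are all assumptions made for illustration only; this is not the WoPoss pipeline itself.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"  # the standard TEI namespace


def pseudo_to_tei(text: str) -> ET.Element:
    """Convert hypothetical '#div TYPE N' marker lines into TEI <div> elements.

    Every other non-empty line becomes a <p> inside the most recent <div>
    (or directly inside <body> if no marker has been seen yet).
    """
    body = ET.Element(f"{{{TEI_NS}}}body")
    current = body
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith("#div "):
            # Assumed marker shape: '#div book 1' -> <div type="book" n="1">
            _, div_type, n = line.split(maxsplit=2)
            current = ET.SubElement(
                body, f"{{{TEI_NS}}}div", {"type": div_type, "n": n}
            )
        else:
            p = ET.SubElement(current, f"{{{TEI_NS}}}p")
            p.text = line
    return body


# Invented sample input using the assumed marker convention.
sample = """#div book 1
Gallia est omnis divisa in partes tres
#div book 2
Cum esset Caesar in citeriore Gallia
"""

body = pseudo_to_tei(sample)
ET.register_namespace("", TEI_NS)  # serialize without an 'ns0:' prefix
print(ET.tostring(body, encoding="unicode"))
```

The point of the sketch is only that structure kept as lightweight markers in plain text can be recovered mechanically later, which is why the plain-text conversion in step 1 does not lose the information needed for the TEI output.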

The resulting TEI dataset will be freely accessible through a user-friendly interface. The whole WoPoss project is an open science product, stored in an open GitHub repository.

HLBallard44 commented 3 years ago

Vierros, M. “Linguistic Annotation of the Digital Papyrological Corpus: Sematia.” This article focuses on Sematia, a digital papyrological corpus under development, and on the approach chosen for it.

Corpus Design

How to Annotate Papyri

Metadata and Its Purpose

The goals for Sematia are to make the whole papyrological corpus available, to support phonological searches, and to provide an automatic morphological parser for Greek.

nicolealexandra33 commented 3 years ago

It would be interesting to see how Sematia does with the PGM, especially as multiple languages are used (although, as I understand it, Latin and Greek are currently the only ones that the software can process). It could perhaps help determine which spells were translated or taken from another linguistic tradition, even though the entry itself is in a different language.

chiaradimaio commented 3 years ago

With regard to the background of the authors, it is worth saying that Francesca Dell'Oro is currently teaching linguistics but has a background as a classical philologist, while Marja Vierros is a classical philologist and a papyrologist. Both of them are mainly interested in the historical and developmental aspects of ancient languages. Their tools are highly customized and specialised, but they have the same goal: offering specialists reliable corpora that make it possible to visualize information about the diachrony of certain linguistic phenomena.