Discourse parsing with RDF/LOD technology
Discourse parsing is concerned with understanding the hierarchical and relational structure of utterances in a text. Discourse semantics complements semantic parsing by capturing context dependencies and making it possible to aggregate information across multiple sentences.
This involves several dimensions:
- discourse structure (hierarchical organization of the discourse)
- discourse relations (coherence relations between multiple discourse segments)
- discourse markers (explicit cues for discourse relations)
- frame semantics (implicit semantic roles)
- anaphora (coreference, bridging, event anaphora)
- etc.
So far, we have focused on:
- interoperability: using RDF technology and ontologies to facilitate the interoperability between annotation schemas for discourse, coreference and information structure (OLiA Discourse Extensions, in the OLiA repository) (Chiarcos 2014)
- implicit discourse relations: discourse parsing of implicit relations (Rönnqvist, Schenk and Chiarcos 2017, in separate repository)
- interlinked discourse marker inventories: using RDF technology and the OLiA Discourse Extensions for creating discourse marker inventories that are linked across languages and theoretical frameworks (discourse marker inventories in OntoLex-Lemon) (Chiarcos and Ionov 2021)
- cross-lingual induction of discourse marker inventories over lexical resources, see lexical-induction/ (Chiarcos 2022)
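To illustrate what "linked across languages and theoretical frameworks" can mean in practice, here is a minimal sketch in plain Python. It assumes that each discourse marker entry points to a shared discourse relation concept, and that markers from different languages become comparable because they point to the same concept. The entry identifiers (`en:because`, `de:weil`), the concept names (`rel:Cause`, `rel:Contrast`) and the flattened triple layout are hypothetical and are not taken from the actual inventories, which model this with OntoLex-Lemon and OLiA URIs.

```python
from collections import defaultdict

# Predicate names used in this toy graph. They are plain strings here;
# the real data would use full URIs from OntoLex-Lemon and RDFS.
DENOTES = "ontolex:denotes"
LABEL = "rdfs:label"

# Toy triples (subject, predicate, object). All identifiers and concept
# names below are invented for illustration only.
TRIPLES = [
    ("en:because", LABEL, "because"),
    ("en:because", DENOTES, "rel:Cause"),
    ("de:weil", LABEL, "weil"),
    ("de:weil", DENOTES, "rel:Cause"),
    ("en:but", LABEL, "but"),
    ("en:but", DENOTES, "rel:Contrast"),
    ("de:aber", LABEL, "aber"),
    ("de:aber", DENOTES, "rel:Contrast"),
]


def markers_by_relation(triples):
    """Group (language, marker) pairs by the relation concept they denote."""
    labels = {s: o for s, p, o in triples if p == LABEL}
    grouped = defaultdict(set)
    for s, p, o in triples:
        if p == DENOTES:
            # The language code is the prefix of the toy identifier.
            lang = s.split(":", 1)[0]
            grouped[o].add((lang, labels.get(s, s)))
    return {rel: sorted(markers) for rel, markers in grouped.items()}


if __name__ == "__main__":
    for relation, markers in sorted(markers_by_relation(TRIPLES).items()):
        print(relation, markers)
```

Grouping by the shared concept rather than by direct marker-to-marker links keeps the inventories independent of each other: adding a new language only requires linking its entries to the common concepts.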
Research on discourse semantics that does not directly depend on either RDF or LOD technology is maintained in separate ACoLi repositories.
Contents
- discourse-markers/: OntoLex-Lemon edition of multilingual discourse marker inventories in RDF and as Linked Data
- lexical-induction/: Experiments on inducing multilingual discourse marker inventory stubs from discourse-markers/ and lexical knowledge graphs
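As a rough intuition for what an induced inventory "stub" could look like, the following sketch projects discourse relations from a source-language inventory onto target-language words through translation links. This is a naive projection written for illustration, not the method used in lexical-induction/ or in Chiarcos (2022). The function name `induce_stub`, the sample inventory and the translation pairs are all hypothetical.

```python
def induce_stub(source_inventory, translation_edges):
    """Project relations onto target words along translation links.

    source_inventory: dict mapping a source marker to a set of relations.
    translation_edges: iterable of (source_word, target_word) pairs.
    Returns a dict mapping each reached target word to a set of relations.
    """
    stub = {}
    for source_word, target_word in translation_edges:
        # Words that are not known discourse markers contribute nothing.
        for relation in source_inventory.get(source_word, ()):
            stub.setdefault(target_word, set()).add(relation)
    return stub


# Hypothetical English inventory and English-German translation links.
SOURCE_INVENTORY = {
    "because": {"Cause"},
    "since": {"Cause", "Temporal"},
    "but": {"Contrast"},
}
TRANSLATION_EDGES = [
    ("because", "weil"),
    ("since", "seit"),
    ("since", "weil"),
    ("but", "aber"),
    ("house", "Haus"),  # not a discourse marker, so it is ignored
]

if __name__ == "__main__":
    for word, relations in sorted(induce_stub(SOURCE_INVENTORY, TRANSLATION_EDGES).items()):
        print(word, sorted(relations))
```

In this toy data the ambiguous source marker "since" passes both of its relations to every translation, so the projection over-generates. That is one reason to treat such output as a stub that needs filtering or manual review rather than as a finished inventory.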
In preparation:
- Experiments on inducing multilingual discourse marker inventory stubs from discourse-markers/ and parallel corpora
- Experiments on inducing multilingual discourse marker annotations from discourse-markers/ and parallel corpora
References
- Christian Chiarcos (2014), Towards interoperable discourse annotation. Discourse features in the Ontologies of Linguistic Annotation. In: Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14), May 2014, Reykjavik, Iceland, European Language Resources Association (ELRA)
- Samuel Rönnqvist, Niko Schenk, and Christian Chiarcos (2017), A Recurrent Neural Model with Attention for the Recognition of Chinese Implicit Discourse Relations. In: Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (ACL-2017), July 2017, Vancouver, Canada, Association for Computational Linguistics
- Christian Chiarcos and Maxim Ionov (2021), Linking Discourse Marker Inventories. In: Proceedings of the Third Conference on Language, Data and Knowledge (LDK 2021), September 2021, Zaragoza, Spain, Schloss Dagstuhl -- Leibniz-Zentrum für Informatik, Open Access Series in Informatics (OASIcs), vol. 93
- Christian Chiarcos (2022/accepted), Inducing Discourse Marker Inventories from Lexical Knowledge Graphs. Accepted at LREC-2022, Marseille, France, June 2022.