UniversalDependencies / UD_Turkish-Kenet

Summary

The Turkish-Kenet UD Treebank is the largest treebank of Turkish. It contains 18,700 manually annotated sentences and 178,700 tokens. The corpus consists of dictionary example sentences.

Introduction

This treebank is fully manually annotated and contains 18,700 sentences and 178,700 tokens. The sentences are the dictionary example sentences included in the Turkish wordnet Kenet, which come from the dictionary of the Turkish Language Association. The domain is general because the dictionary examples include sentences from novels, everyday speech, and some lines of poetry. The data is split into 9,350 training sentences and 9,350 test sentences.
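The sentence and token counts above can be checked against the released data files. The following is a minimal sketch, assuming the data is in the standard CoNLL-U format used by UD treebanks; the function name `count_conllu` and the two sample sentences are made up for illustration and are not taken from the treebank.

```python
import io


def count_conllu(stream):
    """Count sentences and syntactic words in CoNLL-U text read from `stream`."""
    sentences = 0
    words = 0
    in_sentence = False
    for line in stream:
        line = line.rstrip("\n")
        if not line.strip():
            # A blank line closes the current sentence.
            if in_sentence:
                sentences += 1
                in_sentence = False
            continue
        if line.startswith("#"):
            # Sentence-level comment such as "# text = ...".
            continue
        token_id = line.split("\t", 1)[0]
        # Count only plain integer IDs; this skips multiword-token
        # ranges ("1-2") and empty nodes ("1.1").
        if token_id.isdigit():
            words += 1
        in_sentence = True
    if in_sentence:
        # The input did not end with a blank line.
        sentences += 1
    return sentences, words


# Hypothetical sample, NOT real treebank content; annotation columns
# other than ID, FORM, HEAD and DEPREL are left as "_".
sample = (
    "# text = Kitap okudum.\n"
    "1\tKitap\t_\t_\t_\t_\t2\tobj\t_\t_\n"
    "2\tokudum\t_\t_\t_\t_\t0\troot\t_\t_\n"
    "3\t.\t_\t_\t_\t_\t2\tpunct\t_\t_\n"
    "\n"
    "# text = Geldi.\n"
    "1\tGeldi\t_\t_\t_\t_\t0\troot\t_\t_\n"
    "2\t.\t_\t_\t_\t_\t1\tpunct\t_\t_\n"
    "\n"
)

print(count_conllu(io.StringIO(sample)))
```

To run this on a real file, pass an open file handle instead of the `io.StringIO` sample, for example `open(path, encoding="utf-8")`, where `path` points to one of the `.conllu` files in this repository. Whether the result matches the token figure quoted above depends on how multiword tokens are counted, which this sketch does not try to settle.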

Acknowledgments

We wish to thank all the contributors and Starlang Software for funding and supporting this work.

Changelog

=== Machine-readable metadata (DO NOT REMOVE!) ================================
Data available since: UD v2.8
License: CC BY-SA 4.0
Includes text: yes
Genre: grammar-examples
Lemmas: converted from manual
UPOS: converted from manual
XPOS: converted from manual
Features: converted from manual
Relations: converted from manual
Contributors: Kuzgun, Aslı; Cesur, Neslihan; Yıldız, Olcay Taner; Kuyrukçu, Oğuzhan; Yenice, Arife Betül; Arıcan, Bilge Nas; Sanıyar, Ezgi
Contributing: elsewhere
Contact: kuzgunasli@gmail.com / olcay.yildiz@ozyegin.edu.tr 
===============================================================================