This is a legacy repository for the STB subcorpora of the Nanyang Technological University - Multilingual Corpus (NTU-MC) project. New editions of the NTU-MC are maintained by the NTU Computational Linguistics Lab.
Please cite the following when using the data/scripts from the NTU-MC:
Liling Tan and Francis Bond. 2011. Building and Annotating the Linguistically Diverse NTU-MC (NTU-Multilingual Corpus). In PACLIC, pages 362–371.
Liling Tan. 2011. Building the foundation text for Nanyang Technological University - Multilingual Corpus (NTU-MC). Bachelor Final Year Project. Nanyang Technological University: Singapore.
Liling Tan and Francis Bond. 2012. Building and annotating the linguistically diverse NTU-MC (NTU-multilingual corpus). International Journal of Asian Language Processing, 22(4):161–174.
Liling Tan and Francis Bond. 2014. NTU-MC Toolkit: Annotating a Linguistically Diverse Corpus. In Proceedings of the 25th International Conference on Computational Linguistics (COLING 2014). Dublin, Ireland.
Other References: