Open arademaker opened 1 year ago
@inproceedings{branco-silva-2004-evaluating,
title = "Evaluating Solutions for the Rapid Development of State-of-the-Art {POS} Taggers for {P}ortuguese",
author = "Branco, Ant{\'o}nio and
Silva, Jo{\~a}o",
booktitle = "Proceedings of the Fourth International Conference on Language Resources and Evaluation ({LREC}{'}04)",
month = may,
year = "2004",
address = "Lisbon, Portugal",
publisher = "European Language Resources Association (ELRA)",
url = "http://www.lrec-conf.org/proceedings/lrec2004/pdf/572.pdf",
}
The following authors encode prepositional articles in the lexicon, i.e., these forms are not split in tokenization:
ALENCAR, Leonel Figueiredo de; SCHWARZE, Christoph. French de and en as expressions of the genitive case: a unified analysis within LFG and computational implementation in XLE. D.E.L.T.A., 37-1, 2021 (1-49). https://doi.org/10.1590/1678-460X2021370104
FRANK, A. Eine LFG-Grammatik des Französischen. In: BERMAN, J.; FRANK, A. Deutsche und französische Syntax im Formalismus der LFG. Tübingen: Niemeyer, 1996. p. 97-244.
SCHWARZE, C.; ALENCAR, L. F. de. Lexikalisch-funktionale Grammatik: eine Einführung am Beispiel des Französischen mit computerlinguistischer Implementierung. Tübingen: Stauffenburg, 2016.
The prepositions em and de also contract with demonstrative pronouns and demonstrative adverbs: deste, neste, daqui, etc. The preposition para contracts with the article in colloquial speech: pros (para os), etc. @arademaker, if we focus on parsing, I would split these elements in our grammar. If we don't split them, profound changes will have to be made by hand to the syntax. It's an intellectual challenge, and it might be interesting to take on. But does it pay off? @danflick?
How should we deal with contractions? @leoalenc reported two approaches in the literature. Maybe he can add pointers here.
preposition + article:
deixamos os livros nas [= em as] prateleiras
deixamos os livros em casa
pronoun:
compra-nos um livro
nos compraram um livro
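To make the "split in tokenization" option concrete, here is a minimal Python sketch of a lookup-based splitter. It is not the tokenizer of any grammar mentioned in this thread: the table `CONTRACTIONS`, the set `AMBIGUOUS`, and the function `split_contractions` are hypothetical names, and the table covers only forms cited above. The point it illustrates is that forms such as nas can be expanded by lookup, whereas nos is both a contraction (em + os) and a clitic pronoun, so a lookup alone cannot decide.

```python
# Hypothetical lookup table: surface form -> (preposition, second element).
# Only forms mentioned in this thread are listed; a real table would be larger.
CONTRACTIONS = {
    "nas": ("em", "as"),
    "deste": ("de", "este"),
    "neste": ("em", "este"),
    "daqui": ("de", "aqui"),
    "pros": ("para", "os"),
}

# Forms that are also a clitic pronoun, so splitting them blindly is unsafe.
AMBIGUOUS = {"nos": ("em", "os")}


def split_contractions(tokens, split_ambiguous=False):
    """Return a new token list with known contractions expanded.

    Matching is case-insensitive; expanded parts are emitted in lowercase.
    Ambiguous forms are left untouched unless split_ambiguous is True.
    """
    out = []
    for tok in tokens:
        key = tok.lower()
        if key in CONTRACTIONS:
            out.extend(CONTRACTIONS[key])
        elif split_ambiguous and key in AMBIGUOUS:
            out.extend(AMBIGUOUS[key])
        else:
            out.append(tok)
    return out


if __name__ == "__main__":
    print(split_contractions("deixamos os livros nas prateleiras".split()))
    print(split_contractions("nos compraram um livro".split()))
```

With the default setting, the pronoun nos in "nos compraram um livro" is left alone, but so would a genuine contraction nos (em + os). Resolving that case needs context, for example a tagger or the parser itself, which is part of what the split versus no-split decision has to weigh.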