LR-POR / PorGram

A Portuguese HPSG Grammar

contractions #86

Open arademaker opened 1 year ago

arademaker commented 1 year ago

How should we deal with contractions? @leoalenc reported two approaches in the literature; maybe he can add pointers here.

preposition + article

deixamos os livros nas [= em as] prateleiras ('we left the books on the shelves')
deixamos os livros em casa ('we left the books at home')

pronoun

compra-nos um livro ('buy us a book')
nos compraram um livro ('they bought us a book')
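
The two groups of examples suggest why a plain lookup may not be enough: the surface form nos can be the contraction em + os or the pronoun nos. A minimal sketch of keeping both readings for the parser to choose from is given below. The table CONTRACTION_READINGS, the set ALSO_SIMPLE_WORD, and the helper candidate_analyses are hypothetical names invented for this illustration, not part of PorGram, and hyphenated enclitics such as compra-nos are not handled.

```python
# Hypothetical table for this sketch: surface form -> contracted reading.
CONTRACTION_READINGS = {
    "nos": ("em", "os"),
    "nas": ("em", "as"),
}

# Assumption: forms that also exist as ordinary words (the pronoun "nos").
ALSO_SIMPLE_WORD = {"nos"}


def candidate_analyses(token):
    """Return every candidate tokenization of `token` as a list of tuples."""
    form = token.lower()
    readings = []
    # Reading 1: the token is a contraction and is split into its parts.
    if form in CONTRACTION_READINGS:
        readings.append(CONTRACTION_READINGS[form])
    # Reading 2: the token is kept whole (unknown to the table, or ambiguous).
    if form not in CONTRACTION_READINGS or form in ALSO_SIMPLE_WORD:
        readings.append((form,))
    return readings
```

Under these assumptions, candidate_analyses("nos") yields both the split and the unsplit reading, while candidate_analyses("nas") yields only the split one; deciding between readings is left to the grammar.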

leoalenc commented 1 year ago

@inproceedings{branco-silva-2004-evaluating,
    title = "Evaluating Solutions for the Rapid Development of State-of-the-Art {POS} Taggers for {P}ortuguese",
    author = "Branco, Ant{\'o}nio  and
      Silva, Jo{\~a}o",
    booktitle = "Proceedings of the Fourth International Conference on Language Resources and Evaluation ({LREC}{'}04)",
    month = may,
    year = "2004",
    address = "Lisbon, Portugal",
    publisher = "European Language Resources Association (ELRA)",
    url = "http://www.lrec-conf.org/proceedings/lrec2004/pdf/572.pdf",
}

leoalenc commented 1 year ago

The following authors encode preposition + article contractions ("prepositional articles") directly in the lexicon, i.e., these forms are not split during tokenization:

ALENCAR, Leonel Figueiredo de; SCHWARZE, Christoph. French de and en as expressions of the genitive case: a unified analysis within LFG and computational implementation in XLE. D.E.L.T.A., 37-1, 2021 (1-49). https://doi.org/10.1590/1678-460X2021370104

FRANK, A. Eine LFG-Grammatik des Französischen. In: BERMAN, J.; FRANK, A. Deutsche und französische Syntax im Formalismus der LFG. Tübingen: Niemeyer, 1996. p.97-244.

SCHWARZE, C.; ALENCAR, L. F. de. Lexikalisch-funktionale Grammatik: eine Einführung am Beispiel des Französischen mit computerlinguistischer Implementierung. Tübingen: Stauffenburg, 2016.

leoalenc commented 1 year ago

The prepositions em and de also contract with demonstrative pronouns and demonstrative adverbs: deste, neste, daqui, etc. The preposition para contracts with the article in colloquial speech: pros (para os), etc. @arademaker, if we focus on parsing, I would split these elements in our grammar. If we don't split them, profound changes would have to be made to the syntax by hand. It's an intellectual challenge, and it might be interesting to face it. But does it pay off? @danflick?
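
As a rough illustration of the splitting option, the sketch below expands unambiguous contractions in a pre-tokenization step, before the tokens reach the grammar. The table CONTRACTIONS and the function split_contractions are hypothetical names for this example only. The table is deliberately incomplete and leaves out nos, since that form can also be a pronoun and cannot be split by lookup alone.

```python
# Illustrative, incomplete table (an assumption of this sketch):
# surface contraction -> (preposition, following word).
CONTRACTIONS = {
    "nas": ("em", "as"),
    "deste": ("de", "este"),
    "neste": ("em", "este"),
    "daqui": ("de", "aqui"),
    "pros": ("para", "os"),
}


def split_contractions(tokens):
    """Replace each known contraction by its parts; keep other tokens as-is."""
    out = []
    for tok in tokens:
        # Lookup is case-insensitive; unknown tokens pass through unchanged.
        out.extend(CONTRACTIONS.get(tok.lower(), (tok,)))
    return out


print(split_contractions("deixamos os livros nas prateleiras".split()))
# ['deixamos', 'os', 'livros', 'em', 'as', 'prateleiras']
```

A table like this keeps the syntax untouched, at the cost of pushing ambiguous forms to a separate disambiguation step.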