UniversalDependencies / UD_Portuguese-PUD

Parallel Universal Dependencies.

loan words #47

Open vcvpaiva opened 3 years ago

vcvpaiva commented 3 years ago

Two small questions about the sentence below.

  1. 'timings' is definitely a loan word, but "exato" is not. Why is it tagged "X" too? Is that a typo?
  2. 'site' should be a loan word too. The Portuguese word is "sítio", but it is not used for websites, which everyone calls 'sites', with the English pronunciation.

    sent_id = n01147085
    text = O site da Yas Marina Circuit tem timings exatos.
    text_en = The Yas Marina Circuit website has exact timings.
    1 O o DET DT Gender=Masc|Number=Sing 2 det
    2 site site NOUN NN Gender=Masc|Number=Sing 8 nsubj
    3-4 da
    3 de de ADP INDT 5 case
    4 a o DET Gender=Fem|Number=Sing 5 det
    5 Yas Yas PROPN NNP Gender=Fem|Number=Sing 2 nmod
    6 Marina Marina PROPN NNP Foreign=Yes|Gender=Fem|Number=Sing 5 flat
    7 Circuit Circuit PROPN NNP Foreign=Yes|Gender=Fem|Number=Sing 5 flat
    8 tem ter VERB VBC Mood=Ind|Number=Sing|Person=3|Tense=Pres 0 root
    9 timings NOUN X Gender=Masc|Number=Plur 8 obj LoanWord=True
    10 exatos exato ADJ X Gender=Masc|Number=Plur 9 amod SpaceAfter=No
    11 . . PUNCT . 8 punct _

vcvpaiva commented 3 years ago

More incorrect 'loan words', as far as I understand the concept. Neither 'Grã' nor 'Bretanha' is a loan word, yet 'Bretanha' is marked as one while 'Grã' is not, and it is given POS=X when it seems it should be a proper noun.

sent_id = n02027007
text = De acordo com Parker, agentes do Serviço Secreto Russo estão ativos em grande número na Grã Bretanha.
text_en = According to Parker, Russian Secret Service agents are active in large numbers in Great Britain.

Also two mistakes with 'latino americano' (in the sentences below), where 'latino' is considered an adjective but 'americano' is X, when it should be an adjective.

newdoc id = n05008
sent_id = n05008012
text = Como resultado, Trump não está muito preocupado sobre o voto latino americano em um nível nacional.
text_en = As a result, Trump isn't very worried about the Latin American vote at a national level.

sent_id = n05008018
text = A votação antecipada sugere que desta vez os latino americanos votarão em grande número, mas é incerto se o aumento terá um impacto.
text_en = The early voting suggests that this time the Latin Americans will come out to vote in greater numbers, but it is unclear whether the increase will have an impact.
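
In case it helps to find similar cases across the treebank, here is a rough sketch of a check one could run over a CoNLL-U file. It flags tokens whose XPOS is X while the UPOS is something else (like 'exatos' above), and tokens carrying LoanWord=True in MISC. The function names `parse_misc` and `find_suspect_tokens` are made up for this example, and the sample lines are my reconstruction of three tokens from the first sentence, with the `_` placeholders assumed rather than copied from the file.

```python
def parse_misc(misc):
    """Turn a MISC string such as 'LoanWord=True|SpaceAfter=No' into a dict."""
    if misc == "_":
        return {}
    return dict(item.split("=", 1) for item in misc.split("|") if "=" in item)


def find_suspect_tokens(conllu_text):
    """Return (sent_id, form, upos, xpos, is_loan) for tokens worth a second look."""
    suspects = []
    sent_id = None
    for line in conllu_text.splitlines():
        if line.startswith("#"):
            if line.startswith("# sent_id"):
                sent_id = line.split("=", 1)[1].strip()
            continue
        cols = line.split("\t")
        # Skip blank lines, multiword ranges such as 3-4, and empty nodes.
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        form, upos, xpos, misc = cols[1], cols[3], cols[4], cols[9]
        is_loan = parse_misc(misc).get("LoanWord") == "True"
        # Flag XPOS=X on a token whose UPOS is not X, or any marked loan word.
        if (xpos == "X" and upos != "X") or is_loan:
            suspects.append((sent_id, form, upos, xpos, is_loan))
    return suspects


# Assumed reconstruction of three tokens; '_' fields are guesses, not file contents.
SAMPLE = "\n".join([
    "# sent_id = n01147085",
    "\t".join(["8", "tem", "ter", "VERB", "VBC",
               "Mood=Ind|Number=Sing|Person=3|Tense=Pres", "0", "root", "_", "_"]),
    "\t".join(["9", "timings", "_", "NOUN", "X",
               "Gender=Masc|Number=Plur", "8", "obj", "_", "LoanWord=True"]),
    "\t".join(["10", "exatos", "exato", "ADJ", "X",
               "Gender=Masc|Number=Plur", "9", "amod", "_", "SpaceAfter=No"]),
])

for row in find_suspect_tokens(SAMPLE):
    print(row)
```

On the sample this lists 'timings' (marked as a loan word) and 'exatos' (XPOS X with UPOS ADJ), and skips 'tem'. Whether a flagged token is really an error still needs a human decision; the script only narrows down where to look.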