AUX problem: se tornar = become #49

Open vcvpaiva opened 3 years ago

vcvpaiva commented 3 years ago

NB: It appears that annotators thought that 'aspect' verbs were to be marked as auxiliary verbs. The corresponding verbs in English are not considered AUX.

'tornar-se = become' (infinitive) shouldn't be an auxiliary. I believe this is a mistake.

newdoc id = n01034 sent_id = n01034033 text = Se o Donald Trump se tornar presidente, o governo daqui ainda terá de trabalhar com ele para avançar com alguma agenda partilhada que exista, para garantir que os negócios e os interesses canadenses estão representados em Washington. text_en = "If Donald Trump becomes president, the government here will still have to work with him to advance whatever shared agenda there is, to ensure that Canadian businesses and interests are represented in Washington."

Second example: newdoc id = n01076 sent_id = n01076006 text = Uma vitória de Donald Trump imediatamente tornaria o mundo mais preocupante e inseguro do que ele já é. text_en = A Donald Trump victory would immediately make the world more worrying and unsettled than it already is.

tornar is not an auxiliary.

Third example: sent_id = w01028069 text = A reação nacional aos eventos do Kansas demonstrou quão profundamente dividido tinha se tornado o país. text_en = National reaction to the events in Kansas demonstrated how deeply divided the country had become.

Fourth example: sent_id = w01050070 text = Bogd Khaan disse que tanto a Mongólia como a China tinham sido administradas pelos manchus durante os Qing, e após a queda da dinastia Qing em 1911, o contrato de submissão mongol aos manchus tinha se tornado inválido. text_en = Bogd Khaan said that both Mongolia and China had been administered by the Manchu during the Qing, and after the fall of the Qing dynasty in 1911, the contract of Mongolian submission to the Manchu had become invalid.

Fifth: sent_id = w01129053 text = Em 2003, dominou na classificação geral de pontos e liderou as últimas 33 de 36 corridas, se tornando o campeão da NASCAR Winston Cup 2003, o último piloto a ter esse título. text_en = In 2003, he dominated in the points standings and leading the last 33 of 36 races and became the 2003 NASCAR Winston Cup champion, the last driver to ever hold that title.

vcvpaiva commented 3 years ago

'acabar = end up' is not an auxiliary, I believe.

sent_id = n01045035 text = No julgamento de fraude e quebra de confiança de Duffy, o juiz acabou por determinar que estavam dentro das regras do Senado quando declarou Duffy livre de qualquer acusação. text_en = The judge in Duffy's fraud and breach of trust trial ultimately ruled they were within the Senate's rules when he cleared Duffy of all charges.

A second example: sent_id = w01079077 text = A Espanha tinha expulsado a sua população sefardita em 1492; muitos desses judeus espanhóis trocaram Espanha por Portugal, mas acabaram por ser alvos lá também. text_en = Spain had expelled its Sephardic population in 1492; many of these Spanish Jews left Spain for Portugal but eventually were targeted there as well.

The original expression in English 'were targeted' becomes 'acabaram por ser alvos', not auxiliary.

vcvpaiva commented 3 years ago

the verb 'sentir= to feel' is not an auxiliary, I believe.

sent_id = n01061023 text = Quando estou interpretando ele, eu me sinto poderoso, explicou o personificador de Donald Trump, John Di Domenico, à Slate, no ano passado. text_en = “When I’m playing him, I feel powerful,” the Donald Trump impersonator John Di Domenico explained to Slate last year.

(But also quotes missing!)

vcvpaiva commented 3 years ago

the verb 'passar=spend' is not auxiliary.

sent_id = n01064096 text = Como muitas pessoas que conheço, passei os meses recentes acordado até tarde, lendo as sondagens em terror. text_en = Like many people I know, I’ve spent recent months staying up late, reading polls in terror.

the corresponding verb in the English version is the root (spend).

'polls=sondagens' also does not work for me in Brazilian PT: we say 'pesquisas de opinião', I think.

Another example of 'passar=let go=let be' that is not really auxiliary.

sent_id = n01106015 text = O senhor Hopley acrescentou: "O aumento do risco político não deve passar despercebido". text_en = Ms Hopley added: “The spike in political risk should not go unnoticed.”

vcvpaiva commented 3 years ago

The verb 'parar=stop' is not auxiliary.

newdoc id = n01071 sent_id = n01071009 text = Uma central elétrica a carvão em Badarpur, sudeste de Deli, vai parar de operar por 10 dias, juntamente com os geradores a diesel na cidade. text_en = A coal-fired power station in Badarpur, south-east Delhi, will stop operating for 10 days, along with diesel generators in the city.

in fact it is the root in the English version.

vcvpaiva commented 3 years ago

The verb 'to borrow' is rendered in Portuguese by the expression "pedir emprestado" (ask for a loan), but 'pedir=ask for' is not an auxiliary verb.

sent_id = n01084045 text = Esta taxa de 3% também se aplica a titulares de cartão Nectar que pensam em pedir emprestado £15,001-£19,999, durante um período de dois a três anos. text_en = That 3% rate also applies to Nectar cardholders looking to borrow from £15,001-£19,999 over a period of between two and three years.

vcvpaiva commented 3 years ago

The verb 'acabar=end up' is not auxiliary, it is the root of the sentence.

sent_id = n01086016 text = Acaba ouvindo mais intensamente o barulho seguinte e ficando mais irritado quando ele surge. text_en = You end up listening more acutely for the next noise and getting more irritated when it comes.

I believe (but I'm not sure) that the verb 'ficar=become' is not a copular verb either. But it is also annotated as copula in

sent_id = n01091017 text = O seu representante de políticas da natureza, Jeff Knott, declarou: "Eu ficaria espantado se uma proibição ou um licenciamento fosse com base nestes fundamentos". text_en = Their head of nature policy, Jeff Knott, stated: “I’d be amazed if either a ban or licensing was introduced off the back of it”.

In this case the English expression 'I'd be amazed=Eu ficaria espantado' really is a copula, but the Portuguese one is not. 'surpreso' would be better than 'espantado', I think.

vcvpaiva commented 3 years ago

The verb 'continuar=to continue=to keep' is not an auxiliary verb.

newdoc id = n01086 sent_id = n01086013 text = O mundo pode estar enfurecido e absurdo - no entanto pelo menos alguém tem dignidade para continuar a protestar contra esse fato. text_en = The world may be enraging and absurd – yet at least someone has the self-respect to keep protesting against that fact.

A second example of 'continuar' annotated as auxiliary: newdoc id = n01090 sent_id = n01090004 text = Olhei para o motocross e quanto mais olhava, continuava a surgir a cara desta mulher, em fotografias que pareciam ser dos anos 70. text_en = I looked at motocross and the more I looked, this one woman’s face kept coming up, in photographs that looked as if they were from the 1970s.

A third example: sent_id = n01144023 text = Ele continuou aparecendo no meu celeiro e eu brinquei com o cara dizendo que, claramente, ele não queria ser vendido. text_en = He kept on turning up at my barn and I joked to the guy that clearly he didn't want to be sold.

Fourth example: sent_id = w01105057 text = Foi tão bem sucedido que os militares continuaram a usá-lo durante muitos anos após a guerra e ainda estava em uso em alguns países na década de 1980. text_en = It was so successful that the military continued to use it for many years after the war, and it was still in use in some countries in the 1980s.

vcvpaiva commented 3 years ago

The verb 'considerar=consider' is not a copular verb

newdoc id = n01099 sent_id = n01099035 text = Depois, procure no mercado matinal (6:30 - 10:00): pirulitos de arroz, casulos de vespa (as pupas são consideradas um petisco), asas de frango, casca e folhas de moscadeira, sapos vivos e bagre. text_en = Afterwards, browse at the morning market (6.30-10am): rice lollipops, wasp cocoons (the pupae are considered a delicacy), buffalo lung, betel-nut bark and leaves, live toads and catfish.

also very surprising to see asas de frango=chicken wings=buffalo lung???

A second example: newdoc id = w01027 sent_id = w01027007 text = O clima é tão seco que estas planícies são por vezes consideradas parte do Saara. text_en = The climate is so dry that these plains are sometimes thought of as part of the Sahara.

Third: newdoc id = w01063 sent_id = w01063048 text = Os deuses representavam tipicamente as necessidades práticas da vida quotidiana, e eram escrupulosamente concedidos os ritos e oferendas considerados apropriados. text_en = The gods represented distinctly the practical needs of daily life, and they were scrupulously accorded the rites and offerings considered proper.

vcvpaiva commented 3 years ago

The verb 'começar=to begin' is not an auxiliary verb, I believe.

newdoc id = n01108 sent_id = n01108003 text = As empresas esperavam começar a decrescer em julho, imediatamente após o voto do Brexit, mas em vez disso têm conseguido manter o crescimento constante. text_en = Businesses had expected to start contracting in July, immediately after the Brexit vote, but instead have managed to keep growing steadily.

Also in the same sentence the verb 'manter=to keep' is not an auxiliary.

Another example:

sent_id = n01137010 text = Eu não tinha assistido muitos dos episódios e, então, o meu telefone começou a acender. text_en = I hadn't seen a lot of the episodes and then my phone started lighting up. (also telefones in Brazil don't light up, they ring=tocar)

A third example: sent_id = n01141020 text = O movimento destacou o desejo da empresa de que os usuários comecem a pensar nos seus produtos como algo mais do que apenas ferramentas de produtividade. text_en = The move highlighted the company's desire for users to start thinking of its products as more than just productivity tools.

Fourth: sent_id = w01009017 text = Enquanto a identificação de massas de ar foi originalmente usada nas previsões meteorológicas durante a década de 1950, os meteorologistas começaram a estabelecer climatologias sinóticas baseadas nessa ideia em 1973. text_en = While air mass identification was originally used in weather forecasting during the 1950s, climatologists began to establish synoptic climatologies based on this idea in 1973.

Fifth example: sent_id = w01040102 text = Enquanto um vento constante começa a soprar, partículas finas que estão no chão exposto começam a vibrar. text_en = As a steady wind begins to blow, fine particles lying on the exposed ground begin to vibrate.

Sixth: newdoc id = w01048 sent_id = w01048027 text = Na década de 1350, o rei Gongmin estava finalmente livre para reformar o governo de Goryeo quando a dinastia Yuan começou a desmoronar. text_en = In the 1350s, King Gongmin was free at last to reform the Goryeo government when the Yuan dynasty began to crumble.

Seventh: sent_id = w01131076 text = Stalin tinha começado a encorajar Abakumov a formar sua própria rede dentro do MGB para combater o domínio de Beria dos ministérios do poder. text_en = Stalin had begun to encourage Abakumov to form his own network inside the MGB to counter Beria's dominance of the power ministries.

newdoc id = w01133 sent_id = w01133014 text = Starlin assumiu o tema seguinte como argumentista, e começou a desenvolver um elaborado arco de histórias centrado no vilão Thanos, que se estendeu a uma série de títulos Marvel. text_en = Starlin took over as plotter the following issue, and began developing an elaborate story arc centered on the villainous Thanos, and spread across a number of Marvel titles.

sent_id = w01135038 text = Em janeiro de 2011, Blunt começou a gravar um filme americano de ficção científica, Looper, realizado por Rian Johnson e coprotagonizado por Bruce Willis e Joseph Gordon-Levitt; o filme foi lançado em setembro de 2012. text_en = In January 2011, Blunt began filming an American science-fiction film, Looper, directed by Rian Johnson and co-starring Bruce Willis and Joseph Gordon-Levitt; the film was released in September 2012.

vcvpaiva commented 3 years ago

The verb 'voltar=come back' is neither an auxiliary nor a copula, although this analysis marks it as one.

sent_id = n01115013 text = O ano passado foi um ano incrível e estou pronto para que possamos voltar ainda melhor em 2017.

1 O o DET DT Gender=Masc|Number=Sing 2 det
2 ano ano NOUN NN Gender=Masc|Number=Sing 6 nsubj
3 passado passado ADJ JJ Gender=Masc|Number=Sing 2 amod
4 foi AUX VBC Mood=Ind|Number=Sing|Person=3|Tense=Past 6 cop
5 um um DET DT Gender=Masc|Number=Sing 6 det
6 ano ano NOUN NN Gender=Masc|Number=Sing 0 root
7 incrível incrível ADJ JJ Gender=Masc|Number=Sing 6 amod
8 e e CCONJ CC 10 cc
9 estou AUX VBC Mood=Ind|Number=Sing|Person=1|Tense=Pres 10 cop
10 pronto pronto ADJ JJ Gender=Masc|Number=Sing 6 conj
11 para SCONJ IN 16 mark
12 que ADP IN 11 fixed
13 possamos AUX VBC Mood=Sub|Number=Plur|Person=1|Tense=Pres 16 aux
14 voltar AUX VB 16 cop
15 ainda ainda ADV RB 16 advmod
16 melhor ADJ JJR Gender=Masc|Number=Plur 10 advcl
17 em em ADP IN 18 case
18 2017 NUM CD Gender=Masc 16 obl SpaceAfter=No
19 . . PUNCT . 6 punct

text_en = Last year was an incredible year and I’m ready for us to come back even better in 2017.

vcvpaiva commented 3 years ago

The verb 'proteger=to protect' is not auxiliary.

sent_id = n01128025 text = Primeiro, nós devemos educar as pessoas sobre como se proteger melhor, online. text_en = First, we must educate people on how to protect themselves better online.

1 Primeiro primeiro ADV RB 5 advmod SpaceAfter=No
2 , , PUNCT , 1 punct
3 nós PRON PRP Case=Nom|Number=Plur|Person=1 5 nsubj
4 devemos dever AUX VBC Mood=Ind|Number=Plur|Person=1|Tense=Pres 5 aux
5 educar VERB VB 0 root
6 as o DET DT Gender=Fem|Number=Plur 7 det
7 pessoas pessoa NOUN NN Gender=Fem|Number=Plur 5 obj
8 sobre ADP IN 12 case
9 como como ADV WRB 12 advmod
10 se PRON SE Person=3 12 expl:pv
11 proteger AUX VB 12 cop
12 melhor melhor ADJ JJR Gender=Masc|Number=Sing 5 xcomp SpaceAfter=No
13 , , PUNCT , 14 punct
14 online online ADV RB 12 advmod SpaceAfter=No
15 . . PUNCT . 5 punct _
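For anyone who wants to spot this kind of tagging mechanically, here is a minimal Python sketch that parses CoNLL-U token lines and lists every token tagged AUX together with its dependency relation. The `SAMPLE` string is my own abridged reconstruction of tokens 8 to 12 of the sentence above, with `_` filled in for fields that are not visible in the listing, so treat it as an assumption rather than the treebank's actual content. The names `FIELDS`, `parse_tokens` and `aux_tokens` are hypothetical helpers.

```python
# The ten tab-separated columns of a CoNLL-U token line.
FIELDS = ["id", "form", "lemma", "upos", "xpos",
          "feats", "head", "deprel", "deps", "misc"]

# Abridged, hypothetical reconstruction of tokens 8-12 of n01128025.
# "_" stands for fields that are not visible in the listing quoted above.
SAMPLE = "\n".join([
    "# sent_id = n01128025",
    "8\tsobre\t_\tADP\tIN\t_\t12\tcase\t_\t_",
    "9\tcomo\tcomo\tADV\tWRB\t_\t12\tadvmod\t_\t_",
    "10\tse\t_\tPRON\tSE\tPerson=3\t12\texpl:pv\t_\t_",
    "11\tproteger\t_\tAUX\tVB\t_\t12\tcop\t_\t_",
    "12\tmelhor\tmelhor\tADJ\tJJR\tGender=Masc|Number=Sing\t5\txcomp\t_\tSpaceAfter=No",
])


def parse_tokens(conllu_text):
    """Return one dict per token line, skipping comments and blank lines."""
    tokens = []
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) != len(FIELDS):
            continue  # not a well-formed token line
        tokens.append(dict(zip(FIELDS, cols)))
    return tokens


def aux_tokens(tokens):
    """Return (id, form, deprel) for every token whose UPOS is AUX."""
    return [(t["id"], t["form"], t["deprel"])
            for t in tokens if t["upos"] == "AUX"]


flagged = aux_tokens(parse_tokens(SAMPLE))
print(flagged)
```

On the reconstructed sample this flags only `proteger`, attached as `cop` to `melhor`, which is the analysis questioned in this comment.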

vcvpaiva commented 3 years ago

The verb 'costumar=get used to' is neither an auxiliary nor a copula.

newdoc id = n01135 sent_id = n01135002 text = O apetite insaciável da China por galinha frita costumava ser uma grande razão pela qual investidores amavam o pai do KFC, Yum Brand. text_en = China's insatiable appetite for fried chicken used to be a big reason why investors loved KFC-parent Yum Brands.

vcvpaiva commented 3 years ago

The verb 'chamar=be called' is not auxiliary.

newdoc id = w01031 sent_id = w01031003 text = O estudo dos vulcões é chamado vulcanologia. text_en = The study of volcanoes is called volcanology, sometimes spelled vulcanology.

newdoc id = w01135 sent_id = w01135034 text = Em 2011, Blunt foi nomeada embaixadora da nova fragrância da Yves Saint Laurent, Opium. text_en = In 2011, Blunt was named the ambassadress of the new Yves Saint Laurent fragrance, Opium.

vcvpaiva commented 3 years ago

The verb 'tender=to tend' is not auxiliary

sent_id = w01031015 text = Este magma tende a ser muito viscoso devido ao seu elevado teor de sílica, por isso muitas vezes não atinge a superfície, esfriando na profundidade. text_en = This magma tends to be very viscous due to its high silica content, so it often does not reach the surface but cools at depth.

vcvpaiva commented 3 years ago

The verb 'proclamar=to proclaim' is not auxiliary.

sent_id = w01039065 text = 1987 foi proclamado "O Ano do Rio" pelo então Presidente da Câmara de Brisbane, Sallyanne Atkinson. text_en = 1987 was proclaimed the "Year of the River" by the Lord Mayor of Brisbane at the time, Sallyanne Atkinson.

Second example: newdoc id = w01094 sent_id = w01094022 text = Ford T foi proclamado como o carro mais influente do século XX nos prêmios internacionais Carro do Século. text_en = Ford T was proclaimed as the most influential car of the 20th century in the international Car of the Century awards.

vcvpaiva commented 3 years ago

The verb 'morrer=to die' is not auxiliary.

sent_id = w01057041 text = Foi predito que ele iria morrer de velhice depois de uma vida irrelevante, ou morrer jovem em um campo de batalha ganhando imortalidade através da poesia. text_en = It was foretold that he would either die of old age after an uneventful life, or die young in a battlefield and gain immortality through poetry.

vcvpaiva commented 3 years ago

The verb 'declarar=to declare' is not auxiliary.

sent_id = w01111093 text = Winstone foi declarado falido em 4 de outubro de 1988 e novamente em 19 de março de 1993. text_en = Winstone was declared bankrupt on 4 October 1988 and again on 19 March 1993.

Second example: newdoc id = w01125 sent_id = w01125034 text = Wilkes foi reeleito e expulso mais duas vezes, antes de a Câmara dos Comuns decidir que a sua candidatura era inválida e declarar o segundo classificado vencedor. text_en = Wilkes was re-elected and expelled twice more, before the House of Commons resolved that his candidature was invalid and declared the runner-up as the victor.

vcvpaiva commented 3 years ago

The verb 'deixar=to leave' is not auxiliary.

sent_id = w01113058 text = No entanto, seu rival Prabowo Subianto também declarou vitória, deixando os cidadãos indonésios confusos. text_en = However, his rival Prabowo Subianto also declared victory, leaving Indonesian citizens confused.

sent_id = w01127096 text = A esta altura, os gastos extravagantes e uma série de maus investimentos pelos seus investidores a tinham deixado quase falida. text_en = By this time, extravagant spending and a string of bad investments by her investors had left her nearly broke.

newdoc id = w01138 sent_id = w01138045 text = Durante este tempo, Marcelle foi deixada frequentemente sozinha no quarto enquanto Piaf e Mômone estavam nas ruas ou cantando no clube. text_en = During this time, Marcelle was often left alone in the room while Piaf and Mômone were out on the streets or at the club singing.

dan-zeman commented 3 years ago

I agree with all of the above. Ideally, I would remove the verbs from the list at https://quest.ms.mff.cuni.cz/udvalidator/cgi-bin/unidep/langspec/specify_auxiliary.pl?lcode=pt (which also shows that Portuguese UD currently uses significantly more "auxiliary" verbs than any other Romance language). But I'm afraid that doing so would also invalidate the other Portuguese treebanks, which had the same problem, if I recall correctly.
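To see how many distinct lemmas a given treebank actually tags as AUX, something like the following Python sketch could be run over each CoNLL-U file. It is only an illustration: `aux_lemma_counts` is a hypothetical helper, and the `DEMO` lines are invented toy data, not sentences from any treebank.

```python
from collections import Counter


def aux_lemma_counts(conllu_lines):
    """Count lemmas of tokens tagged AUX in an iterable of CoNLL-U lines."""
    counts = Counter()
    for line in conllu_lines:
        cols = line.rstrip("\n").split("\t")
        # Keep only regular token lines: 10 columns and a plain integer ID.
        # This skips comments, blank lines, multiword ranges (e.g. 3-4)
        # and empty nodes (e.g. 5.1).
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        if cols[3] == "AUX":
            counts[cols[2]] += 1
    return counts


# Invented toy input, for illustration only.
DEMO = [
    "# sent_id = demo-1",
    "1\tfoi\tser\tAUX\t_\t_\t2\tcop\t_\t_",
    "2\tbom\tbom\tADJ\t_\t_\t0\troot\t_\t_",
    "",
    "# sent_id = demo-2",
    "1\tcomeçou\tcomeçar\tAUX\t_\t_\t2\taux\t_\t_",
    "2\tchover\tchover\tVERB\t_\t_\t0\troot\t_\t_",
]

counts = aux_lemma_counts(DEMO)
for lemma, n in counts.most_common():
    print(lemma, n)
```

For a real file, the function can take the open file object directly, e.g. `with open(path, encoding="utf-8") as f: aux_lemma_counts(f)`, where `path` is whichever treebank file is being checked.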

vcvpaiva commented 3 years ago

hmm of your list, I think

acabar agredir ameaçar andar chegar começar continuar costumar deixar ficar interpelar interpretar parar parecer passar permitir procurar quer querer recomeçar tender tornar

are not UD auxiliaries, but

dever | poder | querer | ir | vir | haver are missing from our list and they are already in other Romance languages.

I'm uncertain about saber | fazer

which both Italian and French have as auxiliaries.

The confusion is understandable: the English guidelines mentioned that aspect verbs could be auxiliaries, and these are all aspectual verbs in Portuguese, but their English counterparts are main verbs.
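The three lists in this comment can be written down as plain sets, which makes it easy to look up a lemma and to check the lists against each other. This is just a sketch: the set names and the `classify` helper are hypothetical, and the contents are copied from the comment as written.

```python
# Lemmas listed in the comment as "not UD auxiliaries".
NOT_AUX = set(
    "acabar agredir ameaçar andar chegar começar continuar costumar deixar "
    "ficar interpelar interpretar parar parecer passar permitir procurar "
    "quer querer recomeçar tender tornar".split()
)

# Lemmas listed as missing from the Portuguese list.
MISSING_AUX = {"dever", "poder", "querer", "ir", "vir", "haver"}

# Lemmas the comment is uncertain about.
UNCERTAIN = {"saber", "fazer"}


def classify(lemma):
    """Return the labels of every list that mentions the lemma."""
    labels = []
    if lemma in NOT_AUX:
        labels.append("not_aux")
    if lemma in MISSING_AUX:
        labels.append("missing_aux")
    if lemma in UNCERTAIN:
        labels.append("uncertain")
    return labels


# Lemmas that appear on both sides and so need a decision.
overlap = NOT_AUX & MISSING_AUX

print(classify("tornar"))
print(classify("saber"))
print(sorted(overlap))
```

The intersection is worth a look: as the lists stand, `querer` appears both among the verbs to remove and among the verbs to add.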

dan-zeman commented 3 years ago

I think the question is where one draws a line between syntax and semantics. In English (as in some other Indo-European languages), the perfect aspect is expressed using an auxiliary verb (to have) and there is no doubt that this verb is auxiliary in this context. But other verbs that could be semantically classified as aspectual, such as to begin to do something, are not considered auxiliary because they do not pass syntactic tests for auxiliaries valid in English. Such tests differ language by language, so it is not so easy to say that a translation of an auxiliary verb should be also auxiliary in the target language (but I still maintain that the huge difference between Portuguese and the other Romance languages is at least suspicious).

vcvpaiva commented 3 years ago


> I think the question is where one draws a line between syntax and semantics.

makes sense.

I'm no linguist, but all the places I looked at say 'ser' and 'estar' (copula), and 'ter' and 'haver'.

This work has a longer list of auxiliaries https://www.researchgate.net/publication/221050083_Auxiliary_Verbs_and_Verbal_Chains_in_European_Portuguese

which as you say are AUX according to the traditional Portuguese grammar. I noticed from your list that consensus has not been reached in the other Romance languages either.