UniversalDependencies / tools

Various utilities for processing the data.
GNU General Public License v2.0

Is there a script that can convert from PTB to UD in CoNLL-U style? #31

Closed: freesunshine0316 closed 5 years ago

freesunshine0316 commented 5 years ago

Hi,

Is there a script within this repository that can convert from PTB to UD in CoNLL-U style? Thanks.

sebschu commented 5 years ago

Yes, the UD converter that is part of Stanford NLP can do that.

https://nlp.stanford.edu/software/stanford-dependencies.html

However, the official release still converts to UD v1. You can download an experimental version of CoreNLP that converts to UD v2 from the link below, but note that this version hasn’t been extensively tested at this point.

https://nlp.stanford.edu/~sebschu/files/javanlp-core-src.jar
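For anyone who wants to script the conversion, here is a minimal sketch in Python. It assumes the converter's main class is `edu.stanford.nlp.trees.ud.UniversalDependenciesConverter` and that it accepts a `-treeFile` option and writes CoNLL-U to standard output, as described in the Stanford documentation; please check this against the version you download. The function names and the jar and file paths are placeholders, not part of CoreNLP.

```python
import subprocess

# Assumed main class of the Stanford UD converter; verify against your CoreNLP version.
CONVERTER_CLASS = "edu.stanford.nlp.trees.ud.UniversalDependenciesConverter"


def build_command(jar_path, tree_file):
    """Build the java command line (hypothetical helper, not a CoreNLP API)."""
    return [
        "java", "-mx1g",          # JVM heap size
        "-cp", str(jar_path),     # classpath: the CoreNLP jar you downloaded
        CONVERTER_CLASS,
        "-treeFile", str(tree_file),  # file with PTB-style bracketed trees
    ]


def convert(jar_path, tree_file, out_file):
    """Run the converter and save its standard output as a CoNLL-U file."""
    with open(out_file, "w", encoding="utf-8") as out:
        subprocess.run(build_command(jar_path, tree_file), stdout=out, check=True)
```

A call such as `convert("stanford-corenlp.jar", "treebank.mrg", "treebank.conllu")` would then write the converted trees to `treebank.conllu`, provided Java and the jar are available.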



freesunshine0316 commented 5 years ago

Thanks for the prompt reply, @sebschu. By the way, I found that the dependency annotations of SANCL (https://sites.google.com/site/sancl2012/home/shared-task) differ from UD v1 and also from Stanford Dependencies. Do you know which annotation guidelines it uses? Thanks.

jnivre commented 5 years ago

According to the documentation, it should be a version of Stanford dependencies, produced by the Stanford converter from PTB annotation.

freesunshine0316 commented 5 years ago

Hi, Professor Nivre.

Thank you for your reply. I have just started working on some dependency parsing projects. I thought one key feature of Stanford Dependencies was that prepositions are moved into the edge labels. Maybe I was confused by a figure in https://nlp.stanford.edu/software/stanford-dependencies.html.

By the way, this is Linfeng Song, who interned at IBM this summer and Miryam was my office mate.

nschneid commented 5 years ago

> I thought one key feature of Stanford dependency is that it moves prepositions into the edge labels, maybe I was confused by a Figure in https://nlp.stanford.edu/software/stanford-dependencies.html.

Figures 1 and 2 on that page show different varieties of Stanford Dependencies. Prepositions are included on the edge labels in Figure 1 but not Figure 2.

UD does not have prep or pobj labels; instead case, nmod, and (in UDv2) obl are used for prepositional phrases.
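To make the difference concrete, here is a toy sketch in Python that relabels a basic Stanford Dependencies analysis of "She sat on the chair" in the UD v2 style. This is not how the Stanford converter works (it starts from constituency trees); the `Token` class, the function names, and the rule "nmod under a nominal head, obl otherwise" are simplifying assumptions for illustration only.

```python
from dataclasses import dataclass, replace

# Simplification: heads with these UPOS tags take nmod; all other heads take obl.
NOMINAL_UPOS = {"NOUN", "PROPN", "PRON"}


@dataclass(frozen=True)
class Token:
    id: int
    form: str
    upos: str
    head: int
    deprel: str


def sd_prep_to_ud(tokens):
    """Rewrite prep/pobj pairs so the noun attaches to the governor and the preposition becomes case."""
    by_id = {t.id: t for t in tokens}
    out = dict(by_id)
    for t in tokens:
        if t.deprel != "pobj":
            continue
        prep = by_id.get(t.head)
        if prep is None or prep.deprel != "prep":
            continue
        governor = by_id.get(prep.head)
        is_nominal = governor is not None and governor.upos in NOMINAL_UPOS
        label = "nmod" if is_nominal else "obl"
        out[t.id] = replace(t, head=prep.head, deprel=label)   # noun now depends on the governor
        out[prep.id] = replace(prep, head=t.id, deprel="case")  # preposition now depends on the noun
    return [out[t.id] for t in tokens]


def to_conllu(tokens):
    """Render tokens as 10-column CoNLL-U lines, with unused columns set to '_'."""
    return "\n".join(
        "\t".join([str(t.id), t.form, "_", t.upos, "_", "_", str(t.head), t.deprel, "_", "_"])
        for t in tokens
    )


# Basic Stanford Dependencies style: the preposition heads its object.
sd_tokens = [
    Token(1, "She", "PRON", 2, "nsubj"),
    Token(2, "sat", "VERB", 0, "root"),
    Token(3, "on", "ADP", 2, "prep"),
    Token(4, "the", "DET", 5, "det"),
    Token(5, "chair", "NOUN", 3, "pobj"),
]
ud_tokens = sd_prep_to_ud(sd_tokens)
print(to_conllu(ud_tokens))
```

In the result, "chair" attaches to "sat" as obl and "on" attaches to "chair" as case, so the content word rather than the preposition links the phrase to the verb.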

freesunshine0316 commented 5 years ago

@nschneid Thanks!