codeaudit / dkpro-core-asl

Automatically exported from code.google.com/p/dkpro-core-asl

Change span for Dependency annotation type and remove Governor and Dependency types #72

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
Currently we have two ways to model dependencies:

- using only the Dependency type
- using the Governor and Dependency types as well.

The problem is that every dependency annotation in a sentence takes the whole 
sentence as its text span. Consider the example sentence on the Stanford page, 
"My dog also likes eating sausage": we currently create 6 "Dependency" 
annotations, each spanning the whole sentence. So we can't easily reach the 
dependency information from a token span (say I want to know the dependency 
relations in which a specific token participates). It would be nicer if, for 
example, the Dependency annotation marked the span of the governor (the head 
of the dependency relation); the dependent and the dependency type could then 
stay as features of the annotation. 
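
A minimal sketch of the lookup problem described above, in plain Python rather than the actual DKPro/UIMA API. The `Token` and `Dependency` classes, the `at_span` helper, and the five relations are assumptions written by hand for illustration (they are not parser output, and the root relation is left out):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    begin: int
    end: int
    text: str


@dataclass(frozen=True)
class Dependency:
    begin: int          # offsets of the annotation itself
    end: int
    governor: Token     # feature pointing to the head token
    dependent: Token    # feature pointing to the dependent token
    dep_type: str


def tokenize(sentence):
    """Whitespace tokenizer that records character offsets."""
    tokens, pos = [], 0
    for word in sentence.split():
        begin = sentence.index(word, pos)
        end = begin + len(word)
        tokens.append(Token(begin, end, word))
        pos = end
    return tokens


def at_span(deps, token):
    """Dependencies whose own offsets equal the token's offsets."""
    return [d for d in deps if (d.begin, d.end) == (token.begin, token.end)]


sentence = "My dog also likes eating sausage"
my, dog, also, likes, eating, sausage = tokenize(sentence)

# Hand-written (type, governor, dependent) triples, for illustration only.
relations = [("poss", dog, my), ("nsubj", likes, dog), ("advmod", likes, also),
             ("xcomp", likes, eating), ("dobj", eating, sausage)]

# Behaviour criticised in this issue: every Dependency spans the sentence.
sentence_span = [Dependency(0, len(sentence), g, d, t) for t, g, d in relations]
# Behaviour proposed in this issue: every Dependency spans its governor.
governor_span = [Dependency(g.begin, g.end, g, d, t) for t, g, d in relations]

print(at_span(sentence_span, likes))                          # nothing found
print([d.dep_type for d in at_span(governor_span, likes)])    # relations headed by "likes"
```

With sentence-wide spans, a lookup by token offsets finds nothing and the caller has to scan every dependency in the sentence; with governor spans, the same lookup returns the relations headed by that token.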

Original issue reported on code.google.com by richard.eckart on 24 Jun 2012 at 7:57

GoogleCodeExporter commented 9 years ago
That would be a nice enhancement.

Original comment by torsten....@gmail.com on 25 Jun 2012 at 6:54

GoogleCodeExporter commented 9 years ago

Original comment by richard.eckart on 13 Oct 2012 at 6:31

GoogleCodeExporter commented 9 years ago

Original comment by richard.eckart on 16 Feb 2013 at 11:03

GoogleCodeExporter commented 9 years ago

Original comment by richard.eckart on 16 Feb 2013 at 11:04

GoogleCodeExporter commented 9 years ago
Issue 163 has been merged into this issue.

Original comment by richard.eckart on 24 Jun 2013 at 8:37

GoogleCodeExporter commented 9 years ago

Original comment by richard.eckart on 24 Jun 2013 at 10:23

GoogleCodeExporter commented 9 years ago
This issue was closed by revision r1497.

Original comment by richard.eckart on 24 Jun 2013 at 10:32

GoogleCodeExporter commented 9 years ago
This issue was closed by revision r1500.

Original comment by richard.eckart on 25 Jun 2013 at 9:17

GoogleCodeExporter commented 9 years ago
We (the EXCITEMENT team in Heidelberg) have been designing some code that needs 
to access POS tags and dependency relations together. One example: finding a 
specific POS tag (e.g. the prefix of a separable verb) and following its 
dependency link to reconstruct the lemma of the separable verb. 

While coding this, we realized that iterating over the tokens and getting the 
Dependency annotation that spans each token (as the dependent) would be ideal 
in some cases (e.g. doing something in a single pass over the tokens). 

Compared to putting the span on the governor, putting the span of the 
Dependency on the dependent can be more useful, as Richard wrote on the DKPro 
user mailing list: 

=== snip === 
You are suggesting now that it would be better to use the span of the 
dependent. Given that there should only be one dependency per dependent, but 
multiple dependencies per governor, I think you have a good point. It may be a 
good idea to change the behavior again. Would you care reopening the issue and 
explaining your rationale there? 
=== 

With this setup, we can (mostly, I guess) safely assume one dependency per 
token, which makes some use cases simpler. 
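
A rough sketch of the single-pass use case described above, again in plain Python and not the DKPro/UIMA API. The classes, the German example sentence, its tags and lemmas, and the dependency labels are all hand-written assumptions; the dictionary keyed by offsets relies on the "one dependency per dependent" assumption, which may not always hold:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    begin: int
    end: int
    text: str
    pos: str
    lemma: str


@dataclass(frozen=True)
class Dependency:
    begin: int          # here: offsets of the dependent
    end: int
    governor: Token
    dependent: Token
    dep_type: str


# "Er ruft sie an" ("He calls her"), annotated by hand for illustration.
er = Token(0, 2, "Er", "PPER", "er")
ruft = Token(3, 7, "ruft", "VVFIN", "rufen")
sie = Token(8, 11, "sie", "PPER", "sie")
an = Token(12, 14, "an", "PTKVZ", "an")   # separated verb particle
tokens = [er, ruft, sie, an]

# Each Dependency takes the span of its dependent; labels are illustrative.
deps = [Dependency(t.begin, t.end, ruft, t, label)
        for t, label in [(er, "SB"), (sie, "OA"), (an, "SVP")]]

# Assumes at most one dependency per dependent span.
deps_by_span = {(d.begin, d.end): d for d in deps}


def separable_verb_lemmas(tokens, deps_by_span):
    """Single pass over the tokens: particle lemma + governor lemma."""
    lemmas = []
    for token in tokens:
        dep = deps_by_span.get((token.begin, token.end))
        if token.pos == "PTKVZ" and dep is not None:
            lemmas.append(token.lemma + dep.governor.lemma)
    return lemmas


print(separable_verb_lemmas(tokens, deps_by_span))
```

Because the annotation sits on the dependent, the POS tag and the dependency link of a token are reachable from the same offsets in one loop.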

Original comment by nohtae...@gmail.com on 1 Aug 2013 at 6:05

GoogleCodeExporter commented 9 years ago
I think Gil has a point here. Any objections anybody?

Original comment by richard.eckart on 1 Aug 2013 at 6:09

GoogleCodeExporter commented 9 years ago
Tokens can be both dependents and governors, e.g. in all sentences with 
dependent clauses (subject or object clauses).
E.g.
"He says that your sister likes to swim": ccomp(says, likes), where "likes" is 
both the dependent in the ccomp relation and the governor in the relation 
nsubj(likes, sister).

Is that still covered?

Original comment by eckle.kohler on 1 Aug 2013 at 7:33

GoogleCodeExporter commented 9 years ago
Yes. There is no suggestion to change the structure of the Dependency type. It 
continues to bear two features (governor and dependent) which point to tokens. 
The only thing that should change is the offsets of the Dependency annotation 
itself. In 1.4.0, there was no standard. Currently, following the original 
suggestion in this issue, the Dependency offsets correspond to those of the 
governor. Now it is suggested that they should correspond to those of the 
dependent. 
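
To illustrate the point with the sentence from the question above: a small plain-Python sketch (not the DKPro/UIMA API; `make_dependency`, `roles_of`, and the `anchor` parameter are hypothetical names) in which only the offsets depend on the chosen convention, while the governor and dependent features stay the same:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    begin: int
    end: int
    text: str


@dataclass(frozen=True)
class Dependency:
    begin: int
    end: int
    governor: Token
    dependent: Token
    dep_type: str


def make_dependency(dep_type, governor, dependent, anchor):
    """Build a Dependency whose offsets follow the governor or the dependent."""
    if anchor not in ("governor", "dependent"):
        raise ValueError(f"unknown anchor: {anchor}")
    span = governor if anchor == "governor" else dependent
    return Dependency(span.begin, span.end, governor, dependent, dep_type)


TEXT = "He says that your sister likes to swim"


def token(word):
    """Token for the first occurrence of word in TEXT."""
    begin = TEXT.index(word)
    return Token(begin, begin + len(word), word)


says, sister, likes = token("says"), token("sister"), token("likes")


def roles_of(tok, deps):
    """Relation types in which tok is the dependent, and the governor."""
    as_dependent = [d.dep_type for d in deps if d.dependent == tok]
    as_governor = [d.dep_type for d in deps if d.governor == tok]
    return as_dependent, as_governor


for anchor in ("governor", "dependent"):
    deps = [make_dependency("ccomp", says, likes, anchor),
            make_dependency("nsubj", likes, sister, anchor)]
    print(anchor, roles_of(likes, deps), [(d.begin, d.end) for d in deps])
```

Under either convention "likes" is still the dependent of ccomp and the governor of nsubj; only the printed offsets differ.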

Original comment by richard.eckart on 1 Aug 2013 at 7:56

GoogleCodeExporter commented 9 years ago
ok, I see.
Having the Dependency offsets correspond to the dependent is probably useful in 
some, maybe many, use cases.

However, there are many verbs that introduce arguments which are dependents of 
two governors:
these are all the verbs taking to-infinitives with subject or object control, 
and the subject- or object-raising verbs.

Here are some examples: (I use dep as dependency relation type to simplify 
things)

verb taking to-infinitive with subject control:
She likes to swim: dep(likes,she), dep(swim,she)

verb taking to-infinitive with object control:
she persuades him to swim: dep(persuades,him), dep(swim,him)

see also my ISOcat definitions:
ISOcat is currently down, but I found Subcat-LMF here: 
http://lux13.mpi.nl/isocat/rest/dcs/550 and cite my examples from there:

Example: persuade is an object control verb, e.g. We persuaded him to stay.

Example: believe is an object raising verb, e.g. They believe him to be an 
informant.

Example: try is a subject control verb, e.g. He tried to exercise.

Example: seem is a subject raising verb, e.g. He seems to sleep.

--- Judith

Original comment by eckle.kohler on 2 Aug 2013 at 12:01

GoogleCodeExporter commented 9 years ago
Thanks for the input. 

Right now we also have the Dependencies stacking up on the governors (there can 
be more than one dependency per governor and per dependent). After the 
proposed change they may stack up on some dependents, but only in a few 
cases, such as the ones you pointed out. So no matter where the stuff is 
stacking up, our data structure can handle that just fine.
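
A small sketch of what "stacking up" means under each convention, using the object-control example from the previous comment. This is plain Python, not the DKPro/UIMA API; `stack_by` is a hypothetical helper, and the last two relations are added here as assumptions so that the governor has more than one dependent:

```python
from collections import defaultdict

TEXT = "She persuades him to swim"


def span(word):
    """(begin, end) offsets of the first occurrence of word in TEXT."""
    begin = TEXT.index(word)
    return (begin, begin + len(word))


she, persuades, him, swim = (span(w) for w in ("She", "persuades", "him", "swim"))

# (governor span, dependent span) pairs.
relations = [
    (persuades, him),    # dep(persuades, him), from the example above
    (swim, him),         # dep(swim, him), from the example above
    (persuades, she),    # assumption, added for illustration
    (persuades, swim),   # assumption, added for illustration
]


def stack_by(relations, anchor):
    """Group relations by the span that the annotation would sit on."""
    index = {"governor": 0, "dependent": 1}[anchor]
    stacks = defaultdict(list)
    for relation in relations:
        stacks[relation[index]].append(relation)
    return dict(stacks)


by_governor = stack_by(relations, "governor")
by_dependent = stack_by(relations, "dependent")

print({TEXT[b:e]: len(v) for (b, e), v in by_governor.items()})
print({TEXT[b:e]: len(v) for (b, e), v in by_dependent.items()})
```

Grouping by governor stacks several annotations on "persuades"; grouping by dependent leaves one annotation per token except for the controlled argument "him", which carries two.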

Out of curiosity: did you try running your examples through a dependency 
parser, e.g. the online demo of the Stanford tools? I could well imagine that 
the parsers fail to recognize this correctly. 

Original comment by richard.eckart on 2 Aug 2013 at 12:32

GoogleCodeExporter commented 9 years ago
>> Out of curiosity: did you try running your examples through a dependency 
parser, e.g. the online demo of the Stanford tools? I could well imagine that 
the parsers fail to recognize this correctly. 

That is actually funny: the online demo does not even analyse the examples 
given in the Stanford dependencies manual correctly 
(http://nlp.stanford.edu/software/dependencies_manual.pdf)

e.g.
"Tom likes to eat fish": xsubj(eat, Tom)

the demo produces only an xcomp dependency.

See also the related question on stackoverflow
http://stackoverflow.com/questions/15909216/different-xsubj-dependency-output-from-corenlp-and-stanford-dependency-parser

Original comment by eckle.kohler on 2 Aug 2013 at 12:49

GoogleCodeExporter commented 9 years ago
Since nobody has objected so far, the change will be implemented for 1.5.0. The 
span of the Dependency annotation shall match the span of its dependent.

Original comment by richard.eckart on 5 Aug 2013 at 9:42

GoogleCodeExporter commented 9 years ago
This issue was closed by revision r1730.

Original comment by richard.eckart on 10 Aug 2013 at 2:33

GoogleCodeExporter commented 9 years ago
This issue was closed by revision r1731.

Original comment by richard.eckart on 10 Aug 2013 at 2:40

GoogleCodeExporter commented 9 years ago
This issue was closed by revision r1732.

Original comment by richard.eckart on 10 Aug 2013 at 2:41