guoquan opened this issue 4 years ago
It took me a while to figure out where the divergence was. I think the problem is that @hfaghihi15 and @kordjamshidi are looking at this from different perspectives.
@hfaghihi15 was asking what the candidates for events are, that is, how many predictions we will generate.
I think the answer is: we make one prediction per trigger, so the event candidates are in one-to-one correspondence with the triggers.
This was partially why I had `event.has_a(trigger)`. Just as with `relation.has_a(span, span)`, where the relation is generated from the combination of one span and another span, and `triplet.has_a(in, lm, tr)`, where the triplet is generated from the combination of indicator, landmark, and trajectory, here the event is generated by the trigger itself, which is consistent. We could have a "pivot" relation instead of the single-argument `has_a` relation, to give it special semantics.
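To make the pattern concrete, here is a minimal sketch of these declarations, written against a DomiKnowS-style `Graph`/`Concept` API (the import path is a guess, and the concept names simply follow this thread):

```python
from regr.graph import Graph, Concept  # assumed path; may be `domiknows.graph` in newer versions

with Graph('events') as graph:
    span = Concept(name='span')
    trigger = Concept(name='trigger')

    # One event candidate per trigger: the event is generated by the trigger itself.
    event = Concept(name='event')
    event.has_a(trigger)

    # By contrast, a relation candidate is generated per *pair* of spans.
    relation = Concept(name='relation')
    relation.has_a(span, span)
```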
@kordjamshidi suggested involving `role_event_person`, `role_event_time`, and `role_event_locate` in the relation definition, because each event should potentially be determined by different sources of information, features, etc.
I think we create a misunderstanding by thinking of `involve` in the same way as `has_a`. It shouldn't change the fact that we want to make N predictions, where N is the number of triggers. The predictions are still in correspondence with the triggers. The difference is that we want to make use of information from the other sources, like `role_event_person`.
With the current framework, I think it is better to handle this in the model-description part, because the connection is always there even if we don't specify `involve`. The connection is `role.has_a(trigger, span)` (I would prefer `role.has_a(event, span)`).
From the learner side, we can use the trigger-to-event edge, provide the above information, and let the learner be free to use it, sum it, concatenate it, or whatever. Something like the following code:

```python
import torch

class EventClassification(torch.nn.Module):
    def forward(self, trigger, role_event_person, role_event_time, role_event_locate):
        ...

merry_has_a_trigger['backward'] = EdgeLearner(role_event_person, role_event_time, role_event_locate, mode='backward', model=EventClassification())
```
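For illustration, here is a runnable sketch of one way that classifier body could combine the sources (my elaboration, not the framework's actual `EdgeLearner` contract; the hidden size and the choice of concatenation are assumptions):

```python
import torch

class EventClassification(torch.nn.Module):
    """Scores each trigger as an event, using the role-based sources as extra features."""
    def __init__(self, hidden=128):
        super().__init__()
        self.score = torch.nn.Linear(4 * hidden, 1)

    def forward(self, trigger, role_event_person, role_event_time, role_event_locate):
        # Each input: (num_triggers, hidden). Concatenation is one choice;
        # summing or attention over the sources would fit the same interface.
        features = torch.cat(
            [trigger, role_event_person, role_event_time, role_event_locate], dim=-1)
        return self.score(features)  # (num_triggers, 1): one prediction per trigger
```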
Probably we don't need this `involve()`, because this should be left to the learner's flexibility and declared on the learner side.
One reason Parisa proposed `involve` is that we could write the constraints for an event in one place. The desired way is:

```python
ifL(merry, (x, xy, xz, xw), andL(trigger, x, role_event_person, xy, role_event_time, xz, role_event_locate, xw))
```

We want to get `merry -> (x, xy, xz, xw)` from the `involve` semantics. This basically says: if `(x, xy, xz, xw)` is a merry event, then `xz` must be a time.
There is one potential problem with this expression: some arguments are optional, so a merry event may not be expressed by that many variables. For example, a merry event may not have its time mentioned.
An alternative way to write this without the `involve` semantics could be:

```python
ifL(merry, (x,), andL(
    ifL(role_event_person, (x, y), PER, (y,)),
    ifL(role_event_time, (x, y), timex2, (y,)),
    ifL(role_event_locate, (x, y), LOC, GPE, FAC, (y,)),
))
```

This expression uses nested `ifL`, so the arguments to the roles are optional.
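Read as logic (my paraphrase, assuming `ifL` and `andL` correspond to implication and conjunction; the role and type names come from this thread), the nested form says:

```latex
\mathrm{merry}(x) \Rightarrow
  \big(\mathrm{role\_event\_person}(x,y) \Rightarrow \mathrm{PER}(y)\big)
  \land \big(\mathrm{role\_event\_time}(x,y) \Rightarrow \mathrm{timex2}(y)\big)
  \land \big(\mathrm{role\_event\_locate}(x,y) \Rightarrow \mathrm{LOC}(y) \lor \mathrm{GPE}(y) \lor \mathrm{FAC}(y)\big)
```

Each role appears only in the antecedent of an inner implication, so an event with no time mention leaves that conjunct vacuously true; that is what makes the arguments optional.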
So you are classifying the `x` as merry here? And `x` will be the trigger? (Referring to `ifL(merry, (x,), ...)`.)
I agree with what you described above; I just can't comment on this `backward` property since I do not fully understand it.
Actually, thinking about our discussions today, I think the idea of having some rules for constructing the desired output from the predicted parts is something we will often need when working on complex outputs. So this will be a different kind of inference, as @auszok said (if I understood him correctly): usually these will be deterministic composition rules applied after finding the most probable parts of the output as the result of ILP.
I think an edge learner will make things more complicated. Consistent with all the other parts of our framework, I assume what we want here is a simple classifier that uses the property of the concept and some other properties based on the concept's relations. (However, I don't agree with the architecture that feeds those relations as input to the decider: we already have those at inference time, and it doesn't make sense to me to aggregate over all possible span relations as input to one specific trigger classifier.) I suggest keeping the `involve` property to play the role of a shortcut for defining some constraints (same as `disjoint`), and treating the trigger classifier (or, if you want to establish an edge between event and trigger, the classifiers there) as classes such as divorce, marry, etc.
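A hypothetical sketch of that shortcut (the `involve` name comes from this thread; the call and its expansion are assumptions, not an existing API):

```python
# Assumed declarative sugar: declaring the roles once would let the framework
# expand the nested-ifL constraints shown earlier, analogously to how
# `disjoint` expands into pairwise exclusion constraints.
merry.involve(role_event_person, role_event_time, role_event_locate)
```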
That's true (regarding the deterministic composition rules above). I guess that would be a different kind of deterministic scoring, which I think is useful in some cases but maybe not in the ACE example.
If the event classifier is not modeled explicitly, then we will need this.
Correct. But are we planning to do that for this paper? We would also have to assume that the argument roles and types carry enough information to decide the event type, which is probably not the case here.
Ok, I agree.
If we want to conclude here for modeling ACE, I guess we can even postpone defining the `involve` relation for now, model the event trigger as the concept, and model all the others as pairwise edges connected to the trigger with `has_a`. Would that be problematic for now?
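A minimal sketch of that fallback modeling, under the same assumed `Graph`/`Concept` API as above (concept names are taken from this thread):

```python
with Graph('ace_events') as graph:
    span = Concept(name='span')

    # The trigger itself serves as the event concept for now (no `involve` yet).
    trigger = Concept(name='trigger')

    # Each role is a pairwise concept connecting the trigger to a span.
    role_event_person = Concept(name='role_event_person')
    role_event_person.has_a(trigger, span)
    role_event_time = Concept(name='role_event_time')
    role_event_time.has_a(trigger, span)
    role_event_locate = Concept(name='role_event_locate')
    role_event_locate.has_a(trigger, span)
```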
The modeling should be fine, I guess. @guoquan mentioned nested relations in the datanode; is that resolved?!
@auszok, the nested-relation problem was resolved a long time ago, right?
We had a discussion about modeling events in the framework. We can continue the discussion on this topic in this thread.
The code snippets above are from that discussion.