Open sebastianruder opened 10 years ago
It is almost surely best to avoid three-point functions like this. I think you want:
filled_by (jar, $x)
where $x could be "Peter"
Also, passive-rule-2 seems backwards to me; I think it should be
filled(jar, $x) where $x could be marbles
should have @ruiting and @rodas comment
Hi, Passive-rule-1 is for cases like "The jar is filled with yellow marbles by Peter." The r2l output is:
EvaluationLink PredicateNode "filled@123" ListLink ConceptNode "Peter@345" ConceptNode "jar@789"
Passive-rule-2 is applied when we don't know by whom the action is done, e.g. "The jar is filled with yellow marbles.":
EvaluationLink PredicateNode "filled@123" ListLink VariableNode "$x" ConceptNode "jar@789"
For with(fill, marbles), it should be:
EvaluationLink PredicateNode "with@11" ListLink PredicateNode "filled@123" ConceptNode "marbles@987"
@ruiting comment
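The two passive outputs above differ only in the first argument of the ListLink. A minimal Python sketch of that difference follows, assuming nothing about the real r2l helpers: `evaluation_link` and `passive_rule` are hypothetical names, and the code only renders Atomese-like text rather than touching an AtomSpace.

```python
# Hypothetical helpers for illustration; not the actual r2l Scheme functions.

def evaluation_link(predicate, *args):
    """Render an EvaluationLink as an indented s-expression string.

    Each argument is a (node_type, name) pair placed inside the ListLink.
    """
    lines = [
        "(EvaluationLink",
        f'    (PredicateNode "{predicate}")',
        "    (ListLink",
    ]
    for node_type, name in args:
        lines.append(f'        ({node_type} "{name}")')
    lines[-1] += "))"  # close the ListLink and the EvaluationLink
    return "\n".join(lines)


def passive_rule(verb, patient, agent=None):
    """Known agent: ConceptNode (rule 1). Unknown agent: VariableNode (rule 2)."""
    first = ("ConceptNode", agent) if agent else ("VariableNode", "$x")
    return evaluation_link(verb, first, ("ConceptNode", patient))


print(passive_rule("filled@123", "jar@789", agent="Peter@345"))
print(passive_rule("filled@123", "jar@789"))
```

The instance suffixes (`@123` and so on) are copied from the examples in this thread and carry no meaning in the sketch.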
The restriction tense($A, past_passive) is actually stopping passive-rule1 from being applied for "The jar is filled with yellow marbles by Peter." or "The jar is being filled with yellow marbles by Peter."
Yes, I just fixed it: https://github.com/rodsol/relex/commit/cc209903e7a525e146fbf3e3f34a5f94324b3bc0
@williampma
It seems PASSIVE1-1 and PASSIVE1 are redundant.
Is there any way the two rules can be merged?
Yeah, I agree it's a bit redundant.
How about
_obj($A,$B) & by($A,$C) & tense($A,$type) => (passive-rule1 $type ...)
and check $type in the Scheme function to see whether it equals "**_passive", and do nothing otherwise.
I can actually think of a way to allow regular expressions like
_obj($A,$B) & by($A,$C) & tense($A, .*\Qpassive\E) => ...
though I am not sure how often this functionality is needed.
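To make the suffix check concrete, here is a small Python sketch of the idea of matching any tense that ends in "passive". It is only an illustration: the `\Q...\E` quoting in the pattern above is Java regex syntax, and `re.escape` is used here as the closest Python counterpart. The function names and the `present_passive` tense value are assumptions; only `past_passive` appears in this thread.

```python
import re

# ".*" followed by the literal word "passive", mirroring .*\Qpassive\E
PASSIVE_TENSE = re.compile(".*" + re.escape("passive"))


def is_passive(tense):
    """True when the whole tense string ends in the literal 'passive'."""
    return PASSIVE_TENSE.fullmatch(tense) is not None


def passive_rule1(tense, verb, patient, agent):
    """Hypothetical merged rule: fire for any passive tense, else do nothing."""
    if not is_passive(tense):
        return None
    return (verb, agent, patient)


print(passive_rule1("past_passive", "filled", "jar", "Peter"))
print(passive_rule1("present_passive", "filled", "jar", "Peter"))  # assumed tense name
print(passive_rule1("past", "filled", "jar", "Peter"))
```

With a check like this in one place, a single rule could cover what PASSIVE1 and PASSIVE1-1 currently handle separately.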
If we don't want to use three-point-functions, then we would need a rule to deal with with(fill, marbles)
and -- as @rodsol proposed -- produce
EvaluationLink
    PredicateNode "with@11"
    ListLink
        PredicateNode "filled@123"
        ConceptNode "marbles@987"
To avoid redundancy, we should have a rule that could deal with all such relations, e.g. through(x, y), about(x, y), etc. and produce a PredicateNode for them. What do you say?
So I assume that "filled by Peter" becomes
EvaluationLink
    PredicateNode "by@211"
    ListLink
        PredicateNode "filled@123"
        ConceptNode "Peter@1987"
@linas, with that logic, yes. Although I think we would want to limit the scope to a certain set of terms since in the case of "by", Peter should rather occupy the first argument position of the EvaluationLink. What do you think?
1) I'm sorry that I ever said "filled_by" in the early remark, I wasn't thinking.
2) by(filled, Peter) and with(filled, marbles) is the correct order. In both cases, "filled" is the "prepositional subject", and "Peter/marbles" is the "prepositional object". As a general rule, the parent comes first, the dependent comes after.
Some systems would write this as _psubj(by, filled) _pobj(by, Peter) _psubj(with, filled) _pobj(with, marbles) but relex doesn't do this by default (it will if you turn on stanford compatibility mode).
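The two notations carry the same information, which a short Python sketch can show. This is an illustration only: `to_psubj_pobj` and `from_psubj_pobj` are hypothetical names, and relations are modelled as plain tuples rather than anything relex actually emits.

```python
def to_psubj_pobj(relation):
    """Split prep(head, dependent) into a _psubj and a _pobj relation."""
    prep, head, dependent = relation
    return [("_psubj", prep, head), ("_pobj", prep, dependent)]


def from_psubj_pobj(pairs):
    """Rebuild prep(head, dependent) from one _psubj/_pobj pair."""
    roles = {name: (prep, word) for name, prep, word in pairs}
    prep, head = roles["_psubj"]
    _, dependent = roles["_pobj"]
    return (prep, head, dependent)


for rel in [("by", "filled", "Peter"), ("with", "filled", "marbles")]:
    split = to_psubj_pobj(rel)
    print(rel, "->", split)
    assert from_psubj_pobj(split) == rel  # the round trip loses nothing
```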
Which of these styles we should use in r2l .. I don't know. Which might be easier to reason with .. I don't know. Which of these is easier for PLN?
Some years ago, I actually used the form "filled_by(jar, peter)", which seemed like a good idea at the time, but fell apart when I needed to say "was slowly filled by".
I think it would be useful for PLN to supplement what we already have. In the current implementation, there is no strict differentiation between semantic roles, objects, etc. For me, if we want to avoid three-point-functions, it feels most natural to add EvaluationLinks with new PredicateNodes such as "through", "with", etc. to provide additional information. I don't see how this information could otherwise be fit in without altering the current semantics of nodes and links.

I would only take this approach for a small set of needed expressions at first, though, as IMO phrasal verbs in particular still need to be discussed in more depth. I don't think that the particle should be split off as a separate PredicateNode for every phrasal verb; rather, verb and particle should form an entity together, e.g. "look_up", "look_for", etc. Or maybe I'm confusing two things here: phrasal verbs, which only work with the particle, and adjuncts, e.g. "with marbles", "by Peter". These should be handled differently.

On a related note, would you say that there should be a differentiation between adjuncts and complements, since there can only be one complement but a potentially infinite number of adjuncts?

@cosmoharrigan, what is your take on this whole issue? How should we capture adjuncts such as "by Peter", "with marbles", etc. so that PLN can deal with them?
Hi Rodas,
I'm saying two distinct things:
1) Since you are one of the people working on r2l I wanted you to review Sebastian's work and make sure it does not conflict with your plans.
2) I was saying that, if one designs some of these structures poorly, then attaching an adjunct becomes impossible. So, as you look for a data structure for "look_for", you should think about how to represent "superficially look for". This was a mistake I once made. Which is why reviews are needed.
@linas, one remark: Here you mentioned that it is best to avoid three-point-functions. How about ditransitive verbs, e.g. sell(x,y,z), though? Surely they would require three-point-functions, right?
Hi @sebastianruder .. yes, and no and it depends on context. At the 'surface dependency parse' level, we can avoid 3-point functions, since sell(x,y,z) is equivalent to _subj(sell, x) _obj(sell, y) _iobj(sell, z). The latter representation is nice because it easily allows for verb-modifiers: e.g. softly-sell adds one more relation _advmod(sell, softly). With the 3-point function, the same modification gets messy and confusing: should it be sell(x,y,z,w) with w==softly? Should it be _advmod(sell(x,y,z), w)? should it be a different function entirely, called softly_sell(x,y,z)? Whichever scheme you pick, it seems that one can find some sentence that either breaks the scheme or forces even more complexity. So: "he casually but quickly sold the iPhone to John", how do I conjoin the two modifiers? "He sold the iPhone to John in a casual but quick manner" what does the preposition "in" connect to? would it be in(sell(x,y,z),w)? something else? All these have standard answers in the 2-point case. That's why I was recommending against it.
However... it's widely known that ditransitive verbs have an arity of 3 (named subj, obj and iobj) and, at deeper, semantic layers of analysis, you want to have all three parts tied together (or at least, it's very convenient). It can simplify the use of "lexical functions" in meaning-text-theory, for example.
In r2l, we are translating from the 'surface dependency parse' level to some deeper quasi-semantic level, on which we hope to apply PLN. So the question then becomes: "is it easier to use one single glob sell(x,y,z), or is it easier to use three relations _subj(sell, x) _obj(sell, y) _iobj(sell, z)?" I don't know the answer to that. In some mathematical, formal sense, these two different structures are equivalent; so ease-of-use is your only decision criterion.
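The equivalence between the two structures, and the reason modifiers are easier in the 2-point form, can be sketched in a few lines of Python. This is an assumption-laden illustration: relations are plain tuples, `to_binary` and `to_nary` are hypothetical helpers, and the word choices come from the "sold the iPhone to John" example above.

```python
ROLES = ("_subj", "_obj", "_iobj")


def to_binary(verb, *args):
    """Expand sell(x, y, z) into _subj/_obj/_iobj relations on the verb."""
    return [(role, verb, arg) for role, arg in zip(ROLES, args)]


def to_nary(relations, verb):
    """Collapse the core roles of a verb back into a single tuple."""
    by_role = {role: arg for role, head, arg in relations if head == verb}
    return (verb,) + tuple(by_role[r] for r in ROLES if r in by_role)


relations = to_binary("sell", "he", "iPhone", "John")

# Each modifier is just one more 2-point relation; the core stays unchanged.
relations.append(("_advmod", "sell", "casually"))
relations.append(("_advmod", "sell", "quickly"))

print(relations)
print(to_nary(relations, "sell"))
```

In the binary form the two adverbs attach without any change to the core roles, whereas the n-ary tuple has no obvious slot for them.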
I do know that, in the past, when doing something similar, I got burned by this. I was trying to convert sentences into those fabled "web 2.0" or "semantic web" triples that the triple-store and SPARQL people love so much. This is easy and works great when your sentences are easy and simple, and completely falls apart once you get to real-life examples. So I guess I'm saying "OK, but be careful".
@rodsol, @ruiting, how would you represent adjuncts such as "by Peter", "with marbles" now? With their own PredicateNode as Rodas initially proposed?
(EvaluationLink
    (PredicateNode "with@11")
    (ListLink
        (PredicateNode "filled@123")
        (ConceptNode "marbles@987")))
@williampma, has this issue been resolved? How are adjuncts captured at the moment?
I don't think this is resolved yet. Currently there isn't any R2L rule for adjuncts. There is an on-rule Scheme helper function, but the corresponding R2L rule is missing, so it is never invoked. I am not sure what the current decision is.
The dangling on-rule does follow a structure similar to the one Rodas proposed, with its own PredicateNode "on" but without an instance number (so it is not on@1111).
Alright. What do you think about implementing a rule which creates a PredicateNode for certain relations, i.e. with(x, y), through(x, y) and about(x, y) for a start? For with(fill, marbles), it would for instance produce:
EvaluationLink
    PredicateNode "with@11"
    ListLink
        PredicateNode "filled@123"
        ConceptNode "marbles@987"
as I stated in this comment and as Rodas originally proposed as well.
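A rough Python sketch of what one shared rule for these relations might look like is given below. It is not an R2L rule or a Scheme helper: `prep_rule` and `PREPOSITIONS` are hypothetical names, the output is only Atomese-like text, and making the instance suffix optional is my own assumption.

```python
# Relations the single rule would accept "for a start".
PREPOSITIONS = {"with", "through", "about"}


def prep_rule(relation, head, dependent, instance=None):
    """Render prep(head, dependent) as an EvaluationLink, or None if unsupported."""
    if relation not in PREPOSITIONS:
        return None
    # Append "@<instance>" only when an instance number is supplied.
    name = f"{relation}@{instance}" if instance is not None else relation
    return "\n".join([
        "(EvaluationLink",
        f'    (PredicateNode "{name}")',
        "    (ListLink",
        f'        (PredicateNode "{head}")',
        f'        (ConceptNode "{dependent}")))',
    ])


print(prep_rule("with", "filled@123", "marbles@987", instance=11))
print(prep_rule("by", "filled@123", "Peter@1987"))  # None: "by" is not in the set
```

Keeping the accepted relation names in one set means that adding another preposition later is a one-word change rather than a new rule.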
@williampma, @AmeBel, do scoped variables work at the moment and if so, could you briefly explain to me how to use them so we can have one rule for all three relations? I've seen the varscope.txt file, but I haven't come across an example where they are actually used.
about-rule exists (I forgot about it), but again it is without an instance number (so it is PredicateNode "about" and not PredicateNode "about@1111"). If this implementation is correct, then I think the solution for with(x,y) should also be without an instance number (just PredicateNode "with").
"Scoped variable" is currently hard-coded for the MAYBE rule, so I guess in a sense it does not really work. On the other hand, my recent implementation of regular expressions in #152 could potentially be used for "scoped variables", so I guess the old implementation for MAYBE will become obsolete (in fact, I might remove it eventually). However, I don't think scoped variables apply to your case. They are for something like with(x, scope), and allow scope to be matched against an array of different words, i.e. the scope is for a word in the relation. It does not work for the name of the relation, like scope(x, y), which is what you want, I think?
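The distinction between the two kinds of matching can be illustrated with a short Python sketch. It does not reflect how scoped variables or the regular-expression support in #152 are actually implemented; the function names, the word list and the relation syntax are all assumptions made for the example.

```python
import re

# Case 1: the relation name is fixed and an argument is checked against a word list.
def match_scoped_argument(relation, scope_words):
    """Return the relation if its second argument is one of the scoped words."""
    name, head, word = relation
    return relation if name == "with" and word in scope_words else None


# Case 2: the relation name itself varies, which is what one shared rule would need.
RELATION = re.compile(r"(with|through|about)\((\w+),\s*(\w+)\)")


def match_relation_name(text):
    """Return (name, head, dependent) when the relation name is in the allowed set."""
    m = RELATION.fullmatch(text)
    return m.groups() if m else None


print(match_scoped_argument(("with", "fill", "marbles"), {"marbles", "sand"}))
print(match_relation_name("with(fill, marbles)"))
print(match_relation_name("by(fill, Peter)"))  # None: "by" is not in the pattern
```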
Hi all, For a sentence such as "The jar is filled with yellow marbles", the passive-rule2 produces the following EvaluationLink:
The VariableNode would bind an agent, if it was present, e.g. "by Peter". There would need to be another representation to involve the marbles, such as making the verb phrasal:
What are your thoughts on this?