opencog / relex

English Dependency Relationship Extractor
http://wiki.opencog.org/w/RelEx
Apache License 2.0

R2L rules for dealing with temporal RelEx relations (e.g. _time(), after(), before()) #124

Open sebastianruder opened 10 years ago

sebastianruder commented 10 years ago

As pointed out here, I'd like to come up with a temporal reasoning demo which covers the whole pipeline, from parsing natural language with RelEx and RelEx2Logic to performing inference with PLN on its output. As there are no RelEx2Logic rules which deal with temporal relations yet, I propose to write R2L rules which convert the following RelEx relations into their respective AtomSpace representations:

1) "She had lunch at 6 pm."

_time(pm, 6), _at(have, pm)

-->

(AtTimeLink
    (TimeNode "6 pm")
    (EvaluationLink
        (PredicateNode "have")
        (ListLink
            ???)))

2) "She had lunch before she went to work."

before(have, go)

-->

(BeforeLink
    (EvaluationLink
        (PredicateNode "have")
        (ListLink
            ???))
    (EvaluationLink
        (PredicateNode "go")
        (ListLink
            ???)))

3) "She had lunch after she got home."

after(have, get)

-->

(AfterLink
    (EvaluationLink
        (PredicateNode "have")
        (ListLink
            ???))
    (EvaluationLink
        (PredicateNode "get")
        (ListLink
            ???)))

Concerning 1): As UnixTime is used as default, the TimeNode in the first example should only consist of a number, set according to some convention, e.g. minutes since midnight: 6 pm = 12*60 + 6*60 = 1080. If the time is stated by using a form of "to be", e.g. "The concert is at 6 pm", the relation is _pobj(at, pm) instead of _at(have, pm).

Concerning 1)+2)+3): The relation link should contain the links/nodes that take part in the relation. These are produced by other rules, e.g. SV, SVO, SVP, and would need to be accessible to the temporal rule so that they can be used inside the ListLink.

Are these desirable representations, and do you consider these rules useful? I'd be glad to get your feedback.
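The minutes-since-midnight arithmetic above could be sketched roughly as follows. This is only an illustration in Python: the function name `clock_to_minutes` and the accepted input format are assumptions and are not part of RelEx or RelEx2Logic.

```python
import re


def clock_to_minutes(text):
    """Convert a 12-hour clock string such as "6 pm" to minutes since midnight.

    Hypothetical helper for illustration only.
    """
    match = re.fullmatch(r"\s*(\d{1,2})(?::(\d{2}))?\s*([ap])\.?m\.?\s*", text.lower())
    if match is None:
        raise ValueError(f"unrecognised time expression: {text!r}")
    hour = int(match.group(1))
    minute = int(match.group(2) or 0)
    if not (1 <= hour <= 12 and 0 <= minute < 60):
        raise ValueError(f"time out of range: {text!r}")
    hour %= 12                      # 12 am and 12 pm both start from hour 0
    if match.group(3) == "p":
        hour += 12                  # afternoon offset
    return hour * 60 + minute


print(clock_to_minutes("6 pm"))     # 12*60 + 6*60
```

A value produced this way would still need a date component before it could be compared with a real Unix timestamp.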

ruiting commented 10 years ago

These are definitely useful. Maybe it's better to use a PredicateNode instead of inventing those new link types, as @AmeBel pointed out: http://wiki.opencog.org/w/EvaluationLink#As_syntactic_sugar

(EvaluationLink
    (PredicateNode "after")
    (EvaluationLink
        (PredicateNode "have")
        (ListLink
            ???))
    (EvaluationLink
        (PredicateNode "go")
        (ListLink
            ???)))

sebastianruder commented 10 years ago

Thanks for your comment, @ruiting. What you point out sounds reasonable. Otherwise we would need 7 distinct links for temporal reasoning alone. @AmeBel, @bgoertzel, would you mind commenting as well?
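To make the trade-off concrete, here is a small Python sketch of rewriting a dedicated temporal link into the generic EvaluationLink form. Atoms are modelled as plain tuples, the mapping table lists only the two links from this issue, and the function name `to_evaluation` is made up for the example. The arguments are wrapped in a ListLink, which is an assumption based on the usual EvaluationLink layout.

```python
# Atoms are modelled as tuples: (type, name) for nodes, (type, *outgoing) for links.
TEMPORAL_LINKS = {"BeforeLink": "before", "AfterLink": "after"}  # illustrative subset


def to_evaluation(link):
    """Rewrite e.g. (AfterLink A B) as (EvaluationLink (PredicateNode "after") (ListLink A B))."""
    link_type, *args = link
    predicate = TEMPORAL_LINKS.get(link_type)
    if predicate is None:
        return link                 # not a temporal link: leave it untouched
    return ("EvaluationLink", ("PredicateNode", predicate), ("ListLink", *args))


example = ("AfterLink", ("PredicateNode", "have"), ("PredicateNode", "get"))
print(to_evaluation(example))
```

With this approach a new temporal relation only needs a new predicate name, not a new link type.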

bgoertzel commented 10 years ago

Sure, using predicate node is ok ...


Ben Goertzel, PhD http://goertzel.org

"In an insane world, the sane man must appear to be insane". -- Capt. James T. Kirk

"Emancipate yourself from mental slavery / None but ourselves can free our minds" -- Robert Nesta Marley

sebastianruder commented 10 years ago

@ruiting, @AmeBel, @williampma, should these rules create a "timemarker" which can then be processed in the post-processing component? I'm not quite up to date on the current status of the post-processing. Should I implement this, or has it already been discussed and assigned to someone, in which case I should focus on something else?

bgoertzel commented 10 years ago

I think this is an important area to deal with, because information coming out of the game world is going to have temporal information attached, so it will be very useful to be able to handle temporal information coming from language as well ...

So far as I know nobody else is working on it...

ben


williampma commented 10 years ago

This certainly seems doable in post-processing with a "timemarker", by searching the related links containing the node "have" and "go" in the above example. It might be better to follow the EvaluationLink atom ListLink atom atom structure.

A more complex sentence might end up with

(EvaluationLink 
      (PredicateNode "after")
      (ListLink
          (AndLink
                  ** hypergraphs involving the word "have", ie. the first word from timemarker **
          )
          (AndLink
                  ** hypergraphs involving the word "go", ie. the second word from timemarker**
          )
      )
)
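A rough Python sketch of what such a post-processing step might look like, assuming atoms are nested tuples and that the "timemarker" supplies the relation name and the two words. The helper names `mentions` and `expand_timemarker` are hypothetical and do not exist in the R2L code.

```python
def mentions(atom, name):
    """True if a node called `name` occurs anywhere inside the nested-tuple atom."""
    kind, *rest = atom
    if kind.endswith("Node"):
        return rest[0] == name
    return any(mentions(child, name) for child in rest)


def expand_timemarker(relation, first, second, links):
    """Group the links involving each word into an AndLink under the temporal predicate."""
    def group(word):
        return ("AndLink", *[link for link in links if mentions(link, word)])

    return ("EvaluationLink",
            ("PredicateNode", relation),
            ("ListLink", group(first), group(second)))
```

For example, `expand_timemarker("after", "have", "go", links)` would collect every link in `links` that contains a node named "have" into the first AndLink and every link containing "go" into the second.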

Though I find it to be very strange for RelEx to produce relations on the word "have" like _at(have, pm)... I would have thought it would be _at(lunch, pm)...

If you create the R2L rule and helper function for creating the "timemarker", I could go and create the post-processing function for "timemarker" for you. I want to see how the other markers' post-processing functions will interact with this one, since the order in which the different markers are post-processed will change the final result for more complex sentences.

williampma commented 10 years ago

The PredicateNode solution works for "after" and "before", but what about the first example for "6 pm"? What is the proposed representation for that?

And one more thing about a general "timemarker": since it needs to work for both "after" and "before", I guess the words "after" and "before" need to be included as part of the "timemarker".

EvaluationLink
  timemarker
  ListLink
     after <--- like this??
     have
     go

williampma commented 10 years ago

Actually (sorry, kind of brainstorming), why do we need post-processing?

With

EvaluationLink
   PredicateNode "after"
   ListLink
      have@1111
      go@2222

PLN should be able to infer from the instances "have@1111" and "go@2222", and get to the links created by SVO, SVIO, SV, etc., right?

sebastianruder commented 10 years ago

@williampma, true. If we just refer to the unique PredicateNodes, PLN will be able to reason with them. Somehow I had envisioned embedding the whole EvaluationLink but I think this works equally well and doesn't require post-processing.

sebastianruder commented 10 years ago

@williampma, I created the three rules described above and their corresponding helper functions here and here. Would you mind reviewing them? In the before() and after() RelEx relations, the second argument can be a verb, a pronoun or a noun. For example, in "I ate after her" it is a pronoun, in "I went home after she left" it is a verb, etc. What is the part-of-speech relation in RelEx called? I copied pos() from the that-rules, but that doesn't seem to work for me. Also, when parsing "I had lunch at 6 pm" now, I no longer get the relation _time(pm, 6). What has changed? Maybe @linas could also comment briefly?

williampma commented 10 years ago

About the part-of-speech thing: looks like you found a bug I thought I completely fixed in #94 but it seems I have missed a case. I will work on fixing it.

williampma commented 10 years ago

BTW, with regards to whether post-processing is needed, and whether

EvaluationLink
   PredicateNode "after"
   ListLink
      have@1111
      go@2222

is enough for PLN, it has been an ongoing confusion for me. I recall I had a similar confusion when working with the that-rule, and here's @bgoertzel's reply (which I still do not fully understand, unfortunately; maybe it will provide some insight for you):


Hmmm, so you think

Evaluation that tell delicious@1234
Inheritance pumpkin delicious@1234

Is the same as

Evaluation that tell@1234 (Inheritance Pumpkin delicious)

That's true if nobody adds links to delicious@1234 later , I guess ...

But, suppose we use the first representation given above.

If the Atomspace also has

Inheritance pumpkin cute

in it, then PLN induction will yield

Inheritance delicious@1234 cute <.01>

or something like that.... Then we will have

Inheritance delicious@1234 cute <.01>

in the Atomspace.... But then, when you look at

Evaluation that tell delicious@1234

-- how will you know that the relevant fact is

"I tell you that pumpkin is delicious"

rather than

"I tell you that pumpkin is delicious and maybe cute"

??

On the other hand, we can say

Evaluation 
     that 
     tell@1234 
     Inheritance Pumpkin delicious <1,.99>

with confidence that nobody is going to change

Inheritance Pumpkin delicious

into something else later on.

bgoertzel commented 10 years ago

Hi,

BTW, with regards to whether post-processing is needed, and whether

EvaluationLink PredicateNode "after" ListLink have@1111 go@2222

is enough for PLN,

I can answer this if someone reminds me what sentence is being represented; I have poor Internet connection today and don't want to sort through an archive of messages ;p

it has been an ongoing confusion for me. I recall I have a similar confusion when working with that-rule, and here's @bgoertzel's reply (which I still do not fully understand unfortunately;

Oops, sorry about that..

The practical crux of that email was that this representation

Evaluation that tell@1234 Inheritance Pumpkin delicious <1,.99>

seems unproblematic.

However, after that discussion we decided to use representations like this instead:

Evaluation
    that
    tell@1234
    ContextAnchorNode "123"

EmbeddedTruthValueLink
    AnchorNode "123"
    Inheritance Pumpkin delicious <1,.99>

However

Evaluation that tell delicious@1234

Inheritance pumpkin delicious@1234

seems different because delicious@1234 could get many other links attached to it later on, after it's in the Atomspace. So the meaning of delicious@1234 is not immutable; I'm not sure how important this is...

ben

linas commented 10 years ago

whether

EvaluationLink PredicateNode "after" ListLink have@1111 go@2222

is enough for PLN,

The original sentence is "she had lunch after she got home".

The alternatives to ponder are:

"she had lunch shortly after she got home"
"she had lunch long after she got home"
"she had lunch after she got home, but not before getting dressed for the party"

and of course, "But Ben told me that she had lunch before she got home."

bgoertzel commented 10 years ago

So the alternative is

EvaluationLink
    PredicateNode "after"
    EvaluationLink have@123 she@123 lunch@123
    EvaluationLink go@123 she@123 home@123

This alternative constitutes a link that is guaranteed to always represent what the sentence said (without further elaboration or deduction, etc.)...

On the other hand,

EvaluationLink PredicateNode "after" have@123 go@123

EvaluationLink have@123 she@123 lunch@123
EvaluationLink go@123 she@123 home@123

is not guaranteed to always represent what the sentence said, because later the system might figure out that

Inheritance go@123 slowly

Then, looking at

EvaluationLink PredicateNode "after" have@123 go@123

EvaluationLink have@123 she@123 lunch@123
EvaluationLink go@123 she@123 home@123
Inheritance go@123 slowly

there seems no way for the system to figure out which links were part of the sentence and which were added later by some other process.

So it seems that if we are going to use

EvaluationLink PredicateNode "after" have@123 go@123

EvaluationLink have@123 she@123 lunch@123
EvaluationLink go@123 she@123 home@123

then we might want something like an InterpretationNode, with semantics like

ReferenceLink
    SentenceNode "she had lunch after she got home"
    SetLink
        EvaluationLink PredicateNode "after" have@123 go@123
        EvaluationLink have@123 she@123 lunch@123
        EvaluationLink go@123 she@123 home@123

specifying which links constitute the direct interpretation of the sentence. If we have something like this, then either representation is fine....
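The point can be illustrated with a small Python sketch. Links are flat tuples mirroring the shorthand used in this comment, and a plain dictionary stands in for the ReferenceLink/SetLink pair. The helper names are invented for the example.

```python
# Links are flat tuples (type, *arguments); node names are plain strings.
after = ("EvaluationLink", "after", "have@123", "go@123")
have = ("EvaluationLink", "have@123", "she@123", "lunch@123")
go = ("EvaluationLink", "go@123", "she@123", "home@123")

sentence = "she had lunch after she got home"
# Stand-in for: ReferenceLink (SentenceNode sentence) (SetLink after have go)
interpretation = {sentence: frozenset({after, have, go})}

# A link added later by some other process, e.g. inference.
later = ("InheritanceLink", "go@123", "slowly")
atomspace = [after, have, go, later]


def links_mentioning(name, links):
    """All links that have `name` among their arguments."""
    return [link for link in links if name in link[1:]]


def asserted_by(sentence, links):
    """Only those links recorded as the direct interpretation of the sentence."""
    return [link for link in links if link in interpretation[sentence]]


mixed = links_mentioning("go@123", atomspace)   # also contains the later link
original = asserted_by(sentence, mixed)         # the later link is filtered out
```

Without the recorded set, `mixed` gives no way to tell the sentence's own links from the one added afterwards.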

-- Ben


linas commented 10 years ago

Why is it important to preserve the original sentence? The original sentence structures showing words and word order are still there, if they weren't deleted.

Now, philosophically speaking, yes, it's usually best to apply transformations that don't lose any information, so if we can get a representation that is faithful to the original sentence, then we probably should.... and maybe this is early enough in the processing that we should not gratuitously discard information if we don't yet have to.... and so we should be explicit that this principle applies.

But, at some point, after analyzing dozens of sentences that are telling a story, it's more important to get the story straight than to memorize word order. If, on retelling, the word order comes out different, that's OK, as long as the gist of the story is preserved.

bgoertzel commented 10 years ago

Yes, in general, the system won't need to preserve knowledge regarding what it saw in what sentence... as you say, it's retaining the sentence content that matters

However (as you allude to in your first paragraph), we don't want to commit to an early-stage representation that immediately throws out knowledge of what information was contained in each sentence, because on occasion this information will be useful....

ben


sebastianruder commented 10 years ago

If we choose the representation

(EvaluationLink
    (PredicateNode "after")
    (ListLink
        (PredicateNode "have@1111")
        (PredicateNode "go@2222")))

shouldn't it then later be possible to retrieve all the relevant and additional information about the sentence by looking up the incoming sets of the PredicateNodes, i.e. (EvaluationLink have@123 she@123 lunch@123) and (EvaluationLink go@123 she@123 home@123)? I imagine we would be able to reconstruct the sentence this way without the need to provide the full interpretation here. What do you think?
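A minimal Python sketch of this lookup, with atoms as nested tuples and a linear scan standing in for a real incoming-set query. The function names and the instance ids on the ConceptNodes are invented for illustration.

```python
have = ("PredicateNode", "have@1111")
go = ("PredicateNode", "go@2222")
after_link = ("EvaluationLink", ("PredicateNode", "after"), ("ListLink", have, go))

links = [
    after_link,
    ("EvaluationLink", have,
     ("ListLink", ("ConceptNode", "she@3333"), ("ConceptNode", "lunch@4444"))),
    ("EvaluationLink", go,
     ("ListLink", ("ConceptNode", "she@5555"), ("ConceptNode", "home@6666"))),
]


def evaluations_of(predicate, links):
    """EvaluationLinks that have `predicate` in predicate position."""
    return [link for link in links
            if link[0] == "EvaluationLink" and link[1] == predicate]


def expand(temporal_link, links):
    """Follow the two PredicateNodes of a temporal link to the events they name."""
    _, _, (_, first, second) = temporal_link
    return evaluations_of(first, links), evaluations_of(second, links)


first_events, second_events = expand(after_link, links)
```

Note that this returns every EvaluationLink using the predicate, including any added after the sentence was parsed.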

williampma commented 10 years ago

The main point I don't fully understand is this

seems different because delicious@1234 could get many other links attached to it later on, after it's in the Atomspace. So the meaning of delicious@1234 is not immutable

Let's modify the above to talk about "after" instead. I guess I understand other links could attach to nodes like "go@2222" afterward for

EvaluationLink
    PredicateNode "after"
    ListLink
        PredicateNode "have@1111"
        PredicateNode "go@2222"

like from anaphora resolution perhaps. But how is that different from

EvaluationLink
    PredicateNode "after"
    EvaluationLink have@123 she@123  lunch@123
    Evaluation go@123  she@123  home@123

Wouldn't other links still be able to attach to the node "go@2222" with this representation? What is the difference?

williampma commented 10 years ago

And I think we have already been breaking apart sentences, even for a simple sentence like "The bird flies high":

EvaluationLink flies@1111 bird@2222
InheritanceLink 
    SatisfyingSetLink flies@1111
    high@3333

bgoertzel commented 10 years ago

On Mon, Aug 18, 2014 at 5:29 AM, William Ma notifications@github.com wrote:

The main point I don't fully understand is this

seems different because delicious@1234 could get many other links attached to it later on, after it's in the Atomspace. So the meaning of delicious@1234 is not immutable

Let's modify the above to talk about "after" instead. I guess I understand other links could attach to nodes like "go@2222" afterward for

EvaluationLink PredicateNode "after" ListLink PredicateNode "have@1111" PredicateNode "go@2222"

like from anaphora resolution perhaps.

So, the above relation says, roughly:

(A) "The instance of having called have@1111 occurred after the instance of going called go@2222"

But how is that different from

EvaluationLink
    PredicateNode "after"
    EvaluationLink have@123 she@123 lunch@123
    EvaluationLink go@123 she@123 home@123

This one says, roughly,

(B) "The event in which 'the instance of having called have@123 occurs between the instance of she called she@123 and the instance of lunch called lunch@123' occurred after the event in which 'the instance of going called go@123 occurs between the instance of she called she@123 and the instance of home called home@123'"

Of course, A can be made equivalent to B by adding on additional information to the effect that

A1) 'the instance of having called have@123 occurs between the instance of she called she@123 and the instance of lunch called lunch@123'

and

A2) 'the instance of going called go@123 occurs between the instance of she called she@123 and the instance of home called home@123'

...

Then, modulo some rearrangement,

A + A1 + A2 = B

My point was just that the meaning of the sentence is either B or A+ A1+A2; so just linking the sentence to A with a ReferenceLink or whatever isn't enough to specify the meaning of the sentence. If one just links the sentence to A, then one has no way to tell whether the meaning of the sentence is just A, or A+A1+A2, or A+A3+A4, or whatever...

So, breaking apart sentences is OK, but if we want to map a sentence S to its meaning, we need to say either

ReferenceLink S B

or

ReferenceLink S SetLink {A, A1, A2}

or something like that...
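The equivalence A + A1 + A2 = B can be sketched in Python as a pair of conversions between the nested and the broken-apart form. Links are flat tuples in the shorthand used above, and the function names `split` and `join` are made up for the example.

```python
def split(b):
    """Nested form B -> the set {A, A1, A2, ...} of broken-apart links."""
    head, relation, *events = b
    a = (head, relation, *[event[1] for event in events])   # refer to events by predicate instance
    return {a, *events}


def join(a, parts):
    """Broken-apart form (A plus the event links) -> nested form B."""
    head, relation, *names = a
    by_name = {event[1]: event for event in parts}
    return (head, relation, *[by_name[name] for name in names])


A1 = ("EvaluationLink", "have@123", "she@123", "lunch@123")
A2 = ("EvaluationLink", "go@123", "she@123", "home@123")
A = ("EvaluationLink", "after", "have@123", "go@123")
B = ("EvaluationLink", "after", A1, A2)
```

Here `split(B)` yields the set {A, A1, A2}, which is what the SetLink under the ReferenceLink would hold, and `join(A, [A1, A2])` rebuilds B.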

ben


williampma commented 10 years ago

ReferenceLink S SetLink {A, A1, A2}

Interesting... @AmeBel's recent code for SuReal will already do this, so with that functionality are we saying it's OK to keep the sentence broken apart? No post-processing needed for rules that simply try to bring things together (e.g. that-rule, before-rule, after-rule, and-rule, but-rule, or-rule, etc.)?

bgoertzel commented 10 years ago


Possibly .... can you remind me of the link to the wiki page that shows how these rules work? I'd like to review the rules before making a definite pronouncement...

thx ben

williampma commented 10 years ago

Here you go http://wiki.opencog.org/w/RelEx2Logic_representation

Of the rules I mentioned, only the "that-rule" has post-processing implemented, since the other rules are still under discussion.

On the other hand, all-rule, maybe-rule, etc. would probably still require post-processing, I assume, since they do not simply bring links together but actually modify them.

amebel commented 10 years ago

But how is that different from

EvaluationLink PredicateNode "after" EvaluationLink have@123 she@123 lunch@123 Evaluation go@123 she@123 home@123

This one says, roughly,

(B) "The event in which 'the instance of having called have@123 occurs between the instance of she called she@123 and the instance of lunch called lunch@123' occurred after the event in which 'the instance of going called go@123 occurs between the instance of she called she@123 and the instance of home called home@123'"

I don't get how a non-temporal instance she@123 can occur between other instances. Or are you relating the planar graph with the temporal semantics?

In line with (A), I interpreted the above as:

The instance of she called she@123 had a lunch called lunch@123 after she called she@123 go to a home called home@123.

What am I missing?

bgoertzel commented 10 years ago

Ah, sorry for confusing language.... I just meant


(B) "The event in which 'the instance of having called have@123 occurs with first argument being the instance of she called she@123 and with second argument being the instance of lunch called lunch@123' occurred after the event in which 'the instance of going called go@123 occurs with first argument being the instance of she called she@123 and with second argument being the instance of home called home@123'"


-- no temporality was implied...

-- ben


bgoertzel commented 10 years ago


Hmm, so if we didn't do post-processing, what would the "final" representation of

" I know that he stupidly thinks that she bought the cake. "

look like?


williampma commented 10 years ago

Kind of like the "EvaluationLink after have go" solution, I would imagine:

EvaluationLink
   know
   ListLink I

EvaluationLink
   thinks
   ListLink he

InheritanceLink thinks stupidly

EvaluationLink
   bought
   ListLink she cake

EvaluationLink
   that
   ListLink
      know
      thinks

EvaluationLink
   that
   ListLink
      thinks
      bought

linas commented 10 years ago

To repeat myself:

-- The goal here is NOT to precisely reconstruct sentences, but rather, to preserve as much of the original sentence as reasonable/possible during these various transformations. That is, don't discard or blur information prematurely.

What Ben is saying is that, by breaking the sentence apart into many distinct clauses, the origin of the clauses quickly becomes obscured. As later sentences come in and supply additional facts, and PLN does some basic reasoning, it becomes muddier and muddier as to which sentence asserted which facts. It becomes hard to distinguish the facts that were deduced (by PLN) from the facts that were asserted (by the sentences). Thus, it seems that we should try to avoid this muddying for as long as possible. By avoiding such muddying, it should be possible to make more accurate deductions, and more easily catch contradictions and inconsistencies, and resolve them.

capiche?

bgoertzel commented 10 years ago

Assuming the terms that span multiple relationships such as "that" and "thinks" have specifications like that@123 or think@567, I think this is OK ... so long as there is also a SetLink or similar telling you that these all came from the same sentence...


amebel commented 10 years ago

The final outputs of R2L are

(InterpretationLink
    (InterpretationNode sentence@123_parse_0_interpretation_$X)
    (ParseNode sentence@123_parse_0))

(ReferenceLink
    (InterpretationNode sentence@123_parse_0_interpretation_$X)
    (SetLink
        r2l-outputs))

This is in line with http://wiki.opencog.org/w/Linguistic_interpretation
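As a rough illustration of this wrapping step, here is a Python sketch that builds the two links as nested tuples. The function name `wrap_r2l_output` and its parameters are hypothetical; only the node naming pattern is taken from the snippet above.

```python
def wrap_r2l_output(sentence_id, parse_index, interpretation_id, r2l_outputs):
    """Tie a set of R2L output links to the parse they were derived from."""
    parse_name = f"{sentence_id}_parse_{parse_index}"
    parse = ("ParseNode", parse_name)
    interpretation = ("InterpretationNode",
                      f"{parse_name}_interpretation_{interpretation_id}")
    return [
        ("InterpretationLink", interpretation, parse),
        ("ReferenceLink", interpretation, ("SetLink", *r2l_outputs)),
    ]


outputs = [("EvaluationLink", "after", "have@123", "go@123")]
interpretation_link, reference_link = wrap_r2l_output("sentence@123", 0, 0, outputs)
```

The SetLink then records exactly which links came from this sentence, which is what the earlier discussion asked for.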