opencog / atomspace

The OpenCog (hyper-)graph database and graph rewriting system
https://wiki.opencog.org/w/AtomSpace
Other
829 stars 233 forks source link

Backward Chainer is accessing deleted atoms! #1342

Closed linas closed 7 years ago

linas commented 7 years ago

Pull request #1341 is making the BackwardChainerUTest fail. This is very unexpected, and suggests that there's a bug in the backward chainer. Here's what is happening.

The guile "print" function was altered to not not print any atoms that are not in any atomspace. This fixes issue #127 . If the atom to be printed is not in any atomspace, then the scheme object holding that atom is cleared. More precisely, scheme objects are handles (not atoms), and if the scheme object is a handle that was pointing at a deleted atom, then the handle is set to null.

The unexpected fail of the BackwardChainerUTest suggests that somewhere inside the chainer, there are scheme expressions that are accessing deleted atoms. I'm pretty sure this is not the desired behavior. Ergo, this bug report.

ngeiswei commented 7 years ago

The BackwardChainer does construct atoms not in any atomspaces at construction time, then add them to an atomspace. The problem I'm encountering is that the AtomSpace doesn't seem to digest well constructed atoms that have sub-hypergraphs in common. Here's an example

(AndLink
  (QuoteLink
    (ImplicationScopeLink
      (UnquoteLink
        (VariableNode "$V-4fe729a2") ; [8925176964880640663][18446744073709551615]
      ) ; [18175253940150267673][18446744073709551615]
      (UnquoteLink
        (VariableNode "$P-42099628") ; [1261462666883942742][18446744073709551615]
      ) ; [10511539642153569752][18446744073709551615]
      (UnquoteLink
        (VariableNode "$Q-209c03b2") ; [1012035496929740829][18446744073709551615]
      ) ; [10262112472199367839][18446744073709551615]
    ) ; [10913738998627920077][18446744073709551615]
  ) ; [10939863394903317992][18446744073709551615]
  (EvaluationLink
    (GroundedPredicateNode "scm: crisp-contraposition-scope-precondition") ; [574661800842930277][1]
    (QuoteLink
      (ImplicationScopeLink
        (UnquoteLink
          (VariableNode "$V-4fe729a2") ; [8925176964880640663][18446744073709551615]
        ) ; [18175253940150267673][18446744073709551615]
        (UnquoteLink
          (VariableNode "$P-42099628") ; [1261462666883942742][18446744073709551615]
        ) ; [10511539642153569752][18446744073709551615]
        (UnquoteLink
          (VariableNode "$Q-209c03b2") ; [1012035496929740829][18446744073709551615]
        ) ; [10262112472199367839][18446744073709551615]
      ) ; [10913738998627920077][18446744073709551615]
    ) ; [10939863394903317992][18446744073709551615]
  ) ; [13008747887769101668][18446744073709551615]
) ; [14925782981505009964][18446744073709551615]

what happens is that the sub-hypergraph

  (QuoteLink
    (ImplicationScopeLink
      (UnquoteLink
        (VariableNode "$V-4fe729a2") ; [8925176964880640663][18446744073709551615]
      ) ; [18175253940150267673][18446744073709551615]
      (UnquoteLink
        (VariableNode "$P-42099628") ; [1261462666883942742][18446744073709551615]
      ) ; [10511539642153569752][18446744073709551615]
      (UnquoteLink
        (VariableNode "$Q-209c03b2") ; [1012035496929740829][18446744073709551615]
      ) ; [10262112472199367839][18446744073709551615]
    ) ; [10913738998627920077][18446744073709551615]

gets inserted twice in the TypeIndex under 2 different Atom addresses. I am almost certain this duplication occurs due to fact that this sub-hypergraph appears twice in the children of the atom to add.

@linas I think this is a bug, not a feature, right?

linas commented 7 years ago

Its a bug. No atom should ever appear twice in the type index. I'm certainly surprised that such a bug is even possible by now; that code has been pretty tight for a long time now. The only recent changes to it that might destablize things would be the quote/unquote support you added -- make sure that quote/unquote is not skipping important verification steps. I can also vaguely day-dream some sort of weird multi-atomspace failure scenario ...

I'm willing to debug this, if there is some reproducible way to get at it.

ngeiswei commented 7 years ago

It can be reproduced here https://github.com/ngeiswei/atomspace/blob/fix-type-index-add-atom/tests/atomspace/AtomSpaceUTest.cxxtest#L984

I welcome help, but I'll try to fix it myself.

ngeiswei commented 7 years ago

The problem seems to be that the wrong hash is stored in _atom_store (the hash of a scope is different whether it is being quoted or not). Still need to understand what is the right fix...

ngeiswei commented 7 years ago

Removing unquoted from https://github.com/opencog/atomspace/blob/master/opencog/atomspace/AtomTable.cc#L293 fixes the problem, and doesn't break unit tests.

I don't feel confident about such a fix however as I don't have the full picture of how it all works.

linas commented 7 years ago

removing the unquoted does not smell right to me. However, I was never comfortable with how quote/unquote worked, it never seemed very compatible with atomese. Looking now.

linas commented 7 years ago

Well, after the recent merges, BackwardChainerUTest sometimes, but not always segfaults in

https://github.com/opencog/atomspace/blob/master/tests/rule-engine/BackwardChainerUTest.cxxtest#L580

linas commented 7 years ago

What is bizarre about this is that the handle itself is null (rather than being a handle pointing to a null atom). Which implies that somehow a null pointer was inserted into the HandleSeqSet traces = bc._trace_recorder.traces(); at line 545. which should be impossible, because HandleSeq's should always contain handles. Here is what I mean:

Thread 1 "BackwardChainer" received signal SIGSEGV, Segmentation fault.
opencog::Handle::value() const (this=this@entry=0x0)
    at /atomspace-git/opencog/atoms/base/Handle.cc:38
38  {
(gdb) bt
#0  opencog::Handle::value() const (this=this@entry=0x0)
    at /atomspace-git/opencog/atoms/base/Handle.cc:38
#1  0x0000555555565d7f in BackwardChainerUTest::test_criminal() (
    this=0x555555782480 <_ZL26suite_BackwardChainerUTest.lto_priv.92>)
    at /atomspace-git/build/tests/rule-engine/../../../tests/rule-engine/BackwardChainerUTest.cxxtest:580

It should be impossible to ever have a Handle this=0x0 yet that is what the above shows. So I really don't get it.

ngeiswei commented 7 years ago

I ran 2 dozens of times on 2 different systems but cannot reproduce this.

If you can reproduce it can you paste here the traces? Just look for the string [DEBUG] Traces: in the stdout.

linas commented 7 years ago

I'm running Debian stable. It has a newer compiler than Ubuntu stable, I believe. gcc --version gcc (Debian 6.3.0-18) 6.3.0 20170516

There is only one line that says Traces. From that line to the crash:


[DEBUG] Traces:
size = 0
[WARN] validate_clauses: Constant clauses removed from pattern (BindLink
  (UnquoteLink
    (VariableNode "$TyVs")
  )
  (AndLink
    (UnquoteLink
      (LocalQuoteLink
        (LocalQuoteLink
          (VariableNode "$P")
        )
      )
    )
    (EvaluationLink
      (GroundedPredicateNode "scm: true-enough")
      (UnquoteLink
        (VariableNode "$P")
      )
    )
  )
  (ExecutionOutputLink
    (GroundedSchemaNode "scm: conditional-full-instantiation-formula")
    (UnquoteLink
      (ListLink
        (VariableNode "$Q")
        (QuoteLink
          (ImplicationScopeLink
            (UnquoteLink
              (VariableNode "$TyVs")
            )
            (UnquoteLink
              (VariableNode "$P")
            )
            (UnquoteLink
              (VariableNode "$Q")
            )
          )
        )
        (VariableNode "$P")
      )
    )
  )
)

[WARN] validate_clauses: Removed (UnquoteLink
  (LocalQuoteLink
    (LocalQuoteLink
      (VariableNode "$P")
    )
  )
)

[WARN] validate_clauses: Constant clauses removed from pattern (BindLink
  (UnquoteLink
    (VariableNode "$TyVs")
  )
  (AndLink
    (UnquoteLink
      (LocalQuoteLink
        (LocalQuoteLink
          (VariableNode "$P")
        )
      )
    )
    (EvaluationLink
      (GroundedPredicateNode "scm: true-enough")
      (UnquoteLink
        (VariableNode "$P")
      )
    )
  )
  (ExecutionOutputLink
    (GroundedSchemaNode "scm: conditional-full-instantiation-formula")
    (UnquoteLink
      (ListLink
        (VariableNode "$Q")
        (QuoteLink
          (ImplicationScopeLink
            (UnquoteLink
              (VariableNode "$TyVs")
            )
            (UnquoteLink
              (VariableNode "$P")
            )
            (UnquoteLink
              (VariableNode "$Q")
            )
          )
        )
        (VariableNode "$P")
      )
    )
  )
)

[WARN] validate_clauses: Removed (UnquoteLink
  (LocalQuoteLink
    (LocalQuoteLink
      (VariableNode "$P")
    )
  )
)
linas commented 7 years ago

I see, the crash is "obvious", this fixes it:

--- a/tests/rule-engine/BackwardChainerUTest.cxxtest
+++ b/tests/rule-engine/BackwardChainerUTest.cxxtest
@@ -576,8 +576,11 @@ void BackwardChainerUTest::test_criminal()

        // Extract the first trace, as understandable by the trace fitness
        std::set<ContentHash> trace;
-       for (const Handle& h : *(traces.begin()))
-               trace.insert(h.value());
+       if (traces.begin() != traces.end())
+       {
+               for (const Handle& h : *(traces.begin()))
+                       trace.insert(h.value());
+       }
        AndBITFitness trace_fitness(AndBITFitness::Trace, trace);
        BackwardChainer tr_bc(_as, top_rbs, target, vardecl, nullptr, nullptr,
                              Handle::UNDEFINED, BITNodeFitness(), trace_fitness);

Do you really mean to use just the first trace? It seems fishy to me.

ngeiswei commented 7 years ago

What is fishy is that there are no traces recorded at all. (it uses the first one but any trace would do).

Does the problem go away is you increase the number of iterations

https://github.com/opencog/atomspace/blob/master/tests/rule-engine/bc-criminal-config.scm#L49

say to 1000 or so?

Can you set the ure logger level to debug and send/attach the log?

linas commented 7 years ago

Also, perhaps we should re-discuss the use of quote and unquote; perhaps it might be better if these were replaced by some sort of MetaBindLink that did "the right thing". I don't know what "the right thing" would be here, but it seems increasingly clear that quote-unquote is not a good solution to whatever the problem is.

linas commented 7 years ago

how do I set the ure logger debug level?

ngeiswei commented 7 years ago

https://github.com/opencog/atomspace/blob/master/tests/rule-engine/BackwardChainerUTest.cxxtest#L65

linas commented 7 years ago

setting iterations to 1500 still gives [DEBUG] Traces: size = 0

ngeiswei commented 7 years ago

That can't be right, in that case the log would be useful to me.

linas commented 7 years ago

github doesn't seem to have any way of attaching log files ... right?

ngeiswei commented 7 years ago

I think you can drag/drop any file to the comment area.

misgeatgit commented 7 years ago

@linas you can use gist.github.com

linas commented 7 years ago

https://pastebin.com/nGyg8Y4a

linas commented 7 years ago

I can't figure out how to use gist.github.com .. what's the secret for putting stuff into it?

ngeiswei commented 7 years ago

I would need you to send me the opencog.log, cause printing the ure log on the stdout is gonna be too big. You'll need to compress it I think. But you can just drop it in you github message, it supports any file format.

linas commented 7 years ago

look at the pastebin. I believe it is the entire log.

linas commented 7 years ago

The entire log, start to finish, is 1182 lines long, this includes setting URE to DEBUG, and changing iterations to 1500. I just cut-n-paste it into pastebin, but cut-n-paste doesn't seem to work for gists.github nor any other sequence of magic keystrokes that I could muster.

ngeiswei commented 7 years ago

No, you'd need to enable stdout https://github.com/opencog/atomspace/blob/master/tests/rule-engine/BackwardChainerUTest.cxxtest#L68 but if you do so the size is gonna explode.

Better lower the number iterations back to 500, the log will be smaller that way.

linas commented 7 years ago

Sorry I don't know how to "drag and drop" a file. WTF. drag and drop it from where, exactly? I'll just email it.

linas commented 7 years ago

attached to email. its 700Kbytes

On Wed, Aug 30, 2017 at 10:22 PM, ngeiswei notifications@github.com wrote:

No, you;d need to enable stdout https://github.com/opencog/ atomspace/blob/master/tests/rule-engine/BackwardChainerUTest.cxxtest#L68 but if you do so the size is gonna explode.

Better slower the number iterations back to 500, the log will be smaller that way.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/opencog/atomspace/issues/1342#issuecomment-326178777, or mute the thread https://github.com/notifications/unsubscribe-auth/AAFwoBv2O2QP3wmOZF_VUgj2aw7wLj-1ks5sdicHgaJpZM4O6ooE .

-- "The problem is not that artificial intelligence will get too smart and take over the world," computer scientist Pedro Domingos writes, "the problem is that it's too stupid and already has."

ngeiswei commented 7 years ago

The log looks fine. I have no clue what is going on. It runs fine on my systems

I see no reason it wouldn't work on Debian stretch. I could either tailor some log messages for debugging that, or perhaps install Debian stretch in a VM, or log into your system.

ngeiswei commented 7 years ago

Debian stretch doesn't run in a VM. I can log but then the window manager freezes...

linas commented 7 years ago

Hmm. I just did a make clean :-) and now it works.

linas commented 7 years ago

Oh great. Now ./tests/rule-engine/ForwardChainerUTest fails once very 3-4 tries. It's never failed before.

/build/tests/rule-engine/../../../tests/rule-engine/ForwardChainerUTest.cxxtest:135: Error: Expected (results.find(Fritz_green) != results.end()), found ({ 00 00 00 00 00 00 00 00  })
linas commented 7 years ago

It failed the 1st, 5th and 27th time I ran it. So 22 times in a row, it passed ...