Shen-Language / shen-sources

Shen language kernel sources for porters
Other
354 stars 41 forks source link

Shen type checker (tc +) problem #33

Closed bluejay77 closed 7 years ago

bluejay77 commented 7 years ago

I have in my hands a proprietary program, written by somebody else. Therefore, I cannot post it here in its entirety.

The program is something that I currently am developing. It is the AI resolution algorithm, which is one of the best known theorem proving functionalities.

It will make the CLisp Shen 19.2 crash, and it will make the SBCL Shen 19.1 inifinite loop, so that the program needs to be halted with a Control-C.

Dr Tarver recommended for me to use the (spy +) expression, to see the cognitions of the Shen type checker. I did so. Below comes the code which will make the CLisp Shen crash, and the SBCL Shen seemingly go into an infinite loop.


(define resolve
  {clause --> clause --> (list clause)}
    C1 C2 -> (let SC1 (stpart C1)
                  Product (cartprod SC1 C2)
                  (rrh SC1 C2 Product)))

(define rrh
  {clause --> clause --> (list (wff * wff)) --> (list clause)}
   _ _ [] -> []
   C1 C2 [(@p W1 W2) | Pairs] ->
      (let MGU (complements W1 W2)
          (if (unify-fails? MGU)
              (rrh C1 C2 Pairs)
              (let Clause (union (deref-clause C1 MGU) 
                                 (deref-clause C2 MGU))
                   Literals [(deref-wff W1 MGU) (deref-wff W2 MGU)]
                   Result (difference Clause Literals)
                   [Result | (rrh C1 C2 Pairs)]))))

From looking into the cogitations of the Shen type checking functionality with the (spy +), I have seen that the code

[(@p W1 W2) | Pairs]

in the function (rrh ...) will make the Shen type checker infinite loop, even though the said code above is perfectly correct Shen.

Therefore, I have speculated that there could be something wrong with the Shen type checker (tc +), wrt to the pairs.

The selfsame phenomenon will occur with another program, with the code

(list (number * string))

so that once again, I suspect the Shen type checker (tc +) processing pairs. That another program is the ~2200 lines ANN N-level backpropagation algorithm.

I do admit that this is a fairly weak description of the issue, but I would need to know the Shen type checker and its output with the (spy +) significantly better, in order to be able to create a better Github Shen issue entry.

yours, Dr AJY Helsinki, Finland, the EU

doublec commented 7 years ago

Are you able to do a reduced test case? Slowly remove code to narrow it down to a smaller code sample you can share?

bluejay77 commented 7 years ago

Yes; I can.

Following comes a relatively short program, which contains no pairs, but which however will make the OS Shen v. 20 (tc +) infinite loop.


\\ Resolution theorem proving
\\
\\ Dr Antti Ylikoski and Dr Mark Tarver 2015-10-05--
\\
\\ There almost certainly is a bug in the Shen type checker, if one
\\ does
\\ (tc +)
\\ (possibly) (spy +)
\\ (load "resinfloop.shen") \\ This file here
\\ the Shen type checker will infinite loop in the (mk-th ...)
\\ function, which is a very simple one, and should cause no
\\ problems. 
\\

(package resolution [ mk-th ]

(datatype term

  F : symbol; X : (list term);
  =============================
  [F | X] : term;

  F : number;
  ___________
  F : term;

  F : string;
  ___________
  F : term;

  F : symbol;
  ___________
  F : term;

  F : boolean;
  ____________
  F : term;)

(datatype wff

P : symbol;
___________
P : wff;

P : wff;
============
[~ P] : wff;

if (element? C [v => <=> &])
P : wff; Q : wff;
=================
[P C Q] : wff;

if (element? Q [all exists])
X : symbol; P : wff;
========================
[Q X P] : wff;

F : symbol; X : (list term);
============================
[F | X] : wff;)

(datatype wff2 \\ Substitute for (wff * wff), because of a  problem w/
               \\ (tc +) 
A: wff;
B: wff;
========================================
[ A B ]: wff2;
)

(datatype term2 \\ Substitute for (term * term)
A: term;
B: term;
==================================================
[ A B ]: term2;
)

\\ Type theory of (map ...) eg
\\ (map (/. X (deref-wff X MGU)) C)

(datatype maptype
________________________________________
map : ((A --> B) --> (list A) --> (list B));
)

(datatype wff-set
Wffs: (list wff);
==================================================
Wffs: wff-set;
)

\\ The theory abstract datatype:

(datatype theory
ThN: string;  \\ The string name of the theory
PTh: wff-set; \\ Predicate logic form theory, to prove P2P from
P2P: wff;     \\ Predicate logic form theorem to prove
C2P: wff-set; \\ Clauses form of theorem to prove
SOS: wff-set; \\ Wffs which constitute the Set of Support
Cl:  wff-set; \\ The set of all clauses (intentionally slightly redundant)
MaxR: number; \\ The maximum number of resolutions which are run
==================================================
[   ThN
    PTh
    P2P
    C2P
    SOS
    Cl
    MaxR ]: theory;
)

(define mk-th \\ Make Theory
     { string --> wff-set --> wff --> wff-set --> wff-set -->
        wff-set --> number --> theory }
    ThN
    PTh
    P2P
    C2P
    SOS
    Cl
    MaxR ->
        [   ThN
            PTh
        C2P
            P2P
            SOS
            Cl
            MaxR ])

)

Do as follows:

shen
(tc +)
\\ optionally (spy +)
(load "resinfloop.shen")  \\ The above file

and the OS Shen 20 will crash at an infinite loop in the type checker.

yours, AJY

bluejay77 commented 7 years ago

The Github has done something very weird with the Shen code that I attempted to post.

Is there a functionality there to post verbatim program code?

AJY

tizoc commented 7 years ago

@bluejay77 edit the comment with the code, and wrap the code using three backquotes (`) at the beginning and three at the end.

tizoc commented 7 years ago

@bluejay77 you have two of the elements of the lists flipped in your mk-th constructor.

(define mk-th \\ Make Theory
     { string --> wff-set --> wff --> wff-set --> wff-set -->
       wff-set --> number --> theory }
    ThN
    PTh
    P2P
    C2P
    SOS
    Cl
    MaxR ->
        [   ThN
            PTh
            P2P  \\ was C2P -- flipped
            C2P  \\ WAS P2P -- these two
            SOS
            Cl
            MaxR ])

As for Shen crashing when the code has not been corrected, I don't know, I would have to understand the type theory you are defining there to be able to tell if something was missing (like a cut for example) or there is some conflict to be able to tell.

tizoc commented 7 years ago

@bluejay77 for example, is this correct?

(4+) [s => f] : resolution.wff
[s => f] : resolution.wff

(5+) [s => f] : resolution.wff-set
[s => f] : resolution.wff-set
bluejay77 commented 7 years ago

The intention of the programmer (initially Dr Tarver) to my understanding was:


if (element? C [v => <=> &])
P : wff; Q : wff;
=================
[P C Q] : wff;

hence it should be that

(4+) [s => f] : resolution.wff
[s => f] : resolution.wff

is correct, but then:


(datatype wff-set
Wffs: (list wff);
==================================================
Wffs: wff-set;
)

which implies that it should have been:

(5+) [[s => f]] : resolution.wff-set
[[s => f]] : resolution.wff-set

ie. a wff-set (a set of well-formed formulas) could be a set with one wff, ie. a list with one member, one wff.

yours, AJY

bluejay77 commented 7 years ago

Thank you for correcting my mistake; however, with the two members of the list corrected, I still get the same:

Microsoft Windows [Version 10.0.14393]
(c) 2016 Microsoft Corporation. Kaikki oikeudet pidätetään.

c:\Users\Antti Ylikoski\Dropbox\ShenResolution>shen
shen

Shen, copyright (C) 2010-2015 Mark Tarver
www.shenlanguage.org, Shen 20.0
running under Common Lisp, implementation: SBCL
port 2.0 ported by Mark Tarver

(0-) (tc +)
true

(1+) (load "resinfloop.shen")

resolution.type#term : symbol
resolution.type#wff : symbol
resolution.type#wff2 : symbol
resolution.type#term2 : symbol
resolution.type#maptype : symbol
resolution.type#wff-set : symbol
resolution.type#theory : symbol
maximum inferences exceeded~%

(2+)

It is slightly wrong to say that Shen "crashes", when it simply will infinite loop and end with "maximum inferences exceeded".

yours, AJY

bluejay77 commented 7 years ago

The type theory in question is really simple. It contains that which is there in the "resinfloop.shen" file that I posted in the issue discussion, nothing more than that.

I myself cannot see any potential problem of complication, which would necessitate the use of a cut, or a similar device.

AJY

tizoc commented 7 years ago

@bluejay77 it loads fine for me after flipping the variables order, and the typechecking of mk-th gets resolved in 259 inferences. I'm using Shen-SBCL, running the kernel version 20.0.

If I don't fix the order, it enters an infinite loop, switching between trying to prove that C2P is a symbol and that it is a resolution.wff.

What I mentioned about cut is related to the example I gave in a previous comment, where the same value can be both a wff and a wwf-set, which I guess may be related to the typechecker getting into an infinite loop.

If you flip two other arguments that cannot be confused (ThN, a string and P2P a wff for example), the typechecker doesn't enter an infinite loop, because there is no ambiguity there, a string is never a wff and vice versa.

Does this make sense?

tizoc commented 7 years ago

Here is the output from my REPL and the copy of the file as I have it on my computer: https://gist.github.com/tizoc/591d5ea4154f681ad84371df3ced0964

bluejay77 commented 7 years ago

Here is the file resinfloop.shen as it is right now in my PC:

\\ Resolution theorem proving
\\
\\ Dr Antti Ylikoski and Dr Mark Tarver 2015-10-05--
\\
\\ There almost certainly is a bug in the Shen type checker, if one
\\ does
\\ (tc +)
\\ (possibly) (spy +)
\\ (load "resinfloop.shen") \\ This file here
\\ the Shen type checker will infinite loop in the (mk-th ...)
\\ function, which is a very simple one, and should cause no
\\ problems. 
\\

(package resolution [ mk-th ]

(datatype term

  F : symbol; X : (list term);
  =============================
  [F | X] : term;

  F : number;
  ___________
  F : term;

  F : string;
  ___________
  F : term;

  F : symbol;
  ___________
  F : term;

  F : boolean;
  ____________
  F : term;)

(datatype wff

P : symbol;
___________
P : wff;

P : wff;
============
[~ P] : wff;

if (element? C [v => <=> &])
P : wff; Q : wff;
=================
[P C Q] : wff;

if (element? Q [all exists])
X : symbol; P : wff;
========================
[Q X P] : wff;

F : symbol; X : (list term);
============================
[F | X] : wff;)

(datatype wff2 \\ Substitute for (wff * wff), because of a  problem w/
               \\ (tc +) 
A: wff;
B: wff;
========================================
[ A B ]: wff2;
)

(datatype term2 \\ Substitute for (term * term)
A: term;
B: term;
==================================================
[ A B ]: term2;
)

\\ Type theory of (map ...) eg
\\ (map (/. X (deref-wff X MGU)) C)

(datatype maptype
________________________________________
map : ((A --> B) --> (list A) --> (list B));
)

(datatype wff-set
Wffs: (list wff);
==================================================
Wffs: wff-set;
)

\\ The theory abstract datatype:

(datatype theory
ThN: string;  \\ The string name of the theory
PTh: wff-set; \\ Predicate logic form theory, to prove P2P from
P2P: wff;     \\ Predicate logic form theorem to prove
C2P: wff-set; \\ Clauses form of theorem to prove
SOS: wff-set; \\ Wffs which constitute the Set of Support
Cl:  wff-set; \\ The set of all clauses (intentionally slightly redundant)
MaxR: number; \\ The maximum number of resolutions which are run
==================================================
[   ThN
    PTh
    P2P
    C2P
    SOS
    Cl
    MaxR ]: theory;
)

(define mk-th \\ Make Theory
     { string --> wff-set --> wff --> wff-set --> wff-set -->
.       wff-set --> number --> theory }
    ThN
    PTh
    P2P
    C2P
    SOS
    Cl
    MaxR ->
        [   ThN
            PTh
            P2P
        C2P
            SOS
            Cl
            MaxR ])

)

And this is what it will produce:


Microsoft Windows [Version 10.0.14393]
(c) 2016 Microsoft Corporation. Kaikki oikeudet pidätetään.

c:\Users\Antti Ylikoski\Dropbox\ShenResolution>shen
shen

Shen, copyright (C) 2010-2015 Mark Tarver
www.shenlanguage.org, Shen 20.0
running under Common Lisp, implementation: SBCL
port 2.0 ported by Mark Tarver

(0-) (tc +)
true

(1+) (load "resinfloop.shen")

resolution.type#term : symbol
resolution.type#wff : symbol
resolution.type#wff2 : symbol
resolution.type#term2 : symbol
resolution.type#maptype : symbol
resolution.type#wff-set : symbol
resolution.type#theory : symbol
maximum inferences exceeded~%

(2+) 

I cannot think what could be wrong.

AJY

tizoc commented 7 years ago

@bluejay77 why does the type signature of mk-th have a dot? I thought it was an artifact of you pasting it the first time, but now I see it there again. Thats the only difference between your file and my copy.

bluejay77 commented 7 years ago

That dot "." does not belong in there. Almost certainly it is a random char erroneously added there by my editor. That has happened before, and has caused truly difficult to find errors.

It even has happened that the said editor program has erroneously added in the source file non-printing characters which were not shown by the selfsame editor. It was necessary to examine the source file almost bit by bit in order to discover that bug. (The editor in question is the GNU Emacs.)

Therefore, the actual Shen issue here is, that the type checker (tc +) functionality should be more so to speak, fault-tolerant, and should in such cases report an error, and not crash at "maximum inferences exceeded".

Below comes another file with a similar mysterious error:


(synonyms nitems dict)

(define nN
    { nitems --> number }
    NI -> (<-dict NI number))

(define nS
    { nitems --> string }
    NI -> (<-dict NI string))

(define sN
    { nitems --> number --> nitems }
    NI X -> (dict-> NI number X))

(define sS
    { nitems --> string --> nitems }
    NI X -> (dict-> NI string X))

Thank you, you have pointed out the problem!

yours, AJY

tizoc commented 7 years ago

Therefore, the actual Shen issue here is, that the type checker (tc +) functionality should be more so to speak, fault-tolerant, and should in such cases report an error, and not crash at "maximum inferences exceeded".

There isn't much Shen can do there. Having the type system give the user so much power has its disadvantages, and what you experienced with your code is a perfect example. In other type systems such things cannot happen, but you can't express as much as you can express in Shen's type system (in terms of tradeoffs, those type systems sacrifice expressivity for safety, Shen's type system sacrifices safety to obtain more expressivity).

Regarding the code you pasted now, dict itself is not a type, (dict K V) is (where K is the type of keys, and V the type of values on that dict, for example (dict string (list number)) is the type of a dict that maps from strings to lists of numbers).

tizoc commented 7 years ago

... having said that, you may want to ask on the list about how to avoid such situations, maybe Mark or someone else may have some technique to share to make it easier to avoid such situations.