phoe-trash / value-semantics-utils

4 stars 2 forks source link

linear cycle comparision #3

Closed Plisp closed 2 years ago

Plisp commented 2 years ago

This relies on the fact that if L1, L2 are circular lists with minimal cycles x, y respectively, (equal x y) <=> (eqv L1 L2)

(defun circle-list-length (list)
  (loop :with first-cons = list
        :for this-cons :on (cdr list)
        :for i :from 1
        :until (eq this-cons first-cons)
        :finally (return i)))

(defun smallest-cycle (list list-length)
  (loop :initially (or list (return 0))
        :repeat list-length
        :with states = (list) ; (length . current-phase)
        :for i :from 0
        :for this-cons = list :then (cdr this-cons)
        :for (val . next-cons) = this-cons
        :do (loop :initially (princ states) (terpri)
                  :with new-states ; prune values that no longer match
                  :for state :in states
                  :for (length . phase) = state
                  :do (setf phase (mod (1+ phase) length))
                      (when (= val (nth phase list))
                        (push (cons length phase) new-states))
                  :finally (setf states new-states))
            (when (= val (first list))
              (if (zerop i) ; can probably tweak order to avoid this
                  (push (cons list-length 0) states)
                  (push (cons i 0) states)))
        :finally (mapcar (lambda (c) (incf (cdr c))) states)
                 (let ((valid-cycles (loop :for state :in states
                                           :for (length . phase) = state
                                           :when (= phase length)
                                             :collect length)))
                   (princ states) (terpri)
                   (princ valid-cycles) (terpri)
                   (return (subseq list 0 (first (sort valid-cycles #'<)))))))

(defun list-eqv (l1 l2)
  (equal (smallest-cycle l1 (circle-list-length l1))
         (smallest-cycle l2 (circle-list-length l2))))

(require 'alexandria)
(assert (list-eqv '#1=(1 2 1 2 3 1 2 1 2 3 . #1#) ; shortest cycle (1 2 1 2 3)
                  '#2=(1 2 1 2 3 . #2#)))

(assert (list-eqv (alexandria:make-circular-list 99 :initial-element 1)
                  (alexandria:make-circular-list 100 :initial-element 1)))

proof of (equal x y) => (eqv L1 L2): they're equal within the circular-list-lengths, being repetitions of x and so they're equal everywhere proof of (eqv L1 L2) => (equal x y): if L1, L2 are eqv but their minimal cycles x, y have different lengths (if they have the same length they obviously must be equal or the first (length x) elements of L1, L2 don't match), then considering equalities over the first repeating LCM(x,y) elements reveals that GCD(x,y) is a shorter cycle, contradicting their minimality. (TODO needs some rigor) So (eqv L1 L2) iff (equal x y)

Plisp commented 2 years ago

Detail of part 2: Assuming the lengths of x,y are m,n, m < n, let x: a1 a2 a3 ... am y: b1 b2 b3 ... bn let gcd(m,n) = k Then considering the elements of the identical lists L1,L2 taking indices modulo m and n we have the equalities: a1 = b1 = a(1+m) = b(1+m) = b(1 + 2m)... ... ak = bk = b(k+m) = b(k + 2m)...

Let m = pk, n = qk with p,q coprime by definition Now a(k+1) = b(k+1 + cm) for any positive integer c, and in particular there exists a c such that k+1 + cm = 1 (mod n) as k + cm = 0 (mod n) certainly has a solution (there certainly there exists c such that 1+cp = q or cp = -1 (mod q) as all remainders cp are distinct mod q).

Then a(k+1) = b(k+1 + cm) = b1 = b(1+m) = b(1 + 2m)... and a similar argument applies to a(k+2) through a(k+k).

Hence L1 in fact has a cycle of period k < m, contradicting the minimality of m. This implies that x, y have the same length and therefore should be identical lists (as above).

phoe commented 2 years ago

The main issue I see with circle-list-length are structures like (1 2 3 4 . #1=(5 6 7 . #1#)) - we cannot know ahead of time if a data structure is part of some cycle, and therefore this function will hang on such input. We need to memoize all nodes just for the possibility of them being a part of some cycle that we are not yet aware of.

smallest-cycle also needs to run on a data structure that is confirmed to be a cycle - so, we first need to find a cycle at all (e.g. by noticing that a node has been visited for the second time in a row and remembering the path we used to reach it) in order to be able to minimize it.