phoe-trash / value-semantics-utils

4 stars 2 forks source link

Utilities to update a set of object while maximising structual sharing #4

Open fstamour opened 2 years ago

fstamour commented 2 years ago

From discord (discord link):

So @phoe, I was reading stuff from https://github.com/phoe-trash/value-semantics-utils, and it got me thinking.

For the lisp reader (that I keep talking about), I was on the fence on whether keep using classes, or using structs, or going with functional data structures for representing the syntax tree. They all got advantages and disadvantages.

I was wondering if you had an idea on how to represent a tree with classes, but keeping a "very functional" behaviour... For example, it really is easier (for me at least) to do mutation when trying to modify something deep in the tree, but it's hard to do so while "maximizing" the sharing. Said otherwise, I don't want to copy the whole tree for every variations of the tree.

I think your object-with-value-semantics could be a pretty good start/lead.

Any comments/idea/rant on that wall of text?


(cl:in-package #:common-lisp-user)

#+ (or)
(ql:quickload '(#:value-semantics-utils))
;; ^^^ Failed, can't find `with-macroexpand-time-branching`

(defpackage #:functional-classes
  (:use #:cl))

(in-package #:functional-classes)

;; So, let's say I have this tree that I want to modify

(defparameter *tree*
  (copy-tree ; because mutating a literal is not ideal
   `(defpackage #:functional-classes
      (:use #:cl))))

;; Let's say I wan't to add something, but keep the original intact.
;; The naive way is just to copy the whole thing and do the mutation.

(defun be-paranoid (a b)
  "Signals an error when A and B are not equalp."
  (unless (equalp a b)
    (error "What!? these should be the equivalent!")))

(defparameter *new-tree*
  (let ((new-tree (copy-tree *tree*)))
    (be-paranoid new-tree *tree*)
    (setf (cdddr new-tree) `((:local-nicknames (:a :alexandria))))
    new-tree))
#+ (or)
(DEFPACKAGE #:FUNCTIONAL-CLASSES
  (:USE #:CL)
  (:LOCAL-NICKNAMES (:A :ALEXANDRIA)))

;; But this method (copying everything) fails to take advantage of
;; possible strucutral sharing.
;;
;; For example, both trees have equivalent third element.
(third *tree*)
;; => (:USE #:CL)
(equalp (nth 2 *tree*)
    (nth 2 *new-tree*))
;; => T

;; But they aren't actually the same
(eq (nth 2 *tree*)
    (nth 2 *new-tree*))
;; => NIL

;; Concluion: it would be nice to have some utilies to copy just what
;; needs to be copied.

;;; The problem is worst with nested objects as we need some method
;;; for deep copying the tree...
;;;
;;; See https://codereview.stackexchange.com/q/156392 for a possible
;;; implementation of a generic deep-copy.

(use-package :breeze.reader)
(use-package :breeze.refactor)

(parse-string
 "(defpackage #:functional-classes
      (:use #:cl))")
#|
(#<LIST-NODE (#<SYMBOL-NODE "(" "defpackage">
                #<SYMBOL-NODE " " "#:functional-classes">
                #<LIST-NODE "
      " (#<SYMBOL-NODE "(" ":use">
               #<SYMBOL-NODE " " "#:cl">) :raw "(:use #:cl)">) :raw "(defpackage #:functional-classes
      (:use #:cl))">)
|#

;;; Can't even use copy-tree in this case...

From @phoe on discord (html link discord link) - Heavily paraphrased:

Make a utility that, given a root object for accessing a graph and a list of objects that you'd like to mutate, returns a new root with just enough copying™, and references to these new objects so you can immediately mutate them.

phoe commented 2 years ago

Should be possible to perform this "what do we copy?" algorithm in O(n) time with two walks in total.

  1. DFS-walk the graph.
    • Start with a graph whose all nodes are marked white.
    • Remember the nodes along the path, starting from the root object.
    • As soon as we hit an object to be mutated, mark all nodes between the root node and the object as black.
    • As soon as we exhaust the list of all to-be-mutated objects, stop the walk since there is nothing left to do.
  2. (only if cycle detection is true) DFS-walk the graph once more.
    • Start with a graph whose root node is marked black (as a result of walk 1).
    • As soon as we hit a white node, start remembering the path.
    • As soon as we hit a white node that we have visited before, backtrack.
    • As soon as we hit a black node, mark all nodes along the path as black and backtrack.

At the end of this algorithm:

I do not yet know if that algorithm is correct.

phoe commented 2 years ago

@marcoheisig comes to help! Thanks.

(defun supersubst (graph substitutions)
  (declare (hash-table substitutions))
  (let ((parent-table (make-hash-table))
        (color-table (make-hash-table)))
    (macrolet ((parents (node) `(gethash ,node parent-table '()))
               (color (node) `(gethash ,node color-table :white)))
      ;; Determine the parents of each node.
      (do-graph-nodes (node graph)
        (do-node-children (child node)
          (push node (parents child))))
      ;; Mark nodes that must be copied as black.
      (maphash
       (lambda (old new)
         (let ((worklist (list old)))
           (loop until (null worklist) for node = (pop worklist) do
             (unless (eq (color node) :black)
               (setf (color node) :black)
               (dolist (parent (parents node))
                 (push parent worklist))))))
       substitutions)
      ;; TODO Actually copy the graph.
      )))

TODO: old in the hash table should be a function of some sort, because we cannot locate nodes just by eq. Something along these lines, bla bla structure sharing. I need to concretify it later.

phoe commented 2 years ago

this algorithm needs a bit more more information! and I think I'll figure it out let's consider our favorite diamond, A → B → D, A → C → D and I'd like to only modify the node that B points to so this becomes a non-diamond, A' → B' → D', A' → C → D but I can't simply mark D as black because that would unnecessarily copy C so I can't really use eq for nodes, I think need to use parent+edge combo to identify which nodes to copy and that's doable