ashinn / chibi-scheme

Official chibi-scheme repository
Other
1.21k stars 142 forks source link

SRFI 231: array-copy is not call/cc-safe #904

Open gambiteer opened 1 year ago

gambiteer commented 1 year ago

I believe array-copy is implemented as array-copy!, re-entering a getter's continuation affects results that have already been returned.

Example:

heine:~/programs/chibi-scheme> cat test-cont1.scm
(import (scheme base)
        (scheme write)
        (srfi 231))

(define (pp arg) (display arg) (newline))

(pp 'array-copy)

(let* ((cont #f)
       (call-cont #t)
       (domain (make-interval '#(2 2)))
       (A_ (lambda (i j)
             (call-with-current-continuation
              (lambda (c)
                (if (= i j 0)
                    (set! cont c))
                1))))
       (A (make-array domain A_))
       (array-list '()))
  (let ((temp (array-copy A)))
    (set! array-list (cons temp array-list)))
  (pp 'printing)
  (for-each (lambda (A) (pp (array->list* A))) array-list)
  (if call-cont
      (begin
        (set! call-cont #f)
        (cont 4)))
  (pp (apply eq? array-list)))

Chibi returns

heine:~/programs/chibi-scheme> env LD_LIBRARY_PATH=/usr/local/chibi/lib /usr/local/chibi/bin/chibi-scheme test-cont1.scm
array-copy
printing
((1 1) (1 1))
printing
((4 1) (1 1))
((4 1) (1 1))
#t

The sample implementation added to Gambit returns

> heine:~/programs/chibi-scheme> gsi test-cont1.scm
array-copy
printing
((1 1) (1 1))
printing
((4 1) (1 1))
((1 1) (1 1))
#f
ashinn commented 1 year ago

It makes sense for iterators and higher-order functions to be call/cc safe, because they are generic and should compose with continuations.

This is not the case for array-copy, where the only problem arises if the element getter itself uses continuations. It's difficult to think of a non-contrived example of this. Moreover the proper implementation is prohibitively inefficient. The entire array needs to be converted to a list and then back to a new array. It's like we've thrown out the baby with the bathwater - we've solved one problem that will never happen and introduced serious practical limitations on array sizes. I realize array-copy! doesn't have this problem but people are more likely to just use array-copy without awareness of the danger.

(In either case the ! suffix seems misleading - we are not mutating or taking ownership of any of the arguments. I'm not sure how to indicate this though.)

gambiteer commented 1 year ago

(array-map f array) returns a generalized array, with the new getter being the composition of f and (array-getter array), which may later be manipulated by array-copy, array-fold-left, etc. So array-copy, etc., implicitly must deal with things that look like higher-order functions.

There was a lot of discussion (scores of emails) in July--September 2022 on the SRFI 231 mail list about these things, and Marc Nieper-Wißkirchen argued strongly, and convinced me to change a lot of code, to make the default procedures be "call/cc safe".

I suppose the current discussion could move to the SRFI 231 mail list.

Edit: The sample implementation converts generalized array arguments to array-copy, array-stack, etc., to specialized arrays in a call/cc-safe way; it leaves specialized array arguments alone, as a specialized array's getter (which the implementation has complete control over) does not call call/cc. So it tries not to be too inefficient.

mnieper commented 1 year ago

Continuations can implicitly appear in the getter if, say, the programmer uses a construct like the amb operator. As Scheme has first-class continuations, the general procedure should work with it. There are the procedures ending with ! where no compromises must be made with respect to efficiency.

What "call/cc" actually means is that (using continuations) one cannot detect mutation of previously allocated locations. Thus the suffix "!" seems correct to me.

ashinn commented 1 year ago

What "call/cc [safety]" actually means is that (using continuations) one cannot detect mutation of previously allocated locations.

Thanks for the definition. It's important to be precise in specifying new concepts. We should be a little more precise - what we want to detect is mutations made by the procedure itself to allocations it made itself, and not by any of its arguments or the continuations involved.

R7RS map is not guaranteed to have this property, though the "natural" implementation does.

R7RS vector-map is neither guaranteed nor likely to have this property.

What's so special about arrays that they require this extra protection not available in the rest of the language?

mnieper commented 1 year ago

The definitions of map and vector-map in R6RS and R7RS given in prose do not speak of any mutations of previously (wrt the mutation) allocated locations. Instead, what they say is that elements are generated (in some unspecified order), and only then a vector of the elements is (created and) returned.

Now saying that just because they are not forced to mutate doesn't mean that they are not allowed to would be the same as saying that just because + is not forced to print "Hello World!", doesn't mean it is not allowed to.

(If map and vector-map didn't have this property, they wouldn't work well together with other elements of the Scheme programming languages, namely first-class continuations, and this would go against basic principles of Scheme about unnecessary restrictions.)

This was already discussed outside of the context of SRFI 231 with Marc Feeley who also gave an efficient implementation of vector-map.

ashinn commented 1 year ago

I'm curious as to what this efficient implementation is. The implementation in SRFI 133 is not call/cc safe: https://github.com/scheme-requests-for-implementation/srfi-133/blob/master/vectors/vectors-impl.scm#L823

mnieper commented 1 year ago

Marc Feeley gives the following implementation of vector-map (which can be found in Gambit), restricted to the one-vector case for simplicity:

 (define (vector-map-1 proc input-vect)

   (define (vmap-1 proc input-vect i)
     (if (< i (vector-length input-vect))
         (let* ((result (proc (vector-ref input-vect i)))
                (output-vect (vmap-1 proc input-vect (+ i 1))))
           (vector-set! output-vect i result)
           output-vect)
         (make-vector i)))

   (vmap-1 proc input-vect 0))

There is no GC pressure because the stack is used to build the list of intermediate values. It is a very clever solution in my opinion.

All credit goes to Marc; he was the one who called attention to this issue.

Fixing the implementations and making the specifications of SRFI 133 and some other SRFIs is on the TODO list for the large language.

Efficiency (or the lack thereof) is not really an argument, in my opinion; first of all, an optimizing implementation will have to inline vector-map in tight loops anyway to bring it on par with a hand-written do loop, and then there is vector-map!, which one can use instead.

ashinn commented 1 year ago

This probably performs well in benchmarks for small vectors, but you risk blowing the stack. Building an intermediate list is safer, but I realized we can just copy the vector at the end so the mutated locations are never seen:

(define (vector-map proc vec)
  (let ((res (make-vector (vector-length vec))))
    (do ((i (- (vector-length vec) 1) (- i 1)))
        ((negative? i) (vector-copy res))
      (vector-set! res i (proc (vector-ref vec i))))))

Now if this is inlined and proc is trivial (like fx+ a constant) it's only twice as slow as the non-call/cc safe version instead of pathologically slow.

Efficiency (both in time and space) is not everything but it matters. That's why we have TCO.

My primary use case for SRFI 231 is neural networks. If we have something as simple as:

(array-copy (array-map + a b))

where a and b are large f8 arrays, the space usage is at least 16x as much for the current SRFI 231 reference implementation, and twice as much using my approach. Being limited to half your available memory is not good.

gambiteer commented 1 year ago

This probably performs well in benchmarks for small vectors, but you risk blowing the stack.

I compared the speed of the current sample implementation of the SRFI and also a multi-dimensional analogue of Feeley's method for vector-map. On my machine, the current implementation, using an intermediate list, was faster than Feeley's technique, and it was useful for other purposes (array->list and array->list*), so I used it. On Gambit, the stack expands until it hits the heap, which starts a GC.

Building an intermediate list is safer, but I realized we can just copy the vector at the end so the mutated locations are never seen:

(define (vector-map proc vec)
  (let ((res (make-vector (vector-length vec))))
    (do ((i (- (vector-length vec) 1) (- i 1)))
        ((negative? i) (vector-copy res))
      (vector-set! res i (proc (vector-ref vec i))))))

I previously thought about this implementation and concluded one can discover that mutation occurred by capturing two continuations, for both i < j, and one can tell whether the ith continuation was invoked before the jth. With enough time I may be able to reconstruct the example.

Now if this is inlined and proc is trivial (like fx+ a constant) it's only twice as slow as the non-call/cc safe version instead of pathologically slow.

Efficiency (both in time and space) is not everything but it matters. That's why we have TCO.

My primary use case for SRFI 231 is neural networks. If we have something as simple as:

(array-copy (array-map + a b))

where a and b are large f8 arrays, the space usage is at least 16x as much for the current SRFI 231 reference implementation, and twice as much using my approach. Being limited to half your available memory is not good.

Gambit doesn't support any flonum operations other than IEEE double (although it supports "native" f32vectors). Gambit doesn't keep flonums in registers across basic block boundaries, so (array-copy! (array-map + a b)) (not array-copy) will box three doubles (one each when accessing elements from a and b, and one for the result), each one 16 bytes, so 48 bytes total. (At least, that's the approach the current f16 support takes.)

array-copy will add a cons cell for each element, so another 24 bytes on a 64-bit machine (12 bytes on a 32-bit machine). This cons cell will exist for the period that the list exists, as well as one 16-byte boxed flonum.

Then you need another byte for the actual storage in the f8 array.

So array-copy has less than 50% overhead in this example assuming a particular implementation.

I see this SRFI as useful for prototyping, algorithm design, and for actually slicing and dicing arrays in production code.

Production code operations on large arrays may be best done using an FFI.

If this brief note doesn't address your comments, I apologize.

ashinn commented 1 year ago

Sorry, I meant (array-copy (array-map + a b) f8-storage-class). There will be some GC pressure in either case for boxing and unboxing the floats, but the live temporary storage using your calculations will be 1 float + 1 cons cell per element (4 words or 32 bytes on a 64-bit machine) for array-copy and 0 for array-copy!. The result size will be 1 byte per element.

So array-copy here actually uses 49x more storage, not 50%. That's huge.

To be clear, I'm not arguing against providing the call/cc safe versions, I just think they are an unusual novelty and should not be the default for arrays, but I should probably take that discussion to the SRFI list.

gambiteer commented 1 year ago

Here's an experiment with 10,000 x 10,000 arrays:

heine:~> gsi
Gambit v4.9.4-176-g2a0917c6

> (import (srfi 231))
> (define domain (make-interval '#(10000 10000)))
> (define a (make-specialized-array domain f16-storage-class 1.))
> (define b (make-specialized-array domain f16-storage-class 2.))
> (define c (time (array-copy! (array-map + a b) f16-storage-class)))
(time (array-copy! (array-map + a b) f16-storage-class))
    9.003968 secs real time
    8.998404 secs cpu time (8.836191 user, 0.162213 system)
    16 collections accounting for 0.009432 secs real time (0.006439 user, 0.003015 system)
    9800001296 bytes allocated
    195328 minor faults
    1 major fault
    32341340574 cpu cycles
> (define c (time (array-copy (array-map + a b) f16-storage-class))) 
(time (array-copy (array-map + a b) f16-storage-class))
    14.549698 secs real time
    14.536582 secs cpu time (13.037946 user, 1.498636 system)
    6 collections accounting for 4.884961 secs real time (4.544420 user, 0.335090 system)
    14600000960 bytes allocated
    2048142 minor faults
    no major faults
    52268853249 cpu cycles
> (define d (time (array-copy! (array-map + a b) f16-storage-class)))
(time (array-copy! (array-map + a b) f16-storage-class))
    10.959820 secs real time
    10.957464 secs cpu time (10.439781 user, 0.517683 system)
    10 collections accounting for 1.720386 secs real time (1.360254 user, 0.359752 system)
    9600000624 bytes allocated
    313173 minor faults
    no major faults
    39378044310 cpu cycles
> (define c (time (array-copy (array-map fl+ a b) f16-storage-class)))
(time (array-copy (array-map fl+ a b) f16-storage-class))
    13.575337 secs real time
    13.567943 secs cpu time (12.204591 user, 1.363352 system)
    5 collections accounting for 4.030629 secs real time (3.747765 user, 0.279555 system)
    14600000960 bytes allocated
    2026152 minor faults
    no major faults
    48768363077 cpu cycles
> (define d (time (array-copy! (array-map fl+ a b) f16-storage-class)))
(time (array-copy! (array-map fl+ a b) f16-storage-class))
    10.028667 secs real time
    10.023165 secs cpu time (9.207294 user, 0.815871 system)
    5 collections accounting for 0.472502 secs real time (0.155962 user, 0.315847 system)
    9600000624 bytes allocated
    635909 minor faults
    no major faults
    36028276853 cpu cycles

That's a 50% premium in CPU time. I don't know what else to say.

ashinn commented 1 year ago

I'm mostly talking about space, not time. If I want to use SRFI 231 like I use Pandas, I simply can't use array-copy, I won't have enough memory. I have to litter my code with !s everywhere even though I'm not using mutation.

ashinn commented 1 year ago

To be clear, if I'm using f8 or f16 it's because I have a lot of data, and am generally talking about 100s of millions of elements, not 10,000, and I'd use billions if I had more memory.

gambiteer commented 1 year ago

The example used 100,000,000 elements.

Here's an example using u16s, so no boxing of intermediate values and no conversion routines:

> (define A (time (make-specialized-array domain u16-storage-class 1)))
(time (make-specialized-array domain u16-storage-class 1))
    0.086659 secs real time
    0.086585 secs cpu time (0.041063 user, 0.045522 system)
    no collections
    200000448 bytes allocated
    48829 minor faults
    1 major fault
    311235541 cpu cycles
> (define B (time (make-specialized-array domain u16-storage-class 2)))
(time (make-specialized-array domain u16-storage-class 2))
    0.101584 secs real time
    0.101420 secs cpu time (0.061653 user, 0.039767 system)
    1 collection accounting for 0.018652 secs real time (0.006745 user, 0.011883 system)
    -16 bytes allocated
    49049 minor faults
    no major faults
    364839928 cpu cycles
> (define C (time (array-copy (array-map fx+ A B) u16-storage-class)))
(time (array-copy (array-map fx+ A B) u16-storage-class))
    5.493697 secs real time
    5.489738 secs cpu time (4.882070 user, 0.607668 system)
    2 collections accounting for 1.354264 secs real time (1.125221 user, 0.227984 system)
    5000000928 bytes allocated
    824737 minor faults
    no major faults
    19733075820 cpu cycles
> (define D (time (array-copy! (array-map fx+ A B) u16-storage-class)))
(time (array-copy! (array-map fx+ A B) u16-storage-class))
    3.939890 secs real time
    3.938633 secs cpu time (3.902500 user, 0.036133 system)
    no collections
    200001056 bytes allocated
    48829 minor faults
    no major faults
    14151177527 cpu cycles
> (/ 5000000928. 200001056)
24.999872640672457

So 25 times the storage allocated, and a 50% time cost.

ashinn commented 1 year ago

Oh, sorry, yes I guess I can handle 100 million elements on my machine, but the argument is the same, it's just a question of where the limit is. On a more limited machine, or at billions or tens of billions array-copy! is the only usable option.

mnieper commented 1 year ago

The best implementation strategy for procedures like vector-map depends undoubtedly on the system. Gambit's version assumes a fast stack, and that stack memory is virtually unlimited (as the general heap).

A fast and memory-efficient implementation strategy that should work for all implementations is to set a dynamic flag when a procedure like vector-map is entered and then start with the impure, most memory-efficient version. When call/cc is evaluated, the system checks the flag and, if it is present, calls a handler (which would be installed by vector-map). That handler would switch to a pure version of vector-map and proceed with capturing the continuation. (This needs access to the system internals of call/cc much like an implementation of dynamic-wind does.) A more fine-grained algorithm for systems with delimited continuations would be to add a continuation mark (instead of setting a dynamic flag) and to invoke the handler only if the stack segment captured to be captured by the continuation will contain the continuation mark.

(I would like to hear your suggestions about an API that would make such things possible in portable code.)

Until this is implemented, what is the problem of using array-copy! instead of array-copy when you don't need the purity guarantees of array-copy?

gambiteer commented 1 year ago

After doing more experiments, looking at space and time usage, and thinking about the different kind of applications an array SRFI might support, I'm more comfortable with array-copy, etc., being the default (as in not requiring some special ! notation) procedures.

I'd be OK with people processing arrays so large that call/cc-safe procedures will require memory too large to fit in their machines needing to look up the available options of the non-call/cc-safe procedures to run their applications.

Just my opinion.

gambiteer commented 1 year ago

Here's a comparison between the two proposed implementations of vector-map, which give different answers. To be honest, I don't know which approach is "correct".

heine:~/lang/scheme/srfi-231/srfi-231-temp> cat vector-map-test.scm
(let* ((cont1 #f)
       (cont2 #f)
       (i 3)
       (f (lambda (x)
            (call-with-current-continuation
             (lambda (c)
               (case x
                 ((1) (set! cont1 c))
                 ((3) (set! cont2 c))
                 (else #f))
               x))))
       (vector-map
        (lambda (proc vec)
          (let ((res (make-vector (vector-length vec))))
            (do ((i (- (vector-length vec) 1) (- i 1)))
                ((negative? i) (vector-copy res))
              (vector-set! res i (proc (vector-ref vec i)))))))
       (arg '#(0 1 2 3))
       (result-list '()))
  (let ((temp (vector-map f arg)))
    (set! result-list (cons temp result-list)))
  (case i
    ((3) (set! i (- i 1)) (cont2 10))
    ((2) (set! i (- i 1)) (cont1 20))
    (else #t))
  (pp result-list))

(let* ((cont1 #f)
       (cont2 #f)
       (i 3)
       (f (lambda (x)
            (call-with-current-continuation
             (lambda (c)
               (case x
                 ((1) (set! cont1 c))
                 ((3) (set! cont2 c))
                 (else #f))
               x))))
       (vector-map
        (lambda (proc input-vect)

          (define (vmap-1 proc input-vect i)
            (if (< i (vector-length input-vect))
                (let* ((result (proc (vector-ref input-vect i)))
                       (output-vect (vmap-1 proc input-vect (+ i 1))))
                  (vector-set! output-vect i result)
                  output-vect)
                (make-vector i)))

          (vmap-1 proc input-vect 0)))
       (arg '#(0 1 2 3))
       (result-list '()))
  (let ((temp (vector-map f arg)))
    (set! result-list (cons temp result-list)))
  (case i
    ((3) (set! i (- i 1)) (cont2 10))
    ((2) (set! i (- i 1)) (cont1 20))
    (else #t))
  (pp result-list))
heine:~/lang/scheme/srfi-231/srfi-231-temp> gsi vector-map-test.scm
(#(0 20 2 10) #(0 1 2 10) #(0 1 2 3))
(#(0 20 2 3) #(0 1 2 10) #(0 1 2 3))
mnieper commented 1 year ago

For each continuation captured, note the elements that have already been generated with this continuation. When this continuation is later reinstated, these elements must not be generated anew and must appear in the result of vector-map (in this continuation).

(This is what the prose of the definition of vector-map says; continuations in prose are expressed by "then" or even more implicit temporal ordering; mutation of locations, on the other hand, are always explicitly noted in the reports.)

The "Feeley" version of vector-map obviously fulfills the condition given in the first paragraph because a captured continuations will also include all the elements previously generated (and the information on which elements have already been generated) because this information is in the continuation frames captured.

APIPLM commented 1 year ago

vector-map is so rough in the related array SRFIs.I would rather think that it is the operation between array and array, like the matrix plus, multiplication.and then discuss about how call/cc is applied for the safety reason.

gambiteer commented 3 months ago

Just as a record, I include here test code for interactions of continuations and the various "continuation-safe" array constructors.

These pass on Gambit's implementation and fail on Chibi's. I understand that it's not a priority to change Chibi's implementation to make these pass, but I want to put the tests here all the same.

(test-group
 "interaction of array creators and continuations"

 (let* ((cont #f)
        (call-cont #t)
        (domain (make-interval '#(2 2)))
        (A_ (lambda (i j)
              (call-with-current-continuation
               (lambda (c)
                 (if (= i j 0)
                     (set! cont c))
                 1))))
        (A (make-array domain A_))
        (array-list '()))
   (let ((temp (array-copy A)))
     (set! array-list (cons temp array-list)))
   (if call-cont
       (begin
         (set! call-cont #f)
         (cont 4)))
   (for-each (lambda (result truth)
               (test truth
                     (array->list* result)))
             array-list
             '(((4 1) (1 1))
               ((1 1) (1 1)))))

 (let* ((cont #f)
        (call-cont #t)
        (domain (make-interval '#(2 2)))
        (B (list*->array 2 '((1 2) (3 4))))
        (A_ (lambda (i j)
              (call-with-current-continuation
               (lambda (c)
                 (if (= i j 0)
                     (set! cont c))
                 1))))
        (A (make-array domain A_))
        (array-list '()))
   (let ((temp (array-append 1 (list A B))))
     (set! array-list (cons temp array-list)))
   (if call-cont
       (begin
         (set! call-cont #f)
         (cont 4)))
   (for-each (lambda (result truth)
               (test truth
                     (array->list* result)))
             array-list
             '(((4 1 1 2) (1 1 3 4))
               ((1 1 1 2) (1 1 3 4)))))

 (let* ((cont #f)
        (call-cont #t)
        (domain (make-interval '#(2 2)))
        (B (list*->array 2 '((1 2) (3 4))))
        (A_ (lambda (i j)
              (call-with-current-continuation
               (lambda (c)
                 (if (= i j 0)
                     (set! cont c))
                 1))))
        (A (make-array domain A_))
        (array-list '()))
   (let ((temp (array-stack 1 (list A B))))
     (set! array-list (cons temp array-list)))
   (if call-cont
       (begin
         (set! call-cont #f)
         (cont 4)))
   (for-each (lambda (result truth)
               (test truth
                     (array->list* result)))
             array-list
             '((((4 1) (1 2)) ((1 1) (3 4)))
               (((1 1) (1 2)) ((1 1) (3 4))))))

 (let* ((cont #f)
        (call-cont #t)
        (domain (make-interval '#(2 2)))
        (B (list*->array 2 '((1 2) (3 4))))
        (A_ (lambda (i j)
              (call-with-current-continuation
               (lambda (c)
                 (if (= i j 0)
                     (set! cont c))
                 1))))
        (A (make-array domain A_))
        (C (list*->array 2 (list (list A B))))
        (array-list '()))
   (let ((temp (array-block C)))
     (set! array-list (cons temp array-list)))
   (if call-cont
       (begin
         (set! call-cont #f)
         (cont 4)))
   (for-each (lambda (result truth)
               (test truth
                     (array->list* result)))
             array-list
             '(((4 1 1 2) (1 1 3 4))
               ((1 1 1 2) (1 1 3 4)))))

 (let* ((cont #f)
        (call-cont #t)
        (domain (make-interval '#(2 2)))
        (B (list*->array 2 '((1 2) (3 4))))
        (A_ (lambda (i j)
              (call-with-current-continuation
               (lambda (c)
                 (if (= i j 0)
                     (set! cont c))
                 1))))
        (A (make-array domain A_))
        (C (list*->array 1 (list A B)))
        (array-list '()))
   (let ((temp (array-decurry C)))
     (set! array-list (cons temp array-list)))
   (if call-cont
       (begin
         (set! call-cont #f)
         (cont 4)))
   (for-each (lambda (result truth)
               (test truth
                     (array->list* result)))
             array-list
             '((((4 1) (1 1)) ((1 2) (3 4)))
               (((1 1) (1 1)) ((1 2) (3 4))))))

 (let* ((cont #f)
        (call-cont #t)
        (domain (make-interval '#(2 2)))
        (A_ (lambda (i j)
              (call-with-current-continuation
               (lambda (c)
                 (if (= i j 0)
                     (set! cont c))
                 1))))
        (A (make-array domain A_))
        (array-list '()))
   (let ((temp (array-copy! A)))
     (set! array-list (cons temp array-list)))
   (if call-cont
       (begin
         (set! call-cont #f)
         (cont 4))
       (test-assert array-list)))

 (let* ((cont #f)
        (call-cont #t)
        (domain (make-interval '#(2 2)))
        (B (list*->array 2 '((1 2) (3 4))))
        (A_ (lambda (i j)
              (call-with-current-continuation
               (lambda (c)
                 (if (= i j 0)
                     (set! cont c))
                 1))))
        (A (make-array domain A_))
        (array-list '()))
   (let ((temp (array-append! 1 (list A B))))
     (set! array-list (cons temp array-list)))
   (if call-cont
       (begin
         (set! call-cont #f)
         (cont 4))
       (test-assert array-list)))

 ;; And now test that at least screwing around with continuations
 ;; doesn't crash the ! procedures.

 (let* ((cont #f)
        (call-cont #t)
        (domain (make-interval '#(2 2)))
        (B (list*->array 2 '((1 2) (3 4))))
        (A_ (lambda (i j)
              (call-with-current-continuation
               (lambda (c)
                 (if (= i j 0)
                     (set! cont c))
                 1))))
        (A (make-array domain A_))
        (array-list '()))
   (let ((temp (array-stack! 1 (list A B))))
     (set! array-list (cons temp array-list)))
   (if call-cont
       (begin
         (set! call-cont #f)
         (cont 4))
       (test-assert array-list)))

 (let* ((cont #f)
        (call-cont #t)
        (domain (make-interval '#(2 2)))
        (B (list*->array 2 '((1 2) (3 4))))
        (A_ (lambda (i j)
              (call-with-current-continuation
               (lambda (c)
                 (if (= i j 0)
                     (set! cont c))
                 1))))
        (A (make-array domain A_))
        (C (list*->array 2 (list (list A B))))
        (array-list '()))
   (let ((temp (array-block! C)))
     (set! array-list (cons temp array-list)))
   (if call-cont
       (begin
         (set! call-cont #f)
         (cont 4))
       (test-assert array-list)))

 (let* ((cont #f)
        (call-cont #t)
        (domain (make-interval '#(2 2)))
        (B (list*->array 2 '((1 2) (3 4))))
        (A_ (lambda (i j)
              (call-with-current-continuation
               (lambda (c)
                 (if (= i j 0)
                     (set! cont c))
                 1))))
        (A (make-array domain A_))
        (C (list*->array 1 (list A B)))
        (array-list '()))
   (let ((temp (array-decurry! C)))
     (set! array-list (cons temp array-list)))
   (if call-cont
       (begin
         (set! call-cont #f)
         (cont 4))
       (test-assert array-list)))

 (let* ((cont #f)
        (call-cont #t)
        (domain (make-interval '#(2 2)))
        (B (list*->array 2 '((1 2) (3 4))))
        (A_ (lambda (i j)
              (call-with-current-continuation
               (lambda (c)
                 (if (= i j 0)
                     (set! cont c))
                 1))))
        (A (make-array domain A_)))
   (array-assign! B A)
   (if call-cont
       (begin
         (set! call-cont #f)
         (cont 4))
       (test-assert B)))

)