Clozure / ccl

Clozure Common Lisp
http://ccl.clozure.com
Apache License 2.0
855 stars 103 forks source link

virtual memory and static cons space #25

Open ragerdl opened 7 years ago

ragerdl commented 7 years ago

I hope this finds you all well in this time of transition.

Our compute cluster software needs a heuristic to determine when a process is using more RAM than it's been allocated. Since resident size is unreliable as a measure of memory a process is actually using once processes start swapping, the cluster uses virtual memory size as that heuristic. Unfortunately, this means that the compute cluster thinks CCL is using way more memory than it's actually using. We've mitigated this confusion with two mechanics: (1) using the -R flag to reduce heap size and (2) using a script Gary wrote for us to unmalloc what I think is a lot of the static cons space.

Could CCL supply a flag or compile-time option to its users that allows us to allocate significantly less static cons space? Another option, which I think would be more difficult to implement but still might be reasonable, would be to dynamically grow the static cons space as more space is needed (but never allowing the allocated static cons space to shrink). I've included the script GB wrote below the signature.

Thanks, David

(in-package "CCL")

(defun find-area (code)
  (let* ((p (%null-ptr)))
    (%setf-macptr-to-object p (%get-kernel-global all-areas))
    (if (eql (%get-object p target::area.code) code)
      p
      (do* ((q (%get-ptr p target::area.succ) (%get-ptr q target::area.succ)))
           ((eql p q))
        (when (eql (%get-object q target::area.code) code)
          (return q))))))

(defun dynamic-heap-start-address ()
  (%int-to-ptr (ash (%get-kernel-global heap-start) target::fixnumshift)))

(defun purified-end-address ()
  (let* ((a (find-area area-managed-static)))
    (if a
      (%get-ptr a target::area.high)
      (error "can't find the end of purified space"))))

(defun room-for-n-static-conses (n)
  (* (logandc2 (+ 32767 n) 32767) target::cons.size))

(defun target-dynamic-heap-start-address (nstatic-conses)
  (%inc-ptr
   (dynamic-heap-start-address)
   (- (room-for-n-static-conses (- nstatic-conses (reserved-static-conses))))))

(defun unmap-reserved-static-space (nstatic-conses)
  (let* ((start (purified-end-address))
         (end (target-dynamic-heap-start-address nstatic-conses))
         (nbytes (- (%ptr-to-int end) (%ptr-to-int start))))
    (#_munmap start nbytes)))

; for pairing with a heap-reserve of 142GB
; (ccl::unmap-reserved-static-space (expt 2 17))

; for pairing with a heap-reserve of 158GB, -Z 1024000000 --heap-reserve 158G
 (ccl::unmap-reserved-static-space (expt 2 16))

; for pairing with a heap-reserve of 190GB
; (ccl::unmap-reserved-static-space (expt 2 15)) 
--- Want to back this issue? **[Post a bounty on it!](https://www.bountysource.com/issues/42784672-virtual-memory-and-static-cons-space?utm_campaign=plugin&utm_content=tracker%2F27935804&utm_medium=issues&utm_source=github)** We accept bounties via [Bountysource](https://www.bountysource.com/?utm_campaign=plugin&utm_content=tracker%2F27935804&utm_medium=issues&utm_source=github).
xrme commented 7 years ago

Are you running the 1.11 release branch or the 1.12-dev branch? Or both? In other words, what is the output of (lisp-implementation-version)?

ragerdl commented 7 years ago

Happen to currently be on the 1.11 release branch, but I've recently updated the ACL2 stack (in particular, our inclusion of particular versions of quicklisp libraries) to work with the 1.12-dev. So, both. It probably makes the most sense to focus on the 1.12 development version. :)

xrme commented 7 years ago

Do you save images with static cons data? Or do you always create all static cons data as your program runs?

ragerdl commented 7 years ago

@mattkaufmann @solswords @kini do you know whether we recreate the hons tables after starting a save-exec'd image, or do we expect the hons'es to be saved and still there when we restart the image?

@xrme, I think the answers are "yes" and "no", respectively, but it's worth seeing what the above guys say

kini commented 7 years ago

I think the answers are "no" and "yes", actually. I poked through the call tree of save-exec and didn't see anything about hons tables; furthermore, comments in hons-raw.lisp seem to indicate that we rely on the behavior that static conses survive the process of saving and loading an image.

(Also, you probably meant to tag @MattKaufmann, who can likely give a more definitive answer.)

To @xrme: basically ACL2 provides its users with the ability to save images at any time, including after static conses have been allocated (which is the aforementioned save-exec function). @ragerdl thought it was possible that we had some code to deallocate static conses before letting the user save an image, and then some more code to reinstate the static conses after the image was loaded. I don't see any such code, so the direct answer to your question (I think) is "yes, we do save images with static cons data in them".

kini commented 7 years ago

Sorry, I think I misunderstood. @ragerdl's '"yes" and "no"' were probably aimed at @xrme's questions, not @ragerdl's own statements in the previous paragraph, in which case I agree. So to be clear:

At least, as far as I can tell.

MattKaufmann commented 7 years ago

It looks like I was mentioned. If you still need comments from me, let me know. I don't know the answers offhand and would have to investigate, so I only want to spend the time if you still care.

MattKaufmann commented 7 years ago

Regarding the question of whether we "save images with static cons data", I think the answer is yes, if I understand the question. That is: I have no recollection, and I see no evidence, that ACL2's save-exec utility frees any static conses to be rebuilt when the new executable is invoked.

The rest of this note is relevant information in case anyone wants to know more about this.

When ACL2 is built on CCL to do honsing using static conses (the default), a hons is simply a static cons. I've looked at relevant code, including the following:

- save-exec (ACL2's interface for saving an executable),
- save-exec-fn (executed by save-exec),
- save-exec-raw (called by save-exec),
- save-acl2-in-ccl (called by save-exec-raw),
- save-acl2-in-ccl-aux (called by save-acl2-in-ccl),
- acl2-default-restart (executed at startup), and (very briefly)
- lp (executed at startup).

Actually, acl2-default-restart explicitly avoids some hons initialization, as follows:

    #+hons (when (not produced-by-save-exec-p)
             (qfuncall acl2h-init))

However, I did find this in acl2-default-restart:

; Growing the sbits array just before si::save-system doesn't seem to avoid
; triggering a call of hl-hspace-grow-sbits when the first static hons is
; created.  So we do the grow here, i.e., after starting ACL2(h).
      #+(and hons static-hons)
      (hl-hspace-grow-sbits (hl-staticp (cons nil nil)) *default-hs*))

The function hl-hspace-grow-sbits messes with an "sbits" array that we use to track honses, but the actual static conses aren't touched.

The function save-acl2-in-ccl-aux concludes with this vanilla call of save-application:

(ccl:save-application core-name)

The log below may be irrelevant, but I include it just to show that static conses are somehow preserved. (I realize that it doesn't prove that they aren't re-created, though as argued above, I'm pretty sure that they're preserved, not re-created.)

~/temp/ccl-static$ acl2
Welcome to Clozure Common Lisp Version 1.12-dev-r16701M-trunk  (DarwinX8664)!

 ACL2 Version 7.4 built June 10, 2017  14:36:10.
 Copyright (C) 2017, Regents of the University of Texas
 ACL2 comes with ABSOLUTELY NO WARRANTY.  This is free software and you
 are welcome to redistribute it under certain conditions.  For details,
 see the LICENSE file distributed with ACL2.

 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 + WARNING: This is NOT an ACL2 release; it is a development snapshot  +
 + (git commit hash: b458e18d7782640d2c2ab04f577e786f2c14f43d).        +
 + The authors of ACL2 consider such distributions to be experimental; +
 + they may be incomplete, fragile, and unable to pass our own         +
 + regression tests.                                                   +
 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

 Initialized with (INITIALIZE-ACL2 'INCLUDE-BOOK *ACL2-PASS-2-FILES*).
 See the documentation topic note-7-4 for recent changes.
 Note: We have modified the prompt in some underlying Lisps to further
 distinguish it from the ACL2 prompt.

ACL2 Version 7.4.  Level 1.  Cbd "/Users/kaufmann/temp/ccl-static/".
System books directory "/Users/kaufmann/acl2/acl2/books/".
Type :help for help.
Type (good-bye) to quit completely out of ACL2.

ACL2 !>(assign a (hons 3 4))
 (3 . 4)
ACL2 !>(assign b (hons 5 6))
 (5 . 6)
ACL2 !>:q

Exiting the ACL2 read-eval-print loop.  To re-enter, execute (LP).
? (ccl::%staticp (f-get-global 'a *the-live-state*))
129
? (ccl::%staticp (f-get-global 'b *the-live-state*))
130
? (save-exec "foo" "temp image")
~/temp/ccl-static$ ./foo
Welcome to Clozure Common Lisp Version 1.12-dev-r16701M-trunk  (DarwinX8664)!

 ACL2 Version 7.4 built June 10, 2017  14:36:10
                   then June 12, 2017  09:18:15.
 Copyright (C) 2017, Regents of the University of Texas
 ACL2 comes with ABSOLUTELY NO WARRANTY.  This is free software and you
 are welcome to redistribute it under certain conditions.  For details,
 see the LICENSE file distributed with ACL2.

 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 + WARNING: This is NOT an ACL2 release; it is a development snapshot  +
 + (git commit hash: b458e18d7782640d2c2ab04f577e786f2c14f43d).        +
 + The authors of ACL2 consider such distributions to be experimental; +
 + they may be incomplete, fragile, and unable to pass our own         +
 + regression tests.                                                   +
 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

 Initialized with (INITIALIZE-ACL2 'INCLUDE-BOOK *ACL2-PASS-2-FILES*).
 See the documentation topic note-7-4 for recent changes.
 Note: We have modified the prompt in some underlying Lisps to further
 distinguish it from the ACL2 prompt.

MODIFICATION NOTICE:

temp image

ACL2 Version 7.4.  Level 1.  Cbd "/Users/kaufmann/temp/ccl-static/".
System books directory "/Users/kaufmann/acl2/acl2/books/".
Type :help for help.
Type (good-bye) to quit completely out of ACL2.

ACL2 !>:q

Exiting the ACL2 read-eval-print loop.  To re-enter, execute (LP).
? (ccl::%staticp (f-get-global 'a *the-live-state*))
129
? (ccl::%staticp (f-get-global 'b *the-live-state*))
130
? (quit)
~/temp/ccl-static$ 
xrme commented 7 years ago

Adding a run-time option like --static-cons-reserve 8g turns out to be not as straightforward as one would like.

This must be why Gary showed you how to upmap a hole between purified data and the static cons area, rather than simply implementing such a run-time option...

I'll write more about this later.