sagemath / sage

Main repository of SageMath
https://www.sagemath.org
Other
1.45k stars 481 forks source link

Let PARI handle its own stack #19883

Closed jdemeyer closed 8 years ago

jdemeyer commented 8 years ago

For historical reasons, the PARI stack is actually handled "manually" by Sage. The main reason was to implement auto-resizing of the stack in case a command failed due to a stack overflow. With the introduction of parisizemax, PARI can handle this on its own.

A new patch (submitted to pari-dev) is added to silence PARI's warnings about the stack size.

Depends on #19933

Upstream: Reported upstream. No feedback yet.

CC: @defeo

Component: packages: standard

Author: Jeroen Demeyer

Branch/Commit: edf23e8

Reviewer: Volker Braun

Issue created by migration from https://trac.sagemath.org/ticket/19883

pjbruin commented 8 years ago
comment:2

I started working on this some time ago, but never finished it. In case you are interested, I just pushed my code to u/pbruin/pari_stack.

jdemeyer commented 8 years ago

Branch: u/jdemeyer/let_pari_handle_its_own_stack

jdemeyer commented 8 years ago

New commits:

34c2727Let PARI handle its own stack
jdemeyer commented 8 years ago

Commit: 34c2727

pjbruin commented 8 years ago
comment:6

Two small comments (no time for a complete review):

        sage_free(<void*>pari_mainstack.vbot)
        pari_mainstack.rsize = 0
        pari_mainstack.vsize = 0
        pari_mainstack.bot = 0
        pari_mainstack.vbot = 0
        pari_mainstack.top = 0
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 8 years ago

Changed commit from 34c2727 to 47f985e

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 8 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

47f985eMinor fixes
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 8 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

5bc6f27Typo
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 8 years ago

Changed commit from 47f985e to 5bc6f27

jdemeyer commented 8 years ago

Upstream: Reported upstream. No feedback yet.

jdemeyer commented 8 years ago

Description changed:

--- 
+++ 
@@ -1 +1,3 @@
 For historical reasons, the PARI stack is actually handled "manually" by Sage. The main reason was to implement auto-resizing of the stack in case a command failed due to a stack overflow. With the introduction of `parisizemax`, PARI can handle this on its own.
+
+A new patch (submitted to `pari-dev`) is added to silence PARI's warnings about the stack size.
pjbruin commented 8 years ago
comment:10

The following looks funny:

sage: pari.allocatemem(2^4, 2^64-1)
  ***   Warning: not enough memory, new stack 9223372036854775808
  ***   Warning: not enough memory, new stack 4611686018427387904
  ***   Warning: not enough memory, new stack 2305843009213693952
  ***   Warning: not enough memory, new stack 1152921504606846976
  ***   Warning: not enough memory, new stack 576460752303423488
  ***   Warning: not enough memory, new stack 288230376151711744
  ***   Warning: not enough memory, new stack 144115188075855872
  ***   Warning: not enough memory, new stack 72057594037927936
  ***   Warning: not enough memory, new stack 36028797018963968
  ***   Warning: not enough memory, new stack 18014398509481984
  ***   Warning: not enough memory, new stack 9007199254740992
  ***   Warning: not enough memory, new stack 4503599627370496
  ***   Warning: not enough memory, new stack 2251799813685248
  ***   Warning: not enough memory, new stack 1125899906842624
  ***   Warning: not enough memory, new stack 562949953421312
  ***   Warning: not enough memory, new stack 281474976710656
  ***   Warning: not enough memory, new stack 140737488355328
  ***   Warning: not enough memory, new stack 70368744177664
PARI stack size set to 1024 bytes

You may want to silence these warnings as well. I would also suggest allocatemem() should print both the actual and the maximum stack sizes with silent=False.

pjbruin commented 8 years ago
comment:11

I'm not sure what to think about the strategy to set the maximum PARI stack size to 1/4 of the maximum amount of memory that malloc() can allocate. On a certain system that I use, this causes Sage to show up in ps or top as using 129 gigabytes of virtual memory (and pari.stackmax() returns exactly 128 gigabytes). Of course this is harmless as long as the memory isn't actually used, but it looks/feels somewhat absurd.

jdemeyer commented 8 years ago
comment:12

Replying to @pjbruin:

You may want to silence these warnings as well.

I'm not convinced. This is a case where you get less memory than you explicitly asked for. I think we should really display a warning in that case.

I would also suggest allocatemem() should print both the actual and the maximum stack sizes with silent=False.

Sure.

On a certain system that I use, this causes Sage to show up in ps or top as using 129 gigabytes of virtual memory

I don't think that this is a problem. It's virtual and unused memory, so it shouldn't hurt at all.

I also wouldn't want to hardcode a parisizemax (like in your branch, you use 1G). If you have a machine with lots of RAM, you may want to actually use it.

pjbruin commented 8 years ago
comment:13

Replying to @jdemeyer:

Replying to @pjbruin:

You may want to silence these warnings as well.

I'm not convinced. This is a case where you get less memory than you explicitly asked for. I think we should really display a warning in that case.

It's fine with me either way; it's just that stackwarn.patch made me think you intended to make any memory warnings only show up with DEBUGMEM > 0.

On a certain system that I use, this causes Sage to show up in ps or top as using 129 gigabytes of virtual memory

I don't think that this is a problem. It's virtual and unused memory, so it shouldn't hurt at all.

That is clear; it is more of an aesthetic issue, so to say.

I also wouldn't want to hardcode a parisizemax (like in your branch, you use 1G). If you have a machine with lots of RAM, you may want to actually use it.

Sure, the amount in my branch was just meant to be some arbitrary default until I found a better solution (which I never got around to). I thought about making this configurable, with some reasonable default that would be sufficient for most users.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 8 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

435a4feAdd maximum stack size to message
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 8 years ago

Changed commit from 5bc6f27 to 435a4fe

vbraun commented 8 years ago
comment:15

How is Pari implementing this? In GAP its mmap without MAP_NORESERVE, so even if its just virtual memory it reserves swap. And if you do it often (parallel doctesting comes to mind) your swap is used up and virtual allocations suddenly are forced into ram.

There is sage.misc.memory_info to detect ram / swap.

Gap uses it in sage.interfaces.gap.get_gap_memory_pool_size

jdemeyer commented 8 years ago
comment:16

Replying to @vbraun:

How is Pari implementing this?

static void *
pari_mainstack_malloc(size_t size)
{
  void *b = mmap(NULL, size, PROT_READ|PROT_WRITE,
                             MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE,-1,0);
  return (b == MAP_FAILED) ? NULL: b;
}

The MAP_NORESERVE is indeed important, together with the fact that PARI tries to use a small stack if possible. The enlarged stack is only used when it's really needed.

pjbruin commented 8 years ago
comment:17

Replying to @jdemeyer:

The MAP_NORESERVE is indeed important, together with the fact that PARI tries to use a small stack if possible. The enlarged stack is only used when it's really needed.

Is this flag supported on all systems? The lines just before this function are

#ifndef MAP_NORESERVE
#define MAP_NORESERVE 0
#endif
vbraun commented 8 years ago
comment:18

Not every system has mmap or MAP_NORESERVE; If you don't then you clearly aren't interested in efficient use of memory so its ok to ignore.

jdemeyer commented 8 years ago
comment:19

Replying to @vbraun:

Not every system has mmap or MAP_NORESERVE

mmap() is defined by POSIX, so it should be available on every system where Sage is supported.

jdemeyer commented 8 years ago
comment:20

Replying to @vbraun:

How is Pari implementing this? In GAP its mmap without MAP_NORESERVE, so even if its just virtual memory it reserves swap. And if you do it often (parallel doctesting comes to mind) your swap is used up and virtual allocations suddenly are forced into ram.

I don't quite get what you mean with "reserves swap". It reserves virtual memory, but I don't think that it allocates actual pages on the swap file. That wouldn't make sense for pages which are completely unused (never been written to nor read from). But even then, you want active processes to use RAM, not swap so I don't really see the problem.

Second, what you write is only true if your OS does not overcommit memory. At least Linux overcommits by default, such that the total amount of virtual memory you can allocate is greater than RAM + swap.

So, after thinking about this a bit more, I don't think that MAP_NORESERVE is actually so important. It helps by keeping more virtual memory available, but it doesn't change at all how phyisical memory (i.e. RAM) is used.

vbraun commented 8 years ago
comment:21

Calling mmap without MAP_NORESERVE does reserve first swap and then ram, this is no joke and bit us with gap. Sure overcommit helps but there is a limit to it; Really its just the final warning before the oomkiller strikes. Overcommitting is NOT a happy place to be.

Imho the "try to allocate all memory" approach has a small chance of messing things up badly. Sure it works almost all the time but some day you'll succeed, sbrk almost everything into one process, and then everybody else on the system will have a bad day. It smells of rare and undebuggable random failures.

jdemeyer commented 8 years ago
comment:22

Replying to @vbraun:

Imho the "try to allocate all memory" approach has a small chance of messing things up badly.

I doubt it. Are there any OSes which really mess up when allocating a large amount of unused memory?

vbraun commented 8 years ago
comment:23

Malloc respects overcommit accounting, it essentially does mmap without MAP_NORESERVE (that is one of the implementations in glibc). The OS doesn't mess up, it just does what you ask it to. I.e. overcommit and, quite possibly, kill the next process that wants more memory (or just do something with his previously-malloced memory). And I really don't like my processes being killed because I'm running doctests in the background.

jdemeyer commented 8 years ago
comment:24

Does the OOMkiller really kick in based on virtual memory usage? Then you could end up in a situation where there is plenty of unused RAM but still processes would get killed. It would really surprise me if this is what actually happens.

vbraun commented 8 years ago
comment:25

Just run malloc (without free) in a for-loop, I'm pretty sure that it stops before the entire addressing space is used up.

Thats also a pretty weird use case, a process allocates memory but doesn't use it. Why should the virtual memory system cater to that and do all the extra accounting? If you don't need memory then don't ask for it. And in the extremely special case where you want to implement your own memory manager in user space there is still mmap(MAP_NORESERVE).

jdemeyer commented 8 years ago
comment:26

Replying to @vbraun:

Thats also a pretty weird use case, a process allocates memory but doesn't use it.

Then why do you think that overcommitting was invented in the first place? Because processes often ask for more memory than they actually use.

Anyway, let's stop this discussion and think about how to actually choose how much memory to allocate for the PARI stack.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 8 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

7e4919eUse virtual_memory_limit() to determine size of PARI stack
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 8 years ago

Changed commit from 435a4fe to 7e4919e

vbraun commented 8 years ago
comment:29

looks good to me

vbraun commented 8 years ago

Reviewer: Volker Braun

vbraun commented 8 years ago
comment:30

Pari testsuite fails, e.g. http://build.sagedev.org/release/builders/%20%20fast%20Volker%20MiniMac%20%28OSX%2010.10%20x86_64%29%20incremental/builds/684/steps/compile/logs/pari

jdemeyer commented 8 years ago
comment:31

The problem is that I disabled some warnings by default but the testsuite expects those warnings. Solution: increase debugmem for the PARI testsuite.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 8 years ago

Changed commit from 7e4919e to edf23e8

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 8 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

edf23e8Set debugmem=1 for testsuite
vbraun commented 8 years ago
comment:35

Fails on OSX: http://build.sagedev.org/release/builders/%20%20fast%20Volker%20MiniMac%20%28OSX%2010.10%20x86_64%29%20incremental/builds/686/steps/shell_4/logs/stdio

sage -t --long src/sage/doctest/test.py
**********************************************************************
File "src/sage/doctest/test.py", line 26, in sage.doctest.test
Failed example:
    subprocess.call(["sage", "-t", "--warn-long", "0", "longtime.rst"], **kwds)  # long time
Expected:
    Running doctests...
    Doctesting 1 file.
    sage -t --warn-long 0.0 longtime.rst
    [0 tests, ...s]
    ----------------------------------------------------------------------
    All tests passed!
    ----------------------------------------------------------------------
    ...
    0
Got:
      ***   Warning: not enough memory, new stack 1152921504606846976
      ***   Warning: not enough memory, new stack 576460752303423488
      ***   Warning: not enough memory, new stack 288230376151711744
      ***   Warning: not enough memory, new stack 144115188075855872
      ***   Warning: not enough memory, new stack 72057594037927936
      ***   Warning: not enough memory, new stack 36028797018963968
      ***   Warning: not enough memory, new stack 18014398509481984
      ***   Warning: not enough memory, new stack 9007199254740992
      ***   Warning: not enough memory, new stack 4503599627370496
      ***   Warning: not enough memory, new stack 2251799813685248
      ***   Warning: not enough memory, new stack 1125899906842624
      ***   Warning: not enough memory, new stack 562949953421312
      ***   Warning: not enough memory, new stack 281474976710656
      ***   Warning: not enough memory, new stack 140737488355328
      ***   Warning: not enough memory, new stack 70368744177664
    Running doctests with ID 2016-01-20-12-28-05-86bac659.
    Git branch: develop
    Using --optional=gcc,mpir,python2,sage
    Doctesting 1 file.
    sage -t --warn-long 0.0 longtime.rst
        [0 tests, 0.00 s]
    ----------------------------------------------------------------------
    All tests passed!
    ----------------------------------------------------------------------
    Total time for all tests: 0.9 seconds
        cpu time: 0.0 seconds
        cumulative wall time: 0.0 seconds
    0
**********************************************************************
jdemeyer commented 8 years ago

Dependencies: #19933

vbraun commented 8 years ago

Changed branch from u/jdemeyer/let_pari_handle_its_own_stack to edf23e8