vermaseren / form

The FORM project for symbolic manipulation of very big expressions
GNU General Public License v3.0

Bounding memory usage of (T)FORM via form.set? #369

Closed: magv closed this issue 3 years ago

magv commented 3 years ago

Hi, folks. This is a help request rather than a bug report. When running (T)FORM jobs on shared machines there is always a limit on how much memory can be devoted to them: exceeding it means the job will be killed. Now, the memory usage of FORM can be adjusted by changing form.set parameters: there are however many of them (WorkSpace, ScratchSize, HideSize, LargeSize, SmallSize, MaxTermSize, SortIOSize, etc.), with some being interdependent in non-trivial ways. So, could you recommend a way to fill in form.set? Specifically, if one intends to run tform with N worker threads (between 1 and e.g. 32) and needs to make sure it never uses more than X GB of RAM (between 1 and e.g. 1000), what should one put into form.set?
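
For context, a form.set file is just a plain-text list of such setup parameters, one per line. Purely as an illustration of the format (these values are not a proposal):

WorkSpace   50M
MaxTermSize 200K
ScratchSize 500M
HideSize    500M
SmallSize   10M
LargeSize   100M
SortIOSize  100K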

tueda commented 3 years ago

You can try a Python script formset.py in tueda/formset.

python3 formset.py --help  # print help

python3 formset.py         # print a recommended parameter set

python3 formset.py -o      # write a recommended parameter set to form.set
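
A possible workflow (calc.frm is just a placeholder for your program) would then be:

python3 formset.py -o      # write form.set into the current directory
tform -w8 calc.frm         # (t)form picks up form.set from the current directory; -w8 asks for 8 workers
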
vermaseren commented 3 years ago

There is one caveat. When you use many $-variables, the memory use may slowly increase while the program runs. In that case you need to keep some reserve.

The problem you run into is due to system administrators not reserving swap space on the disk. For batch machines that is the normal way, because severe swapping can mess up other jobs on the same machine. At Nikhef we have some dedicated Form computers, and for those I always keep an amount of swap space that is more or less equal to the amount of physical memory. Because parts of the allocated memory are used only rarely, this does not slow down the program, but it avoids such memory crashes, unless you try to allocate really a lot (many times the amount of physical memory). The result is that I run into memory problems only when I have a severe memory leak, which I then have to repair urgently.
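
On a typical Linux machine, adding such a swap file looks roughly like this (requires root; the 32G size is only an example):

sudo fallocate -l 32G /swapfile   # reserve a 32 GB file for swap
sudo chmod 600 /swapfile          # make it readable by root only
sudo mkswap /swapfile             # format it as swap space
sudo swapon /swapfile             # enable it for the running system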

magv commented 3 years ago

Takahiro, thanks, formset.py seems to be what I'm looking for. Can I ask why you do not increase WorkSpace? When e.g. pySecDec calls FORM, WorkSpace is the only parameter it has the option to vary; clearly the developers thought it was the most important one (and they do vary it quite a bit between the examples). Was that a bad idea on their part?

Jos, it is even worse than that: on the cluster I'm currently targeting, every job is submitted through a special command and a fixed maximum memory size needs to be specified; this is then enforced by making allocations fail beyond the reserved maximum. But lack of swap is also a problem on e.g. development laptops.
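
For what it is worth, such a hard limit can be mimicked locally with bash's ulimit, which caps the virtual address space of subsequent commands (the 30 GB value and the program name are placeholders):

ulimit -v 31457280    # cap virtual memory at 30 GB; the value is in kB
tform -w8 calc.frm    # allocations beyond the cap will now simply fail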

vermaseren commented 3 years ago

Hi Vitaly,

It is the same on our batch systems. You have to specify a memory limit and a time limit. If you go beyond this the job crashes. This seems to be the standard batch system. It was a good argument to get our own Form computers with a different amount of memory, different disks and different SSDs. Of course, you may not be in the position to get your own dedicated computers. Our batch systems are mainly designed for the experimentalists who are running Monte Carlos. That gives completely different requirements. They do not have the intermediate expression swell that is typical for big symbolic calculations. Somehow I managed to set some decent swap space on my Apple laptop, but I forget how I did that. Linux is far more flexible in this respect. I have no idea how this is with Windows.

tueda commented 3 years ago

Can I ask why you do not increase WorkSpace?

I think the performance doesn't change much with WorkSpace (and MaxTermSize etc.) as long as the values are big enough; they just affect whether or not your problem can be solved at all. If your MaxTermSize or WorkSpace is not big enough to handle your expressions, FORM just complains and stops.

You can specify WorkSpace and other parameters, like

python3 formset.py maxtermsize=200K workspace=200M

and then the script sets the other buffer sizes to fit into the rest of the memory (plus reserved space, which can be changed by the -p option).

magv commented 3 years ago

I did some limited tests: the settings formset.py produces do seem to limit RSS correctly. On the other hand, tform reserves more virtual memory than that (in one test it reserved 33.1 GB using the settings that limit memory usage to 30 GB). It would be great if virtual memory usage could be estimated the same way RSS is estimated, but if not, applying a factor of 1.1 is not too bad. Anyway, thanks for the help.
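
For the record, one way to compare the two numbers (calc.frm is a placeholder): GNU time reports the peak RSS after the run, while VIRT has to be watched in htop or ps while the job runs.

/usr/bin/time -v tform -w8 calc.frm   # "Maximum resident set size (kbytes)" is the peak RSS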

jodavies commented 3 years ago

In my experience a factor of 1.1 is far too small if you need to limit VIRT; more like a factor of 2 is necessary. Even with a script like this it can be very hard to configure the buffer sizes. For example, with a form.set generated by -n 8 -p 100 --total-memory 28G --total-cpus 8, the following script will completely blow through 32 GB of RAM and 8 GB of swap before my OS kills the process, at which point VIRT is over 90 GB according to htop.

#-
On fewerstats 0;

CFunction f;
Symbol x;

#define NTERMS "500000000"

Local test1 = <f(1)>+...+<f(`NTERMS')>;
.sort

* Cancel all terms, but keep distance so that most terms only cancel in the final sort
Identify f(x?) = f(x) - f(`NTERMS'-x+1);

Print +s;
.end

EDIT: actually, swapping ... for sum_ allows this test to run (a sketch is below). I should re-run some of my tests... It seems one should not abuse the preprocessor to that extent. Also, use of dollar variables, argtoextrasymbol, polyratfun, etc. means your memory use is essentially unbounded, depending on what exactly your script does. Clusters which enforce a strict virtual memory limit are not easy to work with indeed...
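
For reference, an (untested) sketch of the sum_ variant: only the Local statement changes, and the terms are then generated at run time rather than expanded by the preprocessor.

Local test1 = sum_(x,1,`NTERMS',f(x));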

tueda commented 3 years ago

The script calculates the (fixed) initial buffer sizes, so it can't account for dynamically allocated buffers, like compiler buffers (<f(1)>+...+<f(`NTERMS')> in your case), $-variables, polynomials, etc.