lowleveldesign / process-governor

This application allows you to put various limits on Windows processes.

How can I "reset" the working set limit of a running process? #24

Closed ghost closed 3 years ago

ghost commented 3 years ago

Problem

I am trying to use process-governor to temporarily limit the memory that certain Python processes can use on a shared Windows server.

I can set the maximum working set in an administrator prompt for a certain PID:

PS > .\procgov64.exe --minws 10M --maxws 50M --pid 28244
Process Governor v2.7.21075.4 - sets limits on your processes
Copyright (C) 2019 Sebastian Solnica (lowleveldesign.org)

CPU affinity mask:                      (not set)
Max CPU rate:                           (not set)
Max bandwidth (B):                      (not set)
Maximum committed memory (MB):          (not set)
Minimum WS memory (MB):                 10
Maximum WS memory (MB):                 50
Preferred NUMA node:                    (not set)
Process user-time execution limit (ms): (not set)
Job user-time execution limit (ms):     (not set)
Clock-time execution limit (ms):        (not set)

Press Ctrl-C to end execution without terminating the process.

In the Task Manager I see that PID 28244 indeed does not use more than 50M of memory. However, now I want to lift this restriction.

What I've tried

Stopping procgov

Pressing CTRL+C in the prompt to end procgov does not lift the working set restriction.

Setting a high maxws limit

Setting a (very) high working set limit also does not lift the restriction:

PS > .\procgov64.exe --minws 10M --maxws 50000M --pid 28244
Process Governor v2.7.21075.4 - sets limits on your processes
Copyright (C) 2019 Sebastian Solnica (lowleveldesign.org)

CPU affinity mask:                      (not set)
Max CPU rate:                           (not set)
Max bandwidth (B):                      (not set)
Maximum committed memory (MB):          (not set)
Minimum WS memory (MB):                 10
Maximum WS memory (MB):                 50,000
Preferred NUMA node:                    (not set)
Process user-time execution limit (ms): (not set)
Job user-time execution limit (ms):     (not set)
Clock-time execution limit (ms):        (not set)

Press Ctrl-C to end execution without terminating the process.

PID 28244 still uses no more than 50M, even though the process itself requires more.

Setting maxmem instead of maxws

This does not work for the Python processes, as they simply stop with an out-of-memory error when they reach the limit.

Context

The Python processes I am trying to manage are Jupyter calculation kernels. During the day, the shared machine these kernels run on is used by multiple people, and a memory limit on the process is necessary; otherwise the RAM fills up and RDP connections to the machine start to drop.
During the night the machine is used by far fewer people, and the same Python process should be allowed to use as much memory as it needs. I am therefore looking to apply and lift the restrictions on a certain PID periodically.

lowleveldesign commented 3 years ago

Thanks for the detailed issue description. Procgov internally uses Windows Job Objects to set process limits. The limits still work after stopping procgov because, on Windows, once you assign a Job Object to a process, you can't remove it. That does not mean we can't lift or update the limits (the second thing you tried). Unfortunately, procgov currently can't do that: it creates a unique Job Object each time you run it, so the previous memory limits take precedence as they are more restrictive. I will look into making the updates work.
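For reference, the mechanism described above looks roughly like this in Python (a minimal sketch assuming the pywin32 package; the PID and job name are made up, and this is not procgov's actual code):

```py
import win32api
import win32con
import win32job

PID = 28244  # hypothetical target PID

# AssignProcessToJobObject requires these access rights on the target process.
hProcess = win32api.OpenProcess(
    win32con.PROCESS_SET_QUOTA | win32con.PROCESS_TERMINATE, False, PID)

# Create a named job object and configure min/max working set limits on it.
hJob = win32job.CreateJobObject(None, "procgov-demo-job")
info = win32job.QueryInformationJobObject(
    hJob, win32job.JobObjectExtendedLimitInformation)
info["BasicLimitInformation"]["LimitFlags"] |= win32job.JOB_OBJECT_LIMIT_WORKINGSET
info["BasicLimitInformation"]["MinimumWorkingSetSize"] = 10 * 1024 * 1024
info["BasicLimitInformation"]["MaximumWorkingSetSize"] = 50 * 1024 * 1024
win32job.SetInformationJobObject(
    hJob, win32job.JobObjectExtendedLimitInformation, info)

# From here on the process stays in the job for the rest of its lifetime;
# Windows offers no API to take it back out.
win32job.AssignProcessToJobObject(hJob, hProcess)
```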

lowleveldesign commented 3 years ago

@ba-tno , please try the 2.8 pre-release version and let me know if the limit updates work for you.

ghost commented 3 years ago

Thanks, I've tested it below.

I've run a Python script several times in the same Jupyter kernel (= same PID) and applied procgov with different --maxws settings to it. I repeated this process with v2.7.21075.4 and with v2.8.21194.5 of procgov64.exe.

The code can be found in the fold below; the results follow it. Unfortunately, it appears that there is no difference between the versions regarding the limit.

Python code

```py
import numpy as np  # For the array
import sys          # For getting the size of an object
import psutil       # For getting the process info
import gc           # For deleting the objects from memory

# Get the process information
# (this stays the same as it runs in a single Jupyter kernel)
process = psutil.Process()
print(f"PID: {process.pid}")

# Set number of loops
N = 15

def populate_memory(N):
    # Create a random 100 row by 10 column array
    arr = np.random.random([100, 10])

    # Print header
    print("iter\tobjmem\twset")

    # Loop N times, appending the array to itself each iteration.
    for i in range(0, N):
        arr = np.append(arr, arr)
        print(f"{i}\t{round(sys.getsizeof(arr)/(1024*1024), 2)}\t{round(process.memory_info().wset/(1024*1024), 2)}")

    # Mark the array for deletion and garbage collect it.
    del arr
    gc.collect()

if __name__ == "__main__":
    ### Run loops
    populate_memory(N)
```

No limits

iter objmem (MB) wset (MB)
0 0.02 120.26
1 0.03 120.26
2 0.06 120.32
3 0.12 120.44
4 0.24 120.68
5 0.49 121.17
6 0.98 122.15
7 1.95 122.21
8 3.91 124.16
9 7.81 128.07
10 15.63 135.88
11 31.25 151.51
12 62.5 182.76
13 125.0 245.26
14 250.0 370.26

maxws set to 50M

PS > .\procgov64.exe --minws 10M --maxws 50M --pid #####

Iteration 14 ended on a working set of 49.39 MB (v2.7) and 48.78 MB (v2.8)

maxws set to 100M

PS > .\procgov64.exe --minws 10M --maxws 100M --pid #####

Iteration 14 ended on a working set of 49.15 MB (v2.7) and 48.90 MB (v2.8)

maxws set to 30M

PS > .\procgov64.exe --minws 10M --maxws 30M --pid #####

Iteration 14 ended on a working set of 29.27 MB (v2.7) and 28.63 MB (v2.8)

procgov ended with CTRL+C

Iteration 14 ended on a working set of 29.90 MB (v2.7) and 28.63 MB (v2.8)

lowleveldesign commented 3 years ago

@ba-tno, thank you again for the great description. I see the problem, and I can reproduce it. I wanted to make the release ASAP, so I ran only a few tests, checking the job object properties reported by Process Explorer.

The problem is that the OpenJobObject function does not find a job object when there are no open handles to it (this issue describes the problem quite well). So when procgov terminates, we lose the ability to reopen the job object, but, interestingly, the limits still apply. It is then possible to create a new job object with the same name, which fooled me in my tests. I am now thinking about duplicating the handle into the target process (as proposed in the linked forum post). I should also merge the limits on update instead of replacing them. To summarize, please stay tuned; I will try to make a new release, maybe even this week.
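The handle-duplication idea can be sketched in Python as well (pywin32 assumed; the PID and job name are hypothetical, and this is not procgov's actual code):

```py
import win32api
import win32con
import win32job

PID = 28244  # hypothetical target PID

# PROCESS_DUP_HANDLE is needed so the process can be the DuplicateHandle target.
hProcess = win32api.OpenProcess(
    win32con.PROCESS_SET_QUOTA | win32con.PROCESS_TERMINATE |
    win32con.PROCESS_DUP_HANDLE, False, PID)
hJob = win32job.CreateJobObject(None, "procgov-demo-job")
win32job.AssignProcessToJobObject(hJob, hProcess)

# Place a copy of the job handle in the target process's handle table. The job
# object then keeps at least one open handle after this program exits, so it
# stays alive and can be found again for limit updates.
win32api.DuplicateHandle(
    win32api.GetCurrentProcess(), hJob,
    hProcess, 0, False, win32con.DUPLICATE_SAME_ACCESS)
```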

lowleveldesign commented 3 years ago

@ba-tno please try the updated release and let me know if it works for you. Procgov should now correctly update the current job limits.
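Conceptually, the update path looks something like this (a sketch, pywin32 assumed; "procgov-demo-job" is a hypothetical name, not procgov's actual naming scheme):

```py
import win32job

# For a named kernel object, CreateJobObject returns a handle to the existing
# object when the name is already in use, so a still-alive job can be reopened.
hJob = win32job.CreateJobObject(None, "procgov-demo-job")
info = win32job.QueryInformationJobObject(
    hJob, win32job.JobObjectExtendedLimitInformation)

# Merge the new working set limit into the existing flags and values instead
# of replacing the whole limit structure.
info["BasicLimitInformation"]["LimitFlags"] |= win32job.JOB_OBJECT_LIMIT_WORKINGSET
info["BasicLimitInformation"]["MaximumWorkingSetSize"] = 100 * 1024 * 1024
win32job.SetInformationJobObject(
    hJob, win32job.JobObjectExtendedLimitInformation, info)
```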

I used the code you published and added a pause (waiting for a keypress) after each iteration. Here is my test:

PS testapps> python .\pythonmem.py
PID: 12884
iter    objmem  wset
0       0.02    26.12
Press enter

Then I ran procgov64.exe --minws 10M --maxws 20M -p 12884, stopped procgov, and continued the Python run:

1       0.03    19.85
Press enter

We can see the WS was limited to 20M. Then I raised the limit to 40M: procgov64.exe --minws 10M --maxws 40M -p 12884:

2       0.06    20.2
Press enter
3       0.12    20.32
Press enter
4       0.24    20.57
Press enter
5       0.49    21.06
Press enter
6       0.98    21.06
Press enter
7       1.95    22.03
Press enter
8       3.91    23.98
Press enter
9       7.81    27.89
Press enter
10      15.63   32.31
Press enter
11      31.25   39.81
Press enter
12      62.5    39.95
Press enter

Then I set the limit to 100M: procgov64.exe --minws 10M --maxws 100M -p 12884

13      125.0   100.0
Press enter
14      250.0   99.6
Press enter

ghost commented 3 years ago

Wonderful! I can confirm that this works in v2.8.21196.2. Thanks for the quick resolution!