BYU-PRISM / GEKKO

GEKKO Python for Machine Learning and Dynamic Optimization
https://machinelearning.byu.edu

GEKKO and parallelization #61

Closed wingmanzz closed 3 weeks ago

wingmanzz commented 5 years ago

Hello,

I've just started using GEKKO and so far so good, but I do have an issue: I have a parallel process that calls my GEKKO solver. After looking at the output from several runs, it was obvious that the solutions reported by the parallel jobs were NOT stable. A run may produce anywhere from 25-50 of the 'correct' solutions. If I make the jobs sequential again, it works perfectly and produces all 100+ solutions submitted to the solver.

After looking at the docs, I am assuming this might be because the backend APMonitor is writing to a file, and each of the parallel jobs overwrites that file before and during the solves of the other jobs. Does anyone have a suggestion for making GEKKO parallel-safe?

WORKS:

pool = Pool(processes=1)
# pool.apply blocks and returns the result directly (no .get() needed)
results = [pool.apply(func,
                      args=(row['factor1'], row['factor2']))
           for index, row in df.iterrows()]

DOES NOT WORK:

pool = Pool(processes=max_jobs)
results = [pool.apply_async(func, callback=handle_result,
                            args=(row['factor1'], row['factor2']))
           for index, row in df.iterrows()]

Where func calls a GEKKO 'solver'.

APMonitor commented 5 years ago

If you would like to parallelize, I recommend creating a separate GEKKO instance for each job with m=GEKKO(remote=False). One advantage of reusing a prior run directory is that GEKKO attempts to restart from the prior solution, so there can be some speed improvement from the warm start.

You can clean up the local files by calling m.cleanup() after the instance has completed. I recommend the cleanup so that you don't fill up your hard drive with temp files on very large jobs. For small jobs this isn't a problem because the temp files are regularly removed with Disk Cleanup; however, if you are running many millions of parallel jobs, it could be an issue.
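A minimal sketch of that pattern applied to the Pool example above; the model contents here are placeholders, and the point is only the remote=False instance per worker plus the cleanup() call:

from gekko import GEKKO

def func(factor1, factor2):
    # Each worker builds its own local GEKKO instance, so the
    # parallel jobs never share a server-side directory
    m = GEKKO(remote=False)
    x = m.Var(1, lb=0)
    m.Equation(x**2 >= factor1)
    m.Obj(factor2 * x)
    m.solve(disp=False)
    obj = m.options.objfcnval
    m.cleanup()  # remove this instance's temp files
    return obj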

Here is another example of multi-threading with Python:

import numpy as np
import threading
import time, random
from gekko import GEKKO

class ThreadClass(threading.Thread):
    def __init__(self, id, server, ai, bi):
        s = self
        s.id = id
        s.server = server
        s.m = GEKKO(server=server)  # use the selected server
        s.a = ai
        s.b = bi
        s.objective = float('NaN')

        # initialize variables
        s.m.x1 = s.m.Var(1,lb=1,ub=5)
        s.m.x2 = s.m.Var(5,lb=1,ub=5)
        s.m.x3 = s.m.Var(5,lb=1,ub=5)
        s.m.x4 = s.m.Var(1,lb=1,ub=5)

        # Equations
        s.m.Equation(s.m.x1*s.m.x2*s.m.x3*s.m.x4>=s.a)
        s.m.Equation(s.m.x1**2+s.m.x2**2+s.m.x3**2+s.m.x4**2==s.b)

        # Objective
        s.m.Obj(s.m.x1*s.m.x4*(s.m.x1+s.m.x2+s.m.x3)+s.m.x3)

        # Set global options
        s.m.options.IMODE = 3 # steady state optimization
        s.m.options.SOLVER = 1 # APOPT solver

        threading.Thread.__init__(s)

    def run(self):

        # Don't overload server by executing all scripts at once
        sleep_time = random.random()
        time.sleep(sleep_time)

        print('Running application ' + str(self.id) + '\n')

        # Solve
        self.m.solve(disp=False)

        # Results
        #print('')
        #print('Results')
        #print('x1: ' + str(self.m.x1.value))
        #print('x2: ' + str(self.m.x2.value))
        #print('x3: ' + str(self.m.x3.value))
        #print('x4: ' + str(self.m.x4.value))

        # Retrieve objective if successful
        if (self.m.options.APPSTATUS==1):
            self.objective = self.m.options.objfcnval
        else:
            self.objective = float('NaN')
        self.m.cleanup()

# Select server
server = 'https://byu.apmonitor.com'

# Optimize at mesh points
x = np.arange(20.0, 30.0, 2.0)
y = np.arange(30.0, 50.0, 2.0)
a, b = np.meshgrid(x, y)

# Array of threads
threads = []

# Calculate objective at all meshgrid points

# Load applications
id = 0
for i in range(a.shape[0]):
    for j in range(b.shape[1]):
        # Create new thread
        threads.append(ThreadClass(id, server, a[i,j], b[i,j]))
        # Increment ID
        id += 1

# Run applications simultaneously as multiple threads
# Max number of threads to run at once
max_threads = 8
for t in threads:
    while (threading.active_count() > max_threads):
        # check for additional threads every 0.01 sec
        time.sleep(0.01)
    # start the thread
    t.start()

# Check for completion
mt = 3.0 # max time
it = 0.0 # incrementing time
st = 1.0 # sleep time
while (threading.active_count() >= 1):
    time.sleep(st)
    it = it + st
    print('Active Threads: ' + str(threading.active_count()))
    # Terminate after max time
    if (it>=mt):
        break

# Wait for all threads to complete
#for t in threads:
#    t.join()
#print('Threads complete')

# Initialize array for objective
obj = np.empty_like(a)

# Retrieve objective results
id = 0
for i in range(a.shape[0]):
    for j in range(b.shape[1]):
        obj[i,j] = threads[id].objective
        id += 1

# plot 3D figure of results
from mpl_toolkits.mplot3d import Axes3D
import matplotlib.pyplot as plt
from matplotlib import cm

fig = plt.figure()
ax = fig.add_subplot(projection='3d')  # fig.gca(projection='3d') is removed in newer matplotlib
surf = ax.plot_surface(a, b, obj,
                       rstride=1, cstride=1, cmap=cm.coolwarm,
                       vmin=12, vmax=22, linewidth=0, antialiased=False)
ax.set_xlabel('a')
ax.set_ylabel('b')
ax.set_zlabel('obj')
ax.set_title('Multi-Threaded GEKKO')
plt.show()

See https://apmonitor.com/me575/index.php/Main/ParallelComputing

(Figure: parallel_gekko, the 3D surface of objective values from the multi-threaded run)

loganbeal commented 5 years ago

GEKKO should create unique directories for each model, even in parallel. If you're solving the problems remotely, the issue is likely overlapping directories on the server side. The server creates a directory based on model name and IP address. GEKKO increments default model names with a global model-instance counter to avoid this problem, but parallel processes probably don't share the same counter, so you get overlapping model names. Solutions are to solve locally, as @APMonitor suggested, or to force the model names to be unique (see the sketch below).
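For the second option, a minimal sketch of forcing unique names, assuming the name argument of the GEKKO constructor; the uuid suffix is just one way to guarantee uniqueness:

import uuid
from gekko import GEKKO

# A unique model name gives each parallel job its own
# directory on the remote server
m = GEKKO(remote=True, name='model' + uuid.uuid4().hex)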

@APMonitor, the solution to Issue #30 would resolve this problem too.

wingmanzz commented 5 years ago

Thanks for everyone's help on this -- I was able to get it mostly working. However, there may be one more issue I'd like a little insight on, if possible.

I am now running GEKKO with remote=False, and this spawns an apm monitor subprocess that is not stopped until the entire program exits, I think. This causes the check for active child processes to always return > 1.

I have tried a 'del' on the GEKKO object, but it does not clean up the subprocess. Might it be best if the cleanup() method of the GEKKO class killed its corresponding apm process? If not, how would you recommend killing the apm subprocess if your parent process is long-running?

APMonitor commented 5 years ago

One option is to run an OS command to kill any apm.exe processes for Windows:

taskkill /F /IM apm.exe

or from Python with

import os
os.system("taskkill /f /im apm.exe")

Does this help? I suppose that we could have a separate function such as m.kill() that would do this by PID, or m.kill_all() for stopping all apm.exe processes. We'd likely need to identify the OS and use different system commands for Windows, MacOS, or Linux; a sketch is below. Thoughts?
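A minimal sketch of what such an OS-aware helper might look like; the function name and the pkill pattern are assumptions, not part of GEKKO:

import os
import platform

def kill_all_apm():
    # Hypothetical helper: terminate any lingering apm solver processes
    if platform.system() == 'Windows':
        os.system('taskkill /f /im apm.exe')
    else:
        # Linux / MacOS: pkill -f matches the full command line, so 'apm'
        # may need a tighter pattern to avoid unrelated processes
        os.system('pkill -f apm')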

loganbeal commented 5 years ago

@APMonitor, I bet the f2py dll/so alternative we discussed would avoid problems like this that come from calling exe subprocesses.

abe-mart commented 5 years ago

One potential disadvantage of using f2py, if I understand it correctly, is that the Fortran code would have to be compiled specifically for the user's version of Python, so you would either have to compile the Fortran during the installation process or maintain a large library of binary files.


loganbeal commented 5 years ago

@abe-mart there is some dependence on version, but I didn't think it was that extreme. NumPy does this, and I think they figured out how to have pip deliver only the relevant files to the user's system, but I never figured that out myself.

APMonitor commented 4 years ago

@abe-mart I compile 6 binaries right now (Windows, Linux, MacOS, and Linux ARM with the public server, plus local versions for Windows and MacOS with different solver options that can be released). You are correct that I'd need to produce another set for the library option.

@loganbeal I like the option of downloading only the relevant binaries so that the pip install doesn't get too big. Right now it is about 10MB and the extra so/dll files would likely take it to 20MB.

@wingmanzz did it help to use the kill option? On starting the APM process, we could also keep track of the PID in the OS and kill just that process if you need more fine-grained control.

wingmanzz commented 4 years ago

Yes, the kill did work -- but it also turned out I was running into a deadlock with lines 1824-1831 in gekko.py:

                if debug >= 1:
                    # Start recording output if error is detected
                    if '@error' in line:
                        record_error = True
                    if record_error:
                        apm_error+=line
                app.wait()

This particular section caused a deadlock (and the apm processes never quit) when I run processes in parallel on a local server. For the time being, I have just commented it out, but I think it has something to do with the pipe-buffer deadlock problem mentioned here:

https://stackoverflow.com/questions/2381751/can-someone-explain-pipe-buffer-deadlock

APMonitor commented 4 years ago

Thanks for pointing out this issue. I've also seen the deadlock when running locally, even without parallelization, so it is nice to have a lead on tracking down this problem. You can also avoid it by calling m.solve(debug=0), which skips the locking section. Here is some additional documentation that I found:

https://docs.python.org/3.7/library/subprocess.html?highlight=subprocess#popen-objects

Note: This will deadlock when using stdout=PIPE or stderr=PIPE and the child process generates enough output to a pipe such that it blocks waiting for the OS pipe buffer to accept more data. Use Popen.communicate() when using pipes to avoid that.

            app = subprocess.Popen([apm_exe, self._model_name], stdout=subprocess.PIPE, \
                                   stderr=subprocess.PIPE,cwd = self._path, \
                                   env = {"PATH" : self._path }, universal_newlines=True)

            for line in iter(app.stdout.readline, ""):
                if disp == True:
                    try:
                        print(line.replace('\n', ''))
                    except:
                        pass
                else:
                    pass
                if debug >= 1:
                    # Start recording output if error is detected
                    if '@error' in line:
                        record_error = True
                    if record_error:
                        apm_error+=line
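                # NOTE: calling app.wait() here, inside the read loop, blocks the
                # parent while the child's stderr pipe can still fill -> deadlock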
                app.wait()
            _, errs = app.communicate()

Here are a few things that I'll try:

  1. Use app.communicate instead of app.stdout.readline (a rough sketch is below) - it wasn't as simple as just replacing it, so I'll need to dig more.
  2. Write the solver output to a file instead of using PIPES. If we want to preserve the line-by-line display, I could read the file repeatedly but this would increase hard-drive I/O and potentially slow down the total solve time.
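A rough sketch of item 1, reusing app, disp, and debug from the snippet above; the output only becomes available after the process exits, so the live line-by-line display is lost:

# communicate() reads both pipes to EOF and then waits for the process,
# so neither side can block on a full OS pipe buffer
outs, errs = app.communicate()
if disp:
    print(outs)
apm_error = ''
record_error = False
for line in outs.splitlines():
    if debug >= 1:
        # Start recording output if an error is detected
        if '@error' in line:
            record_error = True
        if record_error:
            apm_error += line + '\n'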

Any other ideas?

APMonitor commented 3 weeks ago

The deadlock issue is resolved by removing the command-line output for remote=False.