artofscience / SAOR

Sequential Approximate Optimization Repository
GNU General Public License v3.0

Subsolvers #38

Closed. artofscience closed this issue 2 years ago.

artofscience commented 3 years ago

I managed to get the solvers working. Right now we have the reference solver SvanbergIP and three interior-point solvers of our own, namely:

  1. ipx (only x variables) inherits from InteriorPoint
  2. ipxy (x and y variables) inherits from ipx
  3. ipxyz (x, y and z variables) inherits from ipxy

test_square.py compares the different solvers for a simple problem.
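
To make the hierarchy above concrete, here is a minimal sketch of the class layout; the docstrings are my own gloss rather than the repository's actual code:

class InteriorPoint:
    """Shared interior-point machinery (residuals, Newton direction)."""

class ipx(InteriorPoint):
    """Primal-dual solver over the x variables only."""

class ipxy(ipx):
    """Extends ipx with the y variables."""

class ipxyz(ipxy):
    """Extends ipxy with the z variables."""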

@MaxvdKolk

  1. Around the 6th design iteration or so, the ipxy solver starts to deviate from SvanbergIP. Can you try to find where this comes from? (also @Giannis1993)
  2. I gathered the ip* solvers in interior_point.py and used inheritance. Can you look for ways to limit the code duplication, given that ipxy inherits from ipx and ipxyz from ipxy?
  3. Can you profile the code?

MaxvdKolk commented 3 years ago

Sorry for the late reply, I didn't get much time to look into this. I did some basic profiling, but nothing major shows up. The majority of the time seems to be spent in get_newton_direction and get_residual. For the first, there is the Alam computation, which is costly for some solvers, but I didn't see a clear way to simplify those operations, as it just seems to be a dot product that has to be performed.
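
For reference, this is roughly how such a profile can be collected with the standard library; it assumes test_square.py runs the comparison at import time, and profile.out is just an arbitrary dump file:

import cProfile
import pstats

# Run the solver comparison under the profiler and dump the statistics.
cProfile.run("import test_square", "profile.out")

# Show the ten most expensive calls, sorted by cumulative time.
pstats.Stats("profile.out").sort_stats("cumulative").print_stats(10)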

For get_residual (and also at some other places) the evaluations of g, dg and ddg show up as well. I have the impression this is due to the allocations they perform on every call. For instance, for the problem used, Problems.square.py:

def g(self, x):
    return np.array([np.dot(x, x), 1-np.sum(x)])

def dg(self, x):
    return np.array([2*x, -np.ones_like(x)])

def ddg(self, x):
    return np.array([2*np.ones_like(x), np.zeros_like(x)])

Every call returns a new array, although for g this is less of a problem, as it only holds a couple of scalars. A possible improvement here would be to provide an out=array argument, similar to various numpy operations, to allow these functions to write into already allocated arrays:

def dg(self, x, out=None):
    if out is None:
        # Fall back to allocating a fresh array.
        return np.array([2*x, -np.ones_like(x)])
    else:
        # Write the rows into the preallocated array instead.
        out[0], out[1] = 2*x, -1
        return out

Then the corresponding lines within the solvers can be changed to reuse the already allocated arrays on subsequent calls to g, dg and ddg. This probably only influences the timing for larger problems, though.
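
As a rough sketch of the caller side (the names newton_loop, problem and dg_buf are illustrative, not taken from the repository):

import numpy as np

def newton_loop(problem, x, max_iterations=50):
    # Allocate the derivative buffer once, outside the loop
    # (shape: number of responses by number of variables).
    dg_buf = np.empty((2, x.size))
    for _ in range(max_iterations):
        # Every call now writes into the same buffer instead of
        # allocating a fresh array.
        dg_cur = problem.dg(x, out=dg_buf)
        # ... build the Newton system from dg_cur and update x ...
    return x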

aatmdelissen commented 3 years ago

Issue 1: Line 111 of interior_point.py takes 13% of the time.

 rnew = np.linalg.norm([np.linalg.norm(i) for i in self.r]) 

A norm of per-block norms is computed differently from the norm of a single concatenated array, which is what happens in Svanberg's IP: hstack(...) and then sqrt(dot(...)). Stacking first and then taking one norm might also be faster.
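
A small sketch of both variants (the residual blocks in r are random stand-ins); for the default Euclidean norm the two expressions yield the same value, so only the cost differs:

import numpy as np

r = [np.random.rand(5), np.random.rand(3)]  # stand-in residual blocks

# Current: norm of the per-block norms.
rnew = np.linalg.norm([np.linalg.norm(i) for i in r])

# Alternative, as in Svanberg's IP: concatenate once, then one norm.
rstack = np.linalg.norm(np.hstack(r))

assert np.isclose(rnew, rstack)  # identical value, different cost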

Issue 2: Line 151 of interior_point_x.py:

A = diags(diag_lambda) + dg[1:].dot(diags(1/diag_x) * dg[1:].transpose())

Two sparse (diagonal) matrices are created, and this line took 0.97 s. The final result is a dense matrix, so it doesn't make sense to make it sparse. I changed it to

A = np.diag(diag_lambda) + np.einsum("ki,i,ji->kj", dg[1:], 1/diag_x, dg[1:])

Now it takes only 0.053 s.
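
A self-contained check of the replacement; the sizes and the names dg_c, diag_lambda and diag_x are stand-ins for the solver's actual arrays:

import numpy as np
from scipy.sparse import diags

m, n = 4, 100                    # stand-in problem sizes
dg_c = np.random.rand(m, n)      # plays the role of dg[1:]
diag_lambda = np.random.rand(m)
diag_x = np.random.rand(n)

# Old: two sparse diagonal matrices, even though the result is dense.
A_old = diags(diag_lambda) + dg_c.dot(diags(1 / diag_x) * dg_c.transpose())

# New: a single dense contraction, no sparse intermediates.
A_new = np.diag(diag_lambda) + np.einsum("ki,i,ji->kj", dg_c, 1 / diag_x, dg_c)

assert np.allclose(A_old, A_new)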

Issue 3: Calculating the residual (interior_point_x.py, line 117) needs the derivatives dg. Currently they are calculated twice in this line:

self.r[0] = self.dg(self.w[0])[0] + self.w[3].dot(self.dg(self.w[0])[1:]) - self.w[1] + self.w[2]

I've changed it to

dg_cur = self.dg(self.w[0])
self.r[0] = dg_cur[0] + self.w[3].dot(dg_cur[1:]) - self.w[1] + self.w[2]

Time before = 2.44s, time after = 1.31 s

Conclusion: I've pushed the fixes for Issues 2 and 3 to the dev branch. With these updates we already save a couple of seconds: before ~6-7 seconds, after ~4-5 seconds. I think with Max's suggestion we can save even more! (I only applied the changes to interior_x.)

artofscience commented 3 years ago

> Possibly an improvement here would be to provide an out=array argument, similar to various numpy operations, to allow these functions to write into already allocated arrays. [...] This probably only influences the timing for larger problems, though.

@MaxvdKolk nice idea. Is it possible for you to try this out and check on a large problem whether this gives a substantial saving?

MaxvdKolk commented 3 years ago

Sure, I probably have some time before Friday to test whether or not this has any influence on the overall performance.

artofscience commented 3 years ago

@MaxvdKolk what is the status of this issue wrt adding out=None? I assume this is still relevant, but is this the speed bottleneck now?