scipopt / JSCIPOpt

Java interface for the SCIP Optimization Suite
MIT License
62 stars 36 forks source link

Degrading performance on subsequent models after time limit has passed #54

Open patrickguenther opened 3 days ago

patrickguenther commented 3 days ago

When optimizing multiple models one after another in the same jvm process, I noticed the time taken for optimization increases massively after the time limit parameter's time has passed.

On my machine, the first couple of optimizations take approximately 10 - 40 ms. In the test below I'm using a ten second time limit for SCIP, so the solution should be found well within the time limit. After each optimization I let the thread sleep for a second just to waste some time. On the 9th or 10th iteration, however, the time taken for an optimization suddenly jumps to 450 to 500 ms.

By changing the time limit parameter to 5 seconds the error occurs earlier and by removing the line setting the time limit, the error disappears (probably because the default time limit is infinite).

I'm using SCIP version 9.2.0 with the binary distribution. The issue occurs both on Windows and Linux machines.

Java file for reproducing the issue:

import jscip.Scip;

public class DegradingPerformanceTest {

    public static void main(final String... args) {
        System.loadLibrary("jscip");
        for (int i = 0; i != 20; ++i) {
            final long timeTaken = optimizeAndTime();
            if (i > 2) { // the first couple of iterations are for JVM warmup
                if (timeTaken > 300L) {
                    throw new IllegalStateException("Optization took <" + timeTaken + " ms> on iteration <" + i + ">");
                }
            }
            System.out.println("Optimization <" + i + "> took <" + timeTaken + " ms>");
            // sleeping to waste time until the time limit passes
            sleep(1000L);
        }
    }

    private static void sleep(final long millis) {
        try {
            Thread.sleep(millis);
        } catch (final InterruptedException e) {
            throw new IllegalStateException("Sleeping interrupted", e);
        }
    }

    private static long optimizeAndTime() {
        final Scip scip = buildAndParameterizeScip();
        scip.readProb("scipPerformanceDegradation.lp");
        final long start = System.currentTimeMillis();
        scip.solve();
        final long duration = System.currentTimeMillis() - start;
        scip.free();
        return duration;
    }

    private static Scip buildAndParameterizeScip() {
        final Scip scip = new Scip();
        scip.create("scipProblem");
        scip.setRealParam("numerics/epsilon", 1.0e-9);
        scip.setRealParam("numerics/sumepsilon", 1.0e-6);
        scip.setRealParam("numerics/feastol", 1.0e-6);
        // The time limit seems to be the culprit. Removing the following line, the error never appears. Probably
        // because the time limit is infinite.
        scip.setRealParam("limits/time", 10);
        scip.setRealParam("limits/gap", 1.0e-4);
        scip.setRealParam("limits/absgap", 1.0e-4);
        scip.setIntParam("limits/solutions", -1);
        scip.setLongintParam("limits/nodes", 10000);
        return scip;
    }

}

The only workaround right now, I see is to not use a time limit parameter. See attached zip file for the required lp file. performanceDegradation.zip

kkofler commented 3 days ago

The most likely cause for such a performance degradation is a memory leak somewhere. Maybe the one you reported 2 days ago (#53), maybe another one somewhere else.

patrickguenther commented 2 days ago

Thanks for your replies. I'll give your solutions a try on Monday. But I don't think the two issues are related. This issue is purely influenced by the time limit I set for SCIP and the above java code does not attach any message listeners.

kkofler commented 2 days ago

So the question is: Are the problems objectively taking longer to solve over time or is SCIP just thinking that they take longer, because some timer is not or not completely reset between the instances?

patrickguenther commented 1 day ago

It is definitely objective. The test above measures the wall time using System.currentTimeMillis(). I doubt SCIP has the ability to mess with that. If the test is changed to not throw, when the 300 milliseconds are exceeded, you also see the log trickling in slower.

Originally I thought, SCIP is just 10 times slower than Gurobi or XPress. I only noticed recently that when the app I'm working on triggers optimizations within the first 30 seconds after starting up, the results come back much faster. After some trial and error, I was finally able to trace back the slow-down to the 30 second time limit I always set by default and extract the minimum example above.

And by changing the "limits/time" parameter in the example, you can control when the slow-down starts to occur. My guess is, this is actually an issue with SCIP itself and not with the JSCIP wrapper. I would also be willing to investigate further but my C-skills are a bit rusty to say the least. Is there a guide or something on setting up a SCIP development environment with debugging?

kkofler commented 1 day ago

I can only guess that there is some resource leak (e.g., memory, file descriptors, etc.) that is causing the slowdown. Because some operations on a class of resources get slower the more instances of that resource are currently in use. But I cannot pinpoint what is leaked where.

And by the way, how SCIP compares to commercial solvers performance-wise strongly depends on the individual problem. I have seen problems where it is way more than 10 times slower than Gurobi even on the first run, where definitely no resources were leaked yet.