Ah, and another thing: you argue that the sorting of the matches inflates the performance results, but the sorting time shouldn't be considered when measuring the runtimes. It's just needed in the original benchmark framework so that, in all iterations, the same elements are fixed in all solutions.
N.b. the sorting time DOES count for the global timeout.
Really? That explains a bit why my solution is faster when run standalone instead of in the project. ;-)
That said, IMO it shouldn't count.
@tsdh it does count for the global timeout, but it does not count for the execution times shown on the plots.
@szarnyasg Ah, that makes sense.
The NMF solution cannot use the benchmark framework because of the technology gap to the Java platform. Therefore, I also recreated the sorting as done by the benchmark framework. I am currently not entirely sure whether this is counted in the times, and if so, in which metric.
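For illustration, here is a minimal sketch of how such solution-side sorting and phase timing could look; the `Match` type, the element id, and the phase split are placeholders, not the actual solution code:

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

// Placeholder match type; the real solution works on NMF model elements.
class Match
{
    public int ElementId { get; set; }
}

static class RepairPhaseSketch
{
    // Sorts the matches deterministically (as the Java framework does) so that
    // every solution fixes the same elements, then picks the first `count`.
    static IList<Match> SortAndPick(IEnumerable<Match> matches, int count)
    {
        return matches.OrderBy(m => m.ElementId).Take(count).ToList();
    }

    static void Main()
    {
        var matches = new List<Match> { new Match { ElementId = 3 }, new Match { ElementId = 1 } };

        var sortWatch = Stopwatch.StartNew();
        var toFix = SortAndPick(matches, 10);
        sortWatch.Stop();

        var repairWatch = Stopwatch.StartNew();
        foreach (var match in toFix) { /* apply the repair transformation here */ }
        repairWatch.Stop();

        // Measuring the two phases separately makes it explicit whether the
        // sorting time ends up in the reported metric or only in the timeout.
        Console.WriteLine($"sort: {sortWatch.ElapsedMilliseconds} ms, repair: {repairWatch.ElapsedMilliseconds} ms");
    }
}
```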
The reason for not putting statements on the performance in the paper is that I did not want the initial paper to be too different from the one that is going to be published in the pre-proceedings, where there is a page limit of 5 pages. Therefore, I did not discuss performance at all, and the same goes for the Java Refactoring Case solution. I thought that if the page limit is set to 5 pages, then performance findings would rather go into a section of the prospective per-case journal paper. The 4.8 is not really related to the case; it is rather a number that we have measured for NMF Expressions before.
@georghinkel In the README you tell us reviewers not to trust the runtimes on SHARE because Mono is slower than .NET on Windows. That's OK, but I have no Windows machine on which I could run your solution. If you had some numbers in your paper, I'd simply believe them.
Well, that's not too important because the performance evaluation is done by the case authors anyway. But a runtime of 43 minutes for the model of size 32 and the SemaphoreNeighbor query with the incremental version seems quite slow, and this can't be exclusively Mono's fault, can it?
Oh, well. When running the same query on the same model with the --batch variant, it finishes in just 13 seconds. Why is the incremental version so much slower?
Also, it seems the incremental version gets worse when I increase the iteration count. I'm currently running it on the model of size 4 with 1000 iterations. The first 20 iterations were pretty fast, but then you could notice that every iteration started to take longer than its predecessor. Right now, after 5 minutes, I'm at iteration number 154, and every iteration takes about 2-4 seconds.
Also, I can see that the memory footprint increases. During the first iterations, it used about 3% of the available RAM. Right now, it already uses almost 10%. You can see that sometimes the GC kicks in, cutting 1 or 2% of the used memory, but it quickly fills up again.
I had expected that any incremental approach reaches its peak memory requirements as soon as the initial model has been loaded, because that's when the patterns have the most matches. With each fix, there are fewer matches, so the memory requirements should decrease a bit.
The batch version simply executes the solution as if it were plain C# code. The only difference to normal LINQ is that some expressions are compiled, but internally NMF Expressions simply forwards the call to System.dll. This is very different when the solution runs in incremental mode. Here, the system builds up a full-fledged dynamic dependency graph for the pattern and registers event handlers that update the dependency graph as soon as the corresponding elementary update notifications come in.
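To make the difference more concrete, here is a heavily simplified sketch in plain .NET (an ObservableCollection and a hand-written change handler, not the actual NMF Expressions API) of the two evaluation styles:

```csharp
using System;
using System.Collections.ObjectModel;
using System.Collections.Specialized;
using System.Linq;

class Segment
{
    public int Length { get; set; }
}

class EvaluationSketch
{
    static void Main()
    {
        var model = new ObservableCollection<Segment>
        {
            new Segment { Length = -1 },
            new Segment { Length = 5 }
        };

        // Batch style: every check re-evaluates the query over the whole model.
        Func<int> batchCount = () => model.Count(s => s.Length < 0);
        Console.WriteLine($"batch: {batchCount()}");

        // Incremental style (simplified): keep the result alive and patch it
        // from elementary change notifications instead of re-scanning.
        int incrementalCount = model.Count(s => s.Length < 0);
        model.CollectionChanged += (sender, e) =>
        {
            if (e.NewItems != null)
                incrementalCount += e.NewItems.Cast<Segment>().Count(s => s.Length < 0);
            if (e.OldItems != null)
                incrementalCount -= e.OldItems.Cast<Segment>().Count(s => s.Length < 0);
        };

        model.Add(new Segment { Length = -2 });  // elementary update notification
        Console.WriteLine($"incremental: {incrementalCount}");
    }
}
```

The actual dependency graph in NMF Expressions is of course much more general than this: it also reacts to property changes and keeps intermediate results for the pattern alive, which is what makes it so much heavier than the batch evaluation.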
Regarding the increasing memory footprint, that is an interesting and, for me, probably problematic finding. It may be that some resources are still kept alive when they shouldn't be. This effect can also be caused by a wrong compiler setting, since the compiler typically keeps objects alive longer than necessary when running in debug mode.
Debug mode is the default for the compiler in xbuild, which is also used by MonoDevelop. Thus, if you haven't changed this setting manually, you probably compiled in debug mode. In fact, I precompiled the solution in Release mode and symlinked the Release folder into the TrainBenchmark main folder, so if you ran the solution through run.py, you should have used the Release build.
Anyway, yes, NMF Expressions currently has a very large impact on memory usage. We are working to improve the situation, but currently it is very heavy. In particular, these dynamic dependency graphs can grow to an enormous size. This puts a lot of pressure on the GC, and if some objects are marked as alive when they shouldn't be, the consequences are severe.
Anyway, I will try to reproduce the behavior on my Windows machine in order to check whether it is the fault of the solution or of some tool in the tool chain or its configuration. However, I am not sure I can do this before Monday.
FWIW, I've run the exe in the Release folder.
With respect to the dependency graphs: those are the same in every iteration, so I don't see why the performance degrades with increasing iteration counts.
Isn't NMF's incremental approach similar to the one of EMF-IncQuery? With that, increasing the iteration count makes it even faster when compared with a batch solution.
Well, the incremental approaches mainly benefit from the fact that they don't have to re-evaluate the entire model when things change. Thus, the benefits arise when the model is large and the changes are frequent. Since you only used a model of size 4, the model might still be too small, but that is just my gut feeling.
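As a back-of-envelope illustration of that trade-off (the numbers below are made up for illustration, not measurements):

```csharp
using System;

class CostSketch
{
    static void Main()
    {
        // Assumed, illustrative quantities: a batch check scans the whole model
        // every iteration, an incremental check only processes the elements
        // affected by the previous repair plus some graph-maintenance overhead.
        long modelSize = 1000000;      // hypothetical number of model elements
        long changedElements = 10;     // hypothetical elements touched per repair
        long graphOverhead = 500;      // hypothetical fixed cost per change batch

        long batchWorkPerIteration = modelSize;
        long incrementalWorkPerIteration = changedElements + graphOverhead;

        Console.WriteLine($"batch: {batchWorkPerIteration}, incremental: {incrementalWorkPerIteration}");
        // For a size-4 model, modelSize is tiny, so the graph-maintenance
        // overhead can easily dominate and the incremental version loses
        // its advantage.
    }
}
```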
Concerning the iteration counts, yes, that's true and I was wrong. You are right, of course the dependency graphs stay between subsequent iterations.
I will try to track the issue.
Well, for significantly larger models, e.g. size 32, not even the initial check phase had finished before I lost patience after 20 minutes. ;-)
Yes, this is exactly the reason why I noted in the solution paper to run the solution on Windows. On my normal laptop (i5 at 2.8 GHz, 12 GB RAM), the solution completed within that time for all sizes up to 1024. However, I am not sure I kept the TSV, if I wrote it to a file at all.
We did some experiments during the conference, and I can confirm that the solution on Linux/Mono is more than an order of magnitude slower than on Windows/.NET. Microsoft is working on porting the .NET CLR to Linux, but the current beta version is not yet capable of running the solution.
Hi Lucia & Georg,
your solution paper doesn't include any facts about the performance of the NMF solution except that the incremental version is up to 4.8 times faster than the batch version. Do you have some concrete numbers?
E.g., something along the lines of: we've run the benchmarks on a machine with the following spec (X CPUs at X GHz, X GB RAM, etc.), the largest model that could be checked and repaired was X, and there the initial import took X seconds, and the 10 iterations of repair & recheck took X seconds in total.