ShadyCodes / M4R-Sci-ML


Benchmarks #3

Closed ShadyCodes closed 2 years ago

ShadyCodes commented 2 years ago

I've set up benchmarks for the reverse mode and forward mode autodiff for varying parameters.

My plan is to run benchmarks for both modes, with matrix sizes varying from 3 by 3 up to 61 by 61 and polynomial degrees from 2 to 20. Perhaps this is excessive?

Further to that, there are several plots I could do with the results once they're computed. I was thinking of plotting median and minimum run times.

Should I tone down the number of parameters searched? Right now the setup involves 1200 benchmark runs at ~5.5 seconds each, so about 2 hours for the full sweep if I keep this configuration.
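For what it's worth, the cost of a sweep like this can be estimated up front. A minimal sketch — the exact grid below (odd sizes only, both modes) is an assumption reconstructed from the numbers above, not taken from the repo:

```julia
# Back-of-envelope estimate of the parameter sweep cost.
# Grid is an assumption: odd matrix sizes 3,5,...,61 and degrees 2..20,
# benchmarked for both autodiff modes.
matrix_sizes = 3:2:61   # 30 sizes
degrees      = 2:20     # 19 degrees
n_modes      = 2        # Zygote (reverse) and ForwardDiff (forward)

n_benchmarks = length(matrix_sizes) * length(degrees) * n_modes
est_hours    = n_benchmarks * 5.5 / 3600   # ~5.5 s per benchmark
println("$n_benchmarks benchmarks, ≈ $(round(est_hours, digits=1)) hours")
```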

ShadyCodes commented 2 years ago

plot_14

This is an example of a plot I would consider adding. Each line corresponds to a different matrix size, though the final figure would be more detailed than this. I may also separate the figure into two standalone plots rather than keep them as subplots; I only used subplots to share the y-axis for easy comparison, while keeping the two series off the same axes to avoid clutter.

ShadyCodes commented 2 years ago

I've realised matrix size shouldn't be a varied parameter. The benchmarks are much faster now.

One thing I've noticed, though, is that the benchmark times decrease at the upper end of the polynomial degrees, which doesn't make sense to me.

plot_8 plot_9

Both median and minimum times decrease, and I can't figure out why. I've pushed the latest code - any idea why this might happen?

dlfivefifty commented 2 years ago

What happens if you go further? It might be that it switches algorithms after a threshold

ShadyCodes commented 2 years ago

Ah, it seems like that may be the case after all.

plot_12 plot_13

The ForwardDiff results weirdly seem to plateau.

dlfivefifty commented 2 years ago

I’d push it much higher, say 1000. You want to see a drastic saving for reverse

ShadyCodes commented 2 years ago

plot_12 plot_13

Looks pretty drastic now, though there is a weird spike in the ForwardDiff times. I may run the benchmarks again tomorrow to get a cleaner graph, but I'd say it looks pretty good as is?

dlfivefifty commented 2 years ago

Awesome! Looks like it’s actually a different complexity: O(n^2) vs O(n). Can you show that that’s the case?
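One empirical way to check this (a sketch, not code from the repo): if t(n) ≈ C·n^p, then log t ≈ log C + p·log n, so the slope of a least-squares line through the points (log n, log t) estimates the exponent p. Fitting that slope to each timing curve should give roughly 1 for reverse mode and 2 for forward mode.

```julia
# Estimate the empirical complexity exponent from a timing curve via a
# least-squares fit in log-log space.
function fit_exponent(ns, ts)
    x, y = log.(ns), log.(ts)
    x̄ = sum(x) / length(x)
    ȳ = sum(y) / length(y)
    return sum((x .- x̄) .* (y .- ȳ)) / sum((x .- x̄) .^ 2)
end

# Synthetic sanity checks with exact power laws:
ns = [100, 200, 400, 800]
fit_exponent(ns, 2e-6 .* ns)        # ≈ 1 for linear scaling
fit_exponent(ns, 3e-9 .* ns .^ 2)   # ≈ 2 for quadratic scaling
```

On real timings the slope will only be approximate (constants and lower-order terms pollute small n), so fitting over the largest degrees gives the cleanest estimate.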

dlfivefifty commented 2 years ago

You are using @belapsed, correct? Assuming you aren't doing anything in the background, that should be very reliable. Note that an eigenvalue algorithm will take a different number of iterations depending on the matrix

ShadyCodes commented 2 years ago

How exactly would I show the different complexity? Do you mean via a log-log plot, or "show" by explaining why each has a different complexity?

I am not using @belapsed. I've set up a BenchmarkGroup, split it into child groups for Zygote and ForwardDiff, and from there defined a benchmark with @benchmarkable for every 5 polynomial degrees, running them all together after the setup. This seemed easiest, since all of the results are collected into one object that contains the medians and minimums. @belapsed would only give one time per degree rather than a sample of times, though I could switch fairly easily if you reckon @belapsed is enough - that would mean dropping the median-time graph (though the median and minimum graphs look almost identical as is).
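For concreteness, the setup described might look something like this. This is only a sketch: `reverse_grad` and `forward_grad` are hypothetical stand-ins for the actual Zygote/ForwardDiff gradient computations, and the degree range is a guess.

```julia
using BenchmarkTools

# Hypothetical stand-ins for the real Zygote/ForwardDiff gradient calls:
reverse_grad(c) = sum(abs2, c)
forward_grad(c) = sum(abs2, c)

# Parent group with one child group per autodiff backend.
suite = BenchmarkGroup()
suite["Zygote"]      = BenchmarkGroup()
suite["ForwardDiff"] = BenchmarkGroup()

for n in 5:5:50                 # every 5 polynomial degrees (range assumed)
    c = collect(1.0:n + 1)      # coefficients [1, ..., n+1]
    suite["Zygote"][n]      = @benchmarkable reverse_grad($c)
    suite["ForwardDiff"][n] = @benchmarkable forward_grad($c)
end

# One run collects every trial into a single results object.
results = run(suite; verbose = false, seconds = 0.1)
medians = median(results)       # per-benchmark median estimates
minima  = minimum(results)      # per-benchmark minimum estimates
```

Keeping everything in one `BenchmarkGroup` means medians and minimums come out of the same run, as described above.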

ShadyCodes commented 2 years ago

Also, I've fixed the size of the matrix to 21 by 21, and fixed the coefficients of the degree-n polynomial to [1,...,n+1] - would the eigenvalue algorithm still behave differently even though the size of the matrix isn't changing?

dlfivefifty commented 2 years ago

I meant prove, or at least argue...

Note @belapsed runs several samples and reports the minimum time, which I think is sufficient.

would this eigenvalue algorithm still be different even if the size of the matrix isn't changing?

Yes: eigenvalue computation is an iterative algorithm, and the number of iterations depends on the entries of the matrix. E.g. if your matrix is already diagonal it does zero iterations.

I would suggest showing timings for different matrix sizes as well (e.g., also do a 41 x 41)

ShadyCodes commented 2 years ago

plot_14

This took quite a while to run - several hours, in fact. I opted to use the @benchmarkable setup I had before, just so I had some idea of how the code was actually progressing. I'm leaving my computer on overnight in case I need the forward-mode benchmarks again (these take a long time to run), but do you know of a way I could save the results variable, in case I need it again in future? At this stage, I can't really afford to run them for so long again.
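On saving results: BenchmarkTools has its own JSON (de)serialization via `BenchmarkTools.save`/`BenchmarkTools.load`, and the `Serialization` stdlib works for any Julia object. A sketch - the `results` object below is an empty stand-in for whatever `run(suite)` returned:

```julia
using BenchmarkTools, Serialization

# Empty stand-in for the object returned by `run(suite)`:
results = BenchmarkGroup()

# Option 1: BenchmarkTools' own JSON format (portable across sessions):
BenchmarkTools.save("results.json", results)
loaded = BenchmarkTools.load("results.json")[1]   # `load` returns a vector

# Option 2: the Serialization stdlib (works for any Julia object, but the
# binary format is not guaranteed stable across Julia versions):
serialize("results.bin", results)
loaded2 = deserialize("results.bin")
```

The JSON route is the safer choice for results you may want to reread months later, since it doesn't depend on the Julia version.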

In terms of why the forward mode takes orders of magnitude longer, I found this thread: https://discourse.julialang.org/t/automatic-differentiation-slow-slower-than-finite-differences/11209/8 The user ChrisRackauckas notes that, because of the dual-number implementation, ForwardDiff doesn't use BLAS, which I suspect could be why Zygote is orders of magnitude faster in this case. The Vandermonde matrix multiplying the coefficient vector in particular could be responsible. I may run the benchmarks once more with a different implementation that constructs the polynomials on the diagonal using a for loop rather than a Vandermonde matrix, to see whether this is the root cause of the huge difference in times.
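As a sanity check that a loop-based construction computes the same values, here's a minimal sketch (names and sizes are illustrative, not from the repo): the polynomial is evaluated at the diagonal points either via a Vandermonde matrix-vector product, which dispatches to BLAS for `Float64` but not for ForwardDiff's dual numbers, or via Base's Horner-scheme `evalpoly` in a loop, which needs no BLAS at all.

```julia
# Evaluate p(x) = c[1] + c[2]x + ... + c[n+1]x^n at each point in xs.

# Vandermonde route: builds an m×(n+1) matrix, then a BLAS gemv for Float64.
function poly_vandermonde(c, xs)
    V = [x^(j - 1) for x in xs, j in eachindex(c)]   # Vandermonde matrix
    return V * c
end

# Loop route: Horner evaluation per point, no matrix allocation, no BLAS.
poly_loop(c, xs) = [evalpoly(x, c) for x in xs]

c  = collect(1.0:5.0)             # coefficients [1, ..., n+1], degree n = 4
xs = range(-1, 1; length = 21)    # 21 diagonal points (size is an assumption)
@assert poly_vandermonde(c, xs) ≈ poly_loop(c, xs)
```

If the timing gap shrinks with the loop version, that would support the BLAS explanation; if not, the difference lies elsewhere in the autodiff machinery.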