concurrent disj plus without zzz

deosjr commented 4 years ago

Showing that concurrent disjunction can improve performance. Usecase is again the Einstein puzzle (see other PR). Benchmarks were run for four different cases, cartesian product of

disjunction with or without concurrency
wrapping each goal in a suspend, using Zzz macro

Results are as follows:

- Disj with Zzz: ~1.55s
- Disj without Zzz: ~0.67s
- Concurrent disj with Zzz: ~1.73s
- Concurrent disj without Zzz: ~0.58s

Mainly shows that Zzz implementation is slow. Using Zzz by default (as in the 2013 ukanren paper) makes concurrent disjunction even slower in this particular case. But for the same problem, not wrapping goals in Zzz performs faster using concurrent disjunction.

deosjr commented 4 years ago

I will add some benchmarks/tests in the next PR, have been testing with Einstein puzzle but this PR was created before that was merged. Too lazy to rebase, I think it's easier this way.

awalterschulze commented 4 years ago

Pretty easy to rebase "manually", it is a single new file. Copy the contents into a buffer. Checkout master Past the contents into a file. At least this how I work sometimes, but don't tell anyone ;P

deosjr commented 4 years ago

Sorry I was being too lazy, excitement got the better of me :) I'll add some benchmarks in here

awalterschulze commented 4 years ago

Thank you and I get it, if it was harder to rebase I would have let you have it :)

deosjr commented 4 years ago

go test ./... -count=1 -bench=. -benchtime=10s
BenchmarkEinsteinSeqZzz-12             7    1441313213 ns/op
BenchmarkEinsteinSeq-12               20     566709797 ns/op
BenchmarkEinsteinConcZzz-12            7    1658366004 ns/op
BenchmarkEinsteinConc-12              22     513559079 ns/op

deosjr commented 4 years ago

And here's the same benchmarks but without GC. Not sure how disabling gc affects subsequent tests though, so maybe this is not usable at all. Difference between Zzz and no Zzz almost vanishes, which tells us a lot of overhead there is in gc I guess.

GOGC=off go test ./... -run Einstein -count=1 -bench=. -benchtime=10s
BenchmarkEinsteinSeqZzz-12            10    1032888122 ns/op
BenchmarkEinsteinSeq-12               21     968266362 ns/op
BenchmarkEinsteinConcZzz-12           13    1191946316 ns/op
BenchmarkEinsteinConc-12              30     846420367 ns/op

awalterschulze commented 4 years ago

Don't forget to add a test to compensate for the arithmetic.

deosjr commented 4 years ago

BenchmarkMembero10000-12                  51     196983889 ns/op
BenchmarkConcMembero10000-12              67     176656504 ns/op
BenchmarkMembero50000-12                   3    4204859737 ns/op
BenchmarkConcMembero50000-12               3    4556661907 ns/op

deosjr commented 4 years ago

This of course grew out of hand again, this time due to me trying to find benchmarks for cases that show concurrency actually helps. How do you feel about making a separate folder examples, where einstein, membero, mapo, fibonacci and friends can live? Split it out and clean it up a little again?

awalterschulze commented 4 years ago

Examples sound great and if we make that step it might be nice to include a readme with each. Something like https://github.com/awalterschulze/goderive/tree/master/example/plugin/compare

We don't have to do it all in one go, but simply putting each example in its own folder keeps this option open for later. What do you think?

deosjr commented 4 years ago

With b.ResetTimer() (I don't think it changed anything tbh):

go test ./... -bench=. -benchtime=20s 
BenchmarkMembero10000-12                     145     164182606 ns/op
BenchmarkConcMembero10000-12                 132     176788059 ns/op
BenchmarkMembero50000-12                       5    4161610718 ns/op
BenchmarkConcMembero50000-12                   5    4450750625 ns/op
BenchmarkMapo10000-12                          8    2552913177 ns/op
BenchmarkConcMapo10000-12                      8    2596927527 ns/op
BenchmarkMapoFailHead10000-12                  8    2531982400 ns/op
BenchmarkConcMapoFailHead10000-12              7    3001633959 ns/op
BenchmarkMapoFailTail10000-12                  8    2550604447 ns/op
BenchmarkConcMapoFailTail10000-12              7    2956605783 ns/op
BenchmarkEinsteinSeqZzz-12                    30     706279739 ns/op
BenchmarkEinsteinSeq-12                       48     476706788 ns/op
BenchmarkEinsteinConcZzz-12                   27     820328077 ns/op
BenchmarkEinsteinConc-12                      55     441792931 ns/op

awalterschulze commented 4 years ago

This is very interesting, how the concurrency doesn't really help. I am kind of puzzled, is count=1 really what we want here? Does that change anything, if we remove it

go test ./... -count=1 -bench=. -benchtime=10s

awalterschulze / gominikanren

concurrent disj plus without zzz #12