Closed Coresummer closed 3 years ago
I'm sorry for the very long wait.
I merged your pull request and modified it a little at https://github.com/herumi/bls-go-binary/tree/Coresummer-dev .
I moved your code to bls_mt.go.
I ran some benchmarks, and AggregateMT turns out to be slower than the original Aggregate.
Am I using it incorrectly?
On Core i7-8700 @ 3.2GHz
% go test -bench Aggregate ./bls
sec:910bdbfa28e5c9b393a167c4ecc15a3a02362f53d3fd479e50ee9d5edcb33343
pub:8902ee34d9c96b16b496f3d607b3fd6723ff475a1f2e7eb1e6e2bc6efd5f56026280598b369d51e678a27323bf0bb4052c7c8a7affd6f576784d2bc952c257b6273ff20b8d59014d6983f41eb09e21ce4b5dd346dec240041c97c0867288a592
0. sign(abc)=3ac19c1b397fd2c1deb095906e1d7b233c1b1f420807d748e9a85834c1ecb49e82c0edd37f9bcffad0b160c0a7f63f85
1. sign(def)=c31f042ec82e72102c18516f7a001fb9d3164c07adbba66e14bf0d9add690d23e48dcff13a8fd2abe581d03c6fd7b494
2. sign(123)=43e6a0b5531f0caa4735783b4b44c35f0fdae3a0085f3432101df2b25fec45daada40eabc29f7e8cf672fc3de1d02901
goos: linux
goarch: amd64
pkg: github.com/herumi/bls-go-binary/bls
BenchmarkAggregate-12 1000000000 0.000555 ns/op
BenchmarkAggregateMT-12 1000000000 0.000787 ns/op
PASS
ok github.com/herumi/bls-go-binary/bls 2.721s
On Xeon Platinum 8280 @ 2.7GHz
% go test -bench Aggregate ./bls
sec:5fe7f8ae3e1cd51b1e9191d4415e759a6fa898355eec473d2a5f4958e184bd5f
pub:b5914f4b3b8f9f6b4c47d42fe21f853607d7053c9698795f184ef5cb5515731b38f5cf30b19507472169fbdf2c65c715b6b50a5669e04de95b27aa740c6386d242df3026208aa69a01350ffeaa52151d8e76596a07a231f3c3ffe561c80a150a
0. sign(abc)=a473e6dcfa6760473175b1a6f36e7cafa439089aafe39ff4e2f153e17bf088c675c2aa3866b8a828297cf5eb20d1f608
1. sign(def)=8d57f4e49955168c89747b3c46c07a8810981f3bd8b960518e8c1000d7f7f13ffcab035bbc41b1a600ff99afeb7c438c
2. sign(123)=47bd4a572bb566d1adabef91e8b01d09c3e24068767af323264df01254966cc9b6046ccf8383b3af903a95c476f08993
goos: linux
goarch: amd64
pkg: github.com/herumi/bls-go-binary/bls
BenchmarkAggregate-112 1000000000 0.000655 ns/op
BenchmarkAggregateMT-112 1000000000 0.00150 ns/op
PASS
ok github.com/herumi/bls-go-binary/bls 3.502s
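A note on reading these figures: when `go test -bench` reports 1000000000 iterations at well under 1 ns/op, the benchmark body usually does not repeat its work `b.N` times, so the totals above may reflect a single call to each function. The following is a minimal sketch of the usual loop shape, not the code in the repository. `work`, `benchWork`, and `runBench` are hypothetical names, and `work` is only a stand-in for the call being measured.

```go
package main

import "testing"

// work is a hypothetical stand-in for the operation under test
// (in this thread that would be the Aggregate or AggregateMT call).
func work(n int) int {
	s := 0
	for i := 0; i < n; i++ {
		s += i
	}
	return s
}

// sink keeps the result alive so the compiler cannot drop the call.
var sink int

// benchWork repeats the measured call b.N times. In a _test.go file
// this function would be named BenchmarkWork so that go test finds it.
func benchWork(b *testing.B) {
	for i := 0; i < b.N; i++ {
		sink = work(1000)
	}
}

// runBench runs the benchmark outside of go test and returns the
// iteration count the testing package settled on.
func runBench() int {
	return testing.Benchmark(benchWork).N
}
```

With the loop in place, the reported ns/op is the cost of one call rather than the total time divided by a very large `b.N`.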
Thank you very much for the benchmarks, and sorry for my late reply. I tested on an i9-9900, and AggregateMT is slower than the regular Aggregate function there too. On the Raspberry Pi 4, however, AggregateMT gave me faster results than the regular one when there are more than 1000 signatures. I'm still trying to figure out why, but I have no clue yet. If you don't mind, could you try the same when you have time?
If there are two or more signatures in the channel, we start a goroutine that calls C.blsSignatureAdd(sig1.v, sig2.v) and pushes the result sig1 back into the channel once it is done. Because the ratio of multithreading overhead to the cost of C.blsSignatureAdd() differs across platforms, this method is faster on low-spec, multi-core CPUs. I'll provide an implementation example in a pull request, along with measurements for both Aggregate() and AggregateMT() (MT stands for multi-threading) running on a Raspberry Pi 4. You're welcome to have a look and test it yourself.
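The channel-based scheme described above can be sketched roughly as follows. This is an illustration under assumptions, not the code from the pull request: `Sig` is a hypothetical stand-in type, and the `add` parameter takes the place of C.blsSignatureAdd so that the sketch runs without cgo. It relies on the combining operation being associative and commutative, because the order in which pairs are added is not fixed.

```go
package main

import "sync"

// Sig is a hypothetical stand-in for a signature value.
type Sig int

// aggregateMT reduces sigs to one value by repeatedly taking two items
// from a channel, adding them in a goroutine, and pushing the result
// back. It returns false when sigs is empty.
func aggregateMT(sigs []Sig, add func(a, b Sig) Sig) (Sig, bool) {
	n := len(sigs)
	if n == 0 {
		return 0, false
	}
	// The buffer holds every item, so sends never block.
	ch := make(chan Sig, n)
	for _, s := range sigs {
		ch <- s
	}
	var wg sync.WaitGroup
	// Reducing n items to one takes exactly n-1 pairwise additions.
	for i := 0; i < n-1; i++ {
		a := <-ch // blocks until a value or partial result is available
		b := <-ch
		wg.Add(1)
		go func() {
			defer wg.Done()
			ch <- add(a, b)
		}()
	}
	wg.Wait()
	return <-ch, true
}
```

For example, calling `aggregateMT([]Sig{1, 2, 3, 4, 5}, func(a, b Sig) Sig { return a + b })` returns 15 and true. Each addition costs a goroutine start plus two channel operations, so when a single add is cheap that overhead can outweigh the parallel gain, which is consistent with the platform-dependent results reported in this thread.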
Regards