Closed wari closed 9 years ago
Oh, and also, the results are more or less the same with GOMAXPROCS
set to 1 or 8, etc. Does not seem to matter at all.
I just pushed 901c439dfe4a1c2f483c70ec0d703cc8219c9680 which benchmarks the HMAC signing method. It seems to behave as expected. Go ahead and run it for yourself and let me know what you see. I'm not familiar with the performance of parallel Go on arm chips. Could be something there?
Artax:jwt-go dgrijalva$ go test -bench . -cpu 1
PASS
BenchmarkHMACSigning 50000 41500 ns/op
ok github.com/dgrijalva/jwt-go 3.619s
Artax:jwt-go dgrijalva$ go test -bench . -cpu 2
PASS
BenchmarkHMACSigning-2 100000 24001 ns/op
ok github.com/dgrijalva/jwt-go 3.785s
Artax:jwt-go dgrijalva$ go test -bench . -cpu 4
PASS
BenchmarkHMACSigning-4 100000 14235 ns/op
ok github.com/dgrijalva/jwt-go 2.723s
Artax:jwt-go dgrijalva$ go test -bench . -cpu 16
PASS
BenchmarkHMACSigning-16 100000 15301 ns/op
ok github.com/dgrijalva/jwt-go 2.816s
Artax:jwt-go dgrijalva$ go test -bench . -cpu 1 -benchtime 5s
PASS
BenchmarkHS256Signing 1000000 6329 ns/op
BenchmarkHS384Signing 1000000 7121 ns/op
BenchmarkHS512Signing 1000000 7725 ns/op
BenchmarkRS256Signing 5000000 3339 ns/op
BenchmarkRS384Signing 5000000 3331 ns/op
BenchmarkRS512Signing 5000000 3327 ns/op
ok github.com/dgrijalva/jwt-go 82.747s
Artax:jwt-go dgrijalva$ go test -bench . -cpu 2 -benchtime 5s
PASS
BenchmarkHS256Signing-2 5000000 3692 ns/op
BenchmarkHS384Signing-2 2000000 4287 ns/op
BenchmarkHS512Signing-2 2000000 4494 ns/op
BenchmarkRS256Signing-2 5000000 2019 ns/op
BenchmarkRS384Signing-2 5000000 1975 ns/op
BenchmarkRS512Signing-2 5000000 1967 ns/op
ok github.com/dgrijalva/jwt-go 85.383s
Artax:jwt-go dgrijalva$ go test -bench . -cpu 4 -benchtime 5s
PASS
BenchmarkHS256Signing-4 5000000 2716 ns/op
BenchmarkHS384Signing-4 5000000 3127 ns/op
BenchmarkHS512Signing-4 5000000 3150 ns/op
BenchmarkRS256Signing-4 10000000 1390 ns/op
BenchmarkRS384Signing-4 10000000 1399 ns/op
BenchmarkRS512Signing-4 10000000 1420 ns/op
ok github.com/dgrijalva/jwt-go 101.527s
Let me try this out. Will get back to you as soon as I can. Thanks
Actually, your numbers are way off than mine. For this test I'm using a MacBook pro. The RS numbers should not look so fast (like the one you're having) IMO, or is there something wrong with my machine? Or it could be you have them fast accelerated crypto cards?:
jwt-go git:master ❯ go test -bench . -cpu 1 -benchtime 5s
PASS
BenchmarkHS256Signing 1000000 9059 ns/op
BenchmarkHS384Signing 1000000 10039 ns/op
BenchmarkHS512Signing 1000000 10403 ns/op
BenchmarkRS256Signing 1000 9082389 ns/op
BenchmarkRS384Signing 1000 9231708 ns/op
BenchmarkRS512Signing 1000 9142034 ns/op
ok github.com/dgrijalva/jwt-go 65.712s
jwt-go git:master ❯ go test -bench . -cpu 2 -benchtime 5s
PASS
BenchmarkHS256Signing-2 1000000 5392 ns/op
BenchmarkHS384Signing-2 1000000 5938 ns/op
BenchmarkHS512Signing-2 1000000 6461 ns/op
BenchmarkRS256Signing-2 2000 4783875 ns/op
BenchmarkRS384Signing-2 2000 4811723 ns/op
BenchmarkRS512Signing-2 2000 4799514 ns/op
ok github.com/dgrijalva/jwt-go 64.270s
jwt-go git:master ❯ go test -bench . -cpu 4 -benchtime 5s
PASS
BenchmarkHS256Signing-4 2000000 3518 ns/op
BenchmarkHS384Signing-4 2000000 4565 ns/op
BenchmarkHS512Signing-4 2000000 4604 ns/op
BenchmarkRS256Signing-4 2000 2965792 ns/op
BenchmarkRS384Signing-4 2000 2942990 ns/op
BenchmarkRS512Signing-4 2000 2941127 ns/op
ok github.com/dgrijalva/jwt-go 58.278s
This is what the benchmark looked like on the server in question (intel, ubuntu):
wari@salt:~/src/github.com/dgrijalva/jwt-go$ go test -bench . -cpu 1 -benchtime 5s
PASS
BenchmarkHS256Signing 500000 17296 ns/op
BenchmarkHS384Signing 500000 19656 ns/op
BenchmarkHS512Signing 300000 21234 ns/op
BenchmarkRS256Signing 500 14924048 ns/op
BenchmarkRS384Signing 500 15312750 ns/op
BenchmarkRS512Signing 500 14867267 ns/op
ok github.com/dgrijalva/jwt-go 67.362s
wari@salt:~/src/github.com/dgrijalva/jwt-go$ go test -bench . -cpu 2 -benchtime 5s
PASS
BenchmarkHS256Signing-2 500000 14936 ns/op
BenchmarkHS384Signing-2 1000000 15754 ns/op
BenchmarkHS512Signing-2 500000 17111 ns/op
BenchmarkRS256Signing-2 500 10069299 ns/op
BenchmarkRS384Signing-2 500 10971741 ns/op
BenchmarkRS512Signing-2 500 10811426 ns/op
ok github.com/dgrijalva/jwt-go 56.678s
wari@salt:~/src/github.com/dgrijalva/jwt-go$ go test -bench . -cpu 4 -benchtime 5s
PASS
BenchmarkHS256Signing-4 1000000 11716 ns/op
BenchmarkHS384Signing-4 500000 14896 ns/op
BenchmarkHS512Signing-4 500000 15034 ns/op
BenchmarkRS256Signing-4 1000 7667207 ns/op
BenchmarkRS384Signing-4 1000 8193615 ns/op
BenchmarkRS512Signing-4 1000 7391248 ns/op
ok github.com/dgrijalva/jwt-go 66.040s
BTW, I've got my very informal test here at https://github.com/wari/jwt-testing which I believe is more "real world" at least in my use case.
So, what I gather from my own test is that GOMAXPROCS matters a lot. For 10 requests, I can shave off time to sign by half or a third. But one thing to note is that they all finish around the same time. If GOMAXPROCS is set to 1, if you sign 10 concurrently, or serially, the net effect is you will get the results at about the same time.
PS: Again, your numbers looks way off, and also I don't really understand the go benchmark :)
The numbers I posted were wrong. The signing was failing and I wasn't checking for it. I fixed the test, but forgot to update the ticket.
-dave
On Apr 12, 2015, at 8:24 PM, Wari Wahab notifications@github.com wrote:
BTW, I've got my very informal test here at https://github.com/wari/jwt-testing which I believe is more "real world" at least in my use case.
So, what I gather from my own test is that GOMAXPROCS matters a lot. For 10 requests, I can shave off time to sign by half or a third. But one thing to note is that they all finish around the same time. If GOMAXPROCS is set to 1, if you sign 10 concurrently, or serially, the net effect is you will get the results at about the same time.
PS: Again, your numbers looks way off, and also I don't really understand the go benchmark :)
— Reply to this email directly or view it on GitHub.
Looks like you are seeing the benefits of multiple CPUs here. Look at the ns/op values as you increase CPUs.
-dave
On Apr 12, 2015, at 7:33 PM, Wari Wahab notifications@github.com wrote:
Actually, your numbers are way off than mine. For this test I'm using a MacBook pro. The RS numbers should not look so fast (like the one you're having) IMO, or is there something wrong with my machine? Or it could be you have them fast accelerated crypto cards?:
jwt-go git:master ❯ go test -bench . -cpu 1 -benchtime 5s PASS BenchmarkHS256Signing 1000000 9059 ns/op BenchmarkHS384Signing 1000000 10039 ns/op BenchmarkHS512Signing 1000000 10403 ns/op BenchmarkRS256Signing 1000 9082389 ns/op BenchmarkRS384Signing 1000 9231708 ns/op BenchmarkRS512Signing 1000 9142034 ns/op ok github.com/dgrijalva/jwt-go 65.712s jwt-go git:master ❯ go test -bench . -cpu 2 -benchtime 5s PASS BenchmarkHS256Signing-2 1000000 5392 ns/op BenchmarkHS384Signing-2 1000000 5938 ns/op BenchmarkHS512Signing-2 1000000 6461 ns/op BenchmarkRS256Signing-2 2000 4783875 ns/op BenchmarkRS384Signing-2 2000 4811723 ns/op BenchmarkRS512Signing-2 2000 4799514 ns/op ok github.com/dgrijalva/jwt-go 64.270s jwt-go git:master ❯ go test -bench . -cpu 4 -benchtime 5s PASS BenchmarkHS256Signing-4 2000000 3518 ns/op BenchmarkHS384Signing-4 2000000 4565 ns/op BenchmarkHS512Signing-4 2000000 4604 ns/op BenchmarkRS256Signing-4 2000 2965792 ns/op BenchmarkRS384Signing-4 2000 2942990 ns/op BenchmarkRS512Signing-4 2000 2941127 ns/op ok github.com/dgrijalva/jwt-go 58.278s This is what the benchmark looked like on the server in question (intel, ubuntu):
wari@salt:~/src/github.com/dgrijalva/jwt-go$ go test -bench . -cpu 1 -benchtime 5s PASS BenchmarkHS256Signing 500000 17296 ns/op BenchmarkHS384Signing 500000 19656 ns/op BenchmarkHS512Signing 300000 21234 ns/op BenchmarkRS256Signing 500 14924048 ns/op BenchmarkRS384Signing 500 15312750 ns/op BenchmarkRS512Signing 500 14867267 ns/op ok github.com/dgrijalva/jwt-go 67.362s wari@salt:~/src/github.com/dgrijalva/jwt-go$ go test -bench . -cpu 2 -benchtime 5s PASS BenchmarkHS256Signing-2 500000 14936 ns/op BenchmarkHS384Signing-2 1000000 15754 ns/op BenchmarkHS512Signing-2 500000 17111 ns/op BenchmarkRS256Signing-2 500 10069299 ns/op BenchmarkRS384Signing-2 500 10971741 ns/op BenchmarkRS512Signing-2 500 10811426 ns/op ok github.com/dgrijalva/jwt-go 56.678s wari@salt:~/src/github.com/dgrijalva/jwt-go$ go test -bench . -cpu 4 -benchtime 5s PASS BenchmarkHS256Signing-4 1000000 11716 ns/op BenchmarkHS384Signing-4 500000 14896 ns/op BenchmarkHS512Signing-4 500000 15034 ns/op BenchmarkRS256Signing-4 1000 7667207 ns/op BenchmarkRS384Signing-4 1000 8193615 ns/op BenchmarkRS512Signing-4 1000 7391248 ns/op ok github.com/dgrijalva/jwt-go 66.040s — Reply to this email directly or view it on GitHub.
Anyway, thanks a lot for the benchmark code, I now know to add GOMAXPROCS
to my server, it is still very heavy in CPU usage, and if I do run into the problem of having loaded with too many calls at once I might use the HS256 signing instead of RSA signing.
Oh, and in my dumbness, the reason why I wrote in the first place is due to the fact that I used GOMAXPROC
(without the 'S
') to run.
Thanks again :smile:
Hi there, just a note, not sure if this would be a bug, or just a normal behaviour that's to be expected with signing tokens via RSxxx.
On my system, where I need to sign one or many token at once. When I sign a token, it may take roughly 200-250ms for
jwt.SignedString()
. Of course, depending on what machine I use, it may be faster or slower due to CPU.But when I tried signing 3 different tokens at the same time, using the same privateKey, the results are almost 3x longer.
The source for my implementation in psuedocode-ish looks like this:
So for this, all the other stuff took only a few ms, but the goroutines will take a total of 600 to 900 ms on a request that produces 3 iterations.
Here's what the verbose log looks like
So, the question is, is this expected behaviour that I'm observing? I would be afraid to do this on hundreds of simultaneous token creations with RSA.
Just a note, the
getBackData()
parts are actually tokens signed with HS256 over a network, and the slowest one that took 142ms (BBBB), is actually from a Raspberry Pi. Which means, it took less time to verify an RSA signature, and generate a SignedString() with HS256 than to actually sign with RSA on a faster machine.