jchambers / java-otp

A one-time password (HOTP/TOTP) library for Java
MIT License
452 stars 122 forks source link

Suggestion: in HmacOneTimePasswordGenerator, make the buffer a local variable #34

Closed BenTels closed 2 years ago

BenTels commented 2 years ago

If you allocate a buffer locally in generateOneTimePassword() rather than as a field, then generateOneTimePassword() doesn't have to be synchronized. This will improve throughput in multi-threaded situations.

jchambers commented 2 years ago

Thank you for the suggestion, but I don't think this is the right way forward.

Reallocating the buffer on every call will generate unnecessary GC pressure and decrease throughput in cases where the OTP generator is not contested by multiple threads.

If multi-threaded throughput is a priority, callers can create multiple thread-local OTP generator instances. That gives all the benefits of an uncontested generator without the cost of repeated allocations (albeit at the cost of ThreadLocal overhead), and so the performance penalty is avoidable. By contrast, the performance penalty in an allocation-per-call model would be unavoidable.

Again, I appreciate the suggestion! I recognize there may be more to do to document best practices in a multi-threaded environment; let's look into those instead of changing the allocation model.

BenTels commented 2 years ago

I think you're vastly overestimating the pressure garbage collection will place on the system, but okay.

jchambers commented 2 years ago

Well, maybe so. I ran a benchmark and it looks like the difference isn't statistically significant.

In the process, though, I remembered the other, much more important reason why things are designed the way they are: the Mac instance. Mac operations are not thread safe, and so we either need some way to enforce exclusive access (which we're doing now by synchronizing) or we need a new Mac for every invocation. Reallocating a buffer is, indeed, pretty cheap. Reallocating a Mac is more expensive, and that does result in a throughput drop in the 10-15% range in my benchmark runs. It also complicates the NoSuchAlgorithmException-handling story a bit (there are ways forward; they're just a little clumsier than we're doing now).

So, fair enough: I should have run the benchmarks before making an argument about the performance effects of a design change. That said, now that I'm armed with benchmark results (and a fresher memory of the situation), I stand by the decision to leave things as they are.

jchambers commented 2 years ago

Hey, @BenTels, I was thinking about this a bit more and think I found a good way forward. If the spirit moves you, I'd be curious what you think of #42.