atomikos / transactions-essentials

Development repository for next major release of
https://www.atomikos.com/Main/TransactionsEssentials

TransactionServiceImp optimisations #200

Closed nikhildp closed 6 months ago

nikhildp commented 8 months ago

We are using Atomikos in our product. Below are some optimisations based on observations from production systems. I have converted tidToTransactionMap to a ConcurrentHashMap, as it handles concurrent operations more efficiently. Also, thread-safe objects don't need external locks for individual operations, so I removed the locks around the tidToTransactionMap put/read operations.
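For readers following along, the shape of the change can be sketched roughly as below. This is a minimal illustration, not the actual Atomikos code: the `Tx`, `SynchronizedRegistry`, and `ConcurrentRegistry` classes and their method names are hypothetical, and only the `tidToTransactionMap` field name is taken from the description above.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a transaction object; not an Atomikos type.
final class Tx {
    final String tid;

    Tx(String tid) {
        this.tid = tid;
    }
}

// Before (assumed shape): a plain HashMap guarded by one monitor,
// so every put and every get has to acquire the same lock.
final class SynchronizedRegistry {
    private final Map<String, Tx> tidToTransactionMap = new HashMap<>();

    void register(Tx tx) {
        synchronized (tidToTransactionMap) {
            tidToTransactionMap.put(tx.tid, tx);
        }
    }

    Tx lookup(String tid) {
        synchronized (tidToTransactionMap) {
            return tidToTransactionMap.get(tid);
        }
    }
}

// After (assumed shape): ConcurrentHashMap makes individual put/get calls
// thread-safe, so the external synchronized blocks can be dropped.
final class ConcurrentRegistry {
    private final Map<String, Tx> tidToTransactionMap = new ConcurrentHashMap<>();

    void register(Tx tx) {
        tidToTransactionMap.put(tx.tid, tx);
    }

    Tx lookup(String tid) {
        return tidToTransactionMap.get(tid);
    }

    // Check-then-act sequences are not atomic when written as two calls;
    // putIfAbsent performs the check and the insert as one atomic step.
    Tx registerIfAbsent(Tx tx) {
        Tx existing = tidToTransactionMap.putIfAbsent(tx.tid, tx);
        return existing != null ? existing : tx;
    }
}
```

One caveat with this approach: only single map operations become safe without a lock. Any code that reads the map and then writes based on what it read would still need an atomic method such as `putIfAbsent` or `computeIfAbsent`.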

nikhildp commented 8 months ago

fixes: https://github.com/atomikos/transactions-essentials/issues/198

nikhildp commented 8 months ago

To elaborate on the issue: under high concurrency, threads keep waiting to register, because 1) there is a synchronised lock on the HashMap variable tidToTransactionMap_ and 2) the entire HashMap gets locked when there is a write operation. I have seen this in a production application when we analysed thread dumps incrementally. With a ConcurrentHashMap and without the synchronised block, this waiting should go away. We are in the process of benchmarking these changes (with my local patch). I will have some numbers within a week or two to confirm or disprove the theory.

nikhildp commented 8 months ago

Here is my performance benchmark comparing ConcurrentHashMap with synchronising on a Map. It doesn't come as a surprise, but ConcurrentHashMap gives better performance across different configurations. Additionally, the synchronised Map degrades as concurrency increases, because many threads are fighting for the lock simultaneously. Source code: https://github.com/nikhildp/MapBenchMark/tree/main

image

We are still in the process of end-to-end performance benchmarking.
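As a rough illustration of the kind of comparison described above, the sketch below times concurrent writers against a lock-per-operation map and a ConcurrentHashMap. It is not the code from the linked MapBenchMark repository; the class name `MapContentionSketch` and its methods are hypothetical, and `Collections.synchronizedMap` is used here as a stand-in for the synchronized block on a HashMap.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class MapContentionSketch {

    // Stand-in for "synchronized on a HashMap": every call takes one shared lock.
    static Map<String, Integer> synchronizedMap() {
        return Collections.synchronizedMap(new HashMap<>());
    }

    static Map<String, Integer> concurrentMap() {
        return new ConcurrentHashMap<>();
    }

    // Starts `threads` writers at the same moment, each inserting `perThread`
    // distinct keys, and returns the elapsed wall-clock time in nanoseconds.
    static long timePuts(Map<String, Integer> map, int threads, int perThread) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        CountDownLatch start = new CountDownLatch(1);
        CountDownLatch done = new CountDownLatch(threads);
        for (int t = 0; t < threads; t++) {
            final int id = t;
            pool.execute(() -> {
                try {
                    start.await(); // hold all writers until the clock starts
                    for (int i = 0; i < perThread; i++) {
                        map.put(id + "-" + i, i);
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    done.countDown();
                }
            });
        }
        long begin = System.nanoTime();
        start.countDown();
        try {
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for writers", e);
        } finally {
            pool.shutdown();
        }
        return System.nanoTime() - begin;
    }
}
```

Calling `timePuts(synchronizedMap(), 16, 100_000)` and `timePuts(concurrentMap(), 16, 100_000)` with varying thread counts gives a first impression of how each map scales. A single timed run like this ignores JIT warm-up and GC noise, so a harness such as JMH would be a better basis for real numbers.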

nikhildp commented 7 months ago

image

So we have profiled this specific function in production. Even with synchronisation, execution time is 0 ms on average; however, some threads consistently waited for around 200-500 ms (which is evident from the max value). So for the 95th percentile and above, this probably won't be a good experience. Though minor, this change should give some improvement.

set_tx_to_tid.csv
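To make the average-versus-tail point concrete, a small helper like the one below could be used to summarise per-call timings. It is only a sketch: the `LatencySummary` class is hypothetical, it assumes the timings are available as an array of millisecond values, and it does not read or reproduce the attached CSV.

```java
import java.util.Arrays;

final class LatencySummary {

    // Nearest-rank percentile: the smallest sample such that at least
    // p percent of all samples are less than or equal to it (0 < p <= 100).
    static long percentile(long[] samplesMs, double p) {
        if (samplesMs.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        if (p <= 0 || p > 100) {
            throw new IllegalArgumentException("p must be in (0, 100]");
        }
        long[] sorted = samplesMs.clone(); // leave the caller's array untouched
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p * sorted.length / 100.0);
        return sorted[Math.max(rank, 1) - 1];
    }

    // Arithmetic mean; a handful of slow calls barely moves this value
    // when the vast majority of calls complete in about 0 ms.
    static double mean(long[] samplesMs) {
        return Arrays.stream(samplesMs).average().orElse(0.0);
    }
}
```

Comparing `mean(samples)` with `percentile(samples, 95)`, `percentile(samples, 99)`, and `percentile(samples, 100)` (the max) shows whether the slow calls are rare outliers or a regular part of the tail.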

GuyPardon commented 5 months ago

Thanks! We are looking into how to merge this in, so in preparation of that I have prepared the following documentation:

https://www.atomikos.com/Documentation/ConcurrencyModel

Any feedback welcome :-)

nikhildp commented 5 months ago

@GuyPardon : this looks good to me!