EmerisHQ / demeris-backend

Monorepo containing all the Demeris backend code and infrastructure definitions.
GNU Affero General Public License v3.0
8 stars 1 forks source link

Don't use CockroachDB transactions in Tracelistener #783

Open Pitasi opened 2 years ago

Pitasi commented 2 years ago

My assumption from what I can tell from the tracelistener code is this:

And we do that by calling this emeris-utils method: https://github.com/EmerisHQ/emeris-utils/blob/main/database/database.go#L58.

Transactions are way more expensive than simple INSERT so we probably should avoid them if they are not really necessary.

Full docs is here: https://www.cockroachlabs.com/docs/stable/transactions.html.


* there is a limit on how many params a single query can have. That's why we have this "split" logic in place: https://github.com/EmerisHQ/tracelistener/blob/2a8bed11a512a52039bc6bf1a157672fdcedb03a/tracelistener/tracelistener.go#L140-L142

This takes a single "WritebackOp" (= an INSERT or DELETE query) and splits it into multiple chunked queries.

I can't tell how often this happen. It is possible that some blocks have so many data in it that requires multiple queries but even in these cases I'm not sure a transaction is truly needed.

If it is, we can have an if that runs a transaction only for these multiple-queries cases, while having the happy path without transaction.

DoD

tbruyelle commented 2 years ago

I modified the test TestWritebackOp_ChunkingWorks which inserts a lot of rows to compare with or without transactions.

I updated the number of inserts to 500.000, and added a small timer around each insert loop (one loop with tx, and one loop without). On my machine the result are clear:

TX  4.457864967s
NO TX  2.44812082s

The only benefit of having this transaction as I understood is only the retry on error, but I have no idea if this affects TL or not.

Pitasi commented 2 years ago

The only benefit of having this transaction as I understood is only the retry on error

The "retry on errors" of the cockroachdb-go library only concerns transactions, no transactions = no errors = no need to retry 😁

tbruyelle commented 2 years ago

Ok then let's go :)

tbruyelle commented 2 years ago

If it is, we can have an if that runs a transaction only for these multiple-queries cases, while having the happy path without transaction.

Even in that case, I don't believe transactions are required because multi-queries data are totally unrelated.