vkuznet opened this issue 2 years ago
Dennis, we should not experience a DB lock if we use transactions properly. My suggestion is to investigate how SQLite should be initialized so that transactions are applied to tables rather than to the DB itself.
I think we need to triple-check that all statements and rows are closed in the bulkblocks API. It would also be useful to identify in which part of the bulkblocks API the lock occurred, i.e. does the lock appear at a specific step, and if so, which step it is.
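For the audit, the pattern to look for is roughly the following. This is a minimal sketch, not the actual bulkblocks code; the table and column names are placeholders, not the real DBS schema:

```go
package main

import "database/sql"

// getProcessedDatasetID sketches the "close everything" pattern:
// every *sql.Stmt and *sql.Rows opened inside a transaction must be
// closed, otherwise SQLite may keep the database locked.
func getProcessedDatasetID(tx *sql.Tx, name string) (int64, error) {
	stmt, err := tx.Prepare("SELECT id FROM processed_datasets WHERE name = ?")
	if err != nil {
		return 0, err
	}
	defer stmt.Close() // an unclosed statement can hold the lock

	rows, err := stmt.Query(name)
	if err != nil {
		return 0, err
	}
	defer rows.Close() // same for the result set

	var id int64
	if rows.Next() {
		if err := rows.Scan(&id); err != nil {
			return 0, err
		}
	}
	return id, rows.Err()
}
```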
Finally, according to this GitHub issue [1] we may try the following options:
db.SetMaxOpenConns(1)
and journal_mode=WAL
which seem to address the database lock issue.
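For concreteness, a minimal sketch of wiring both options in with mattn/go-sqlite3 (the DSN file name is illustrative; `_journal_mode` is the driver's connection-string parameter for the journal mode pragma):

```go
package main

import (
	"database/sql"
	"log"

	_ "github.com/mattn/go-sqlite3"
)

func main() {
	// Enable WAL journal mode via the mattn/go-sqlite3 DSN parameter;
	// the file name here is illustrative.
	db, err := sql.Open("sqlite3", "file:dbs.db?_journal_mode=WAL")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// A single open connection serializes writers, which avoids the
	// "database is locked" error at the cost of write concurrency.
	db.SetMaxOpenConns(1)
}
```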
@vkuznet I implemented the suggestions from the mattn/go-sqlite3 discussions and it seemed to resolve the database locking. I pushed my branch with the current work I have done trying to trigger the race conditions in the bulkblocks API. In my logs, I see this:
[2022-10-05 14:25:16.405983633 +0000 UTC m=+3.899250753] HTTP/1.1 400 POST /dbs-one-writer/bulkblocks [data: 26088 in 1852 out] [remoteAddr: 127.0.0.1:59288] [X-Forwarded-For: ] [X-Forwarded-Host: ] [auth: no-TLS cipher-none "no-auth-cert" no-auth-login no-auth-method] [ref: "-" "Go-http-client/1.1"] [req: 1.1505ms proxy-resp: 0]
[2022-10-05 14:25:16.508029675 +0000 UTC m=+4.001296794] DBSError Code:101 Description:DBS DB error Function:dbs.bulkblocks.InsertBulkBlocksConcurrently Message:Data already exist in DBS Error: Block /unittest_web_primary_ds_name_8268_stepchain/acq_era_8268-v8268/GEN-SIM-RAW#8268 already exists Stacktrace:
goroutine 617 [running]:
github.com/dmwm/dbs2go/dbs.Error({0xc20780?, 0x400055d520?}, 0x65, {0xb31e2c, 0x19}, {0xb424ee, 0x2b})
/workspaces/dbs2go_worktrees/dbs2go_race/dbs/errors.go:172 +0x84
github.com/dmwm/dbs2go/dbs.(*API).InsertBulkBlocksConcurrently(0x40001ee000)
/workspaces/dbs2go_worktrees/dbs2go_race/dbs/bulkblocks2.go:673 +0xccc
github.com/dmwm/dbs2go/web.DBSPostHandler({0xc24910, 0x400000e0d8}, 0x4000612300, {0xb251f6, 0xa})
/workspaces/dbs2go_worktrees/dbs2go_race/web/handlers.go:542 +0x10ac
github.com/dmwm/dbs2go/web.BulkBlocksHandler({0xc24910?, 0x400000e0d8?}, 0x0?)
/workspaces/dbs2go_worktrees/dbs2go_race/web/handlers.go:957 +0x54
net/http.HandlerFunc.ServeHTTP(0x10?, {0xc24910?, 0x400000e0d8?}, 0xab00fff340a04d4c?)
/usr/local/go/src/net/http/server.go:2109 +0x38
github.com/dmwm/dbs2go/web.limitMiddleware.func1({0xc24910?, 0x400000e0d8?}, 0x400055cf50?)
/workspaces/dbs2go_worktrees/dbs2go_race/web/middlewares.go:111 +0x40
net/http.HandlerFunc.ServeHTTP(0xa58d00?, {0xc24910?, 0x400000e0d8
This states that the block already exists in the database. Is this the type of race condition we are trying to resolve?
I'm glad that we resolved the database lock issue and can now move forward. The error you got is not a race condition: it clearly states that you tried to insert a block which is already in the DB.
The race condition should happen as follows:
insertDatasetConfigurations
from https://github.com/dmwm/dbs2go/blob/2aad8db22a99f7eb7de7f4c91cfb9420bc809521/dbs/bulkblocks2.go#L38

The question is how to simulate it. I think you need to create multiple JSONs with different blocks but the same common data, like the dataset configuration, and then inject them concurrently. The more HTTP requests you send concurrently, the more likely you are to hit the race condition.
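A minimal driver for such a simulation might look like this. Sketch only: the endpoint path matches the one in the log above, but the port and the trimmed payload are assumptions; real bulkblocks JSON has many more fields:

```go
package main

import (
	"bytes"
	"fmt"
	"log"
	"net/http"
	"sync"
)

func main() {
	// The "common" data (dataset configuration etc.) stays identical
	// across requests; only the block name varies. This is a trimmed,
	// illustrative payload, not the full bulkblocks schema.
	const payload = `{"dataset_conf_list":[{"app_name":"cmsRun","release_version":"CMSSW_1_2_3"}],"block":{"block_name":"%s"}}`

	var wg sync.WaitGroup
	for i := 0; i < 10; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			body := fmt.Sprintf(payload,
				fmt.Sprintf("/prim_ds/acq_era-v1/GEN-SIM-RAW#%d", i))
			resp, err := http.Post(
				"http://localhost:8111/dbs-one-writer/bulkblocks",
				"application/json", bytes.NewBufferString(body))
			if err != nil {
				log.Println(err)
				return
			}
			resp.Body.Close()
			log.Println("status:", resp.Status)
		}(i)
	}
	wg.Wait()
}
```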
We should come up with integration tests which will allow us to test race conditions in the bulkblocks API. They happen when there are competing (concurrent) calls to the bulkblocks API which provide almost identical data (the data differ only at the block/file level while all other parameters remain the same). In this scenario there are common data, such as the physics group, dataset access type, and processed dataset, where we either need to insert or obtain IDs; see https://github.com/dmwm/dbs2go/blob/master/dbs/bulkblocks2.go#L506-L571

I identified that for small tables, like physics groups or data tiers, the probability of a race condition is rather small, while for larger ones, like processed datasets, which contains 149970 entries in the ORACLE DB, there is a possibility of a race condition when two competing HTTP requests try to insert/check the processed dataset ID.
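For reference, the race lives in the check-then-insert pattern. A generic sketch of why two concurrent requests can collide (not the actual dbs2go code; table and column names are placeholders):

```go
package main

import "database/sql"

// insertOrGetID sketches the check-then-insert pattern that races:
// two concurrent transactions can both see "no row" for the same
// processed dataset name, both attempt the INSERT, and one of them
// fails with a uniqueness violation.
func insertOrGetID(tx *sql.Tx, name string) (int64, error) {
	var id int64
	err := tx.QueryRow("SELECT id FROM processed_datasets WHERE name = ?", name).Scan(&id)
	if err == nil {
		return id, nil // someone already inserted it
	}
	if err != sql.ErrNoRows {
		return 0, err
	}
	// Race window: a competing request may insert the same name here,
	// between our SELECT and our INSERT.
	res, err := tx.Exec("INSERT INTO processed_datasets (name) VALUES (?)", name)
	if err != nil {
		return 0, err
	}
	return res.LastInsertId()
}
```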
We need to come up with an integration test for this use case.