Getting a "fatal error: concurrent map read and map write" but queue is supposed to be thread safe

jwkosten commented 2 years ago

I am getting a concurrency error while using this queue, even though it is described as thread safe.

Logs from my app are attached for context:

2022-05-19 12:51:05.422768 I | Do delete index db file by gc. no= 5
2022-05-19 12:51:05.424399 I | Do delete index db file by gc. no= 2
2022-05-19 12:51:06.422449 I | Do delete index db file by gc. no= 6
2022-05-19 12:51:08.422487 I | Do delete index db file by gc. no= 7
2022-05-19 12:51:08.424040 I | Do delete index db file by gc. no= 3
2022-05-19 12:51:10.422745 I | Do delete index db file by gc. no= 8
2022-05-19 12:51:11.422443 I | Do delete index db file by gc. no= 9
2022-05-19 12:51:11.424039 I | Do delete index db file by gc. no= 4
2022-05-19 12:51:13.422479 I | Do delete index db file by gc. no= 10
2022-05-19 12:51:14.422876 I | Do delete index db file by gc. no= 11
2022-05-19 12:51:15.422584 I | Do delete index db file by gc. no= 5
fatal error: concurrent map read and map write

 goroutine 146 [running]:
 runtime.throw(0x1bd01c7, 0x21)
    /usr/local/go/src/runtime/panic.go:1117 +0x72 fp=0xc50d167be8 sp=0xc50d167bb8 pc=0x455912
 runtime.mapaccess1_fast64(0x1902700, 0xc00a8a8600, 0x6, 0xc40442f800)
    /usr/local/go/src/runtime/map_fast64.go:21 +0x198 fp=0xc50d167c10 sp=0xc50d167be8 pc=0x42fd18
 github.com/jhunters/bigqueue.(*DBFactory).acquireDB(0xc00a87d3b0, 0x6, 0x0, 0x0, 0x0)
    /go/pkg/mod/github.com/jhunters/bigqueue@v1.2.2/mmapfactory.go:31 +0x86 fp=0xc50d167ca0 sp=0xc50d167c10 pc=0x13151a6
 github.com/jhunters/bigqueue.(*FileQueue).peek(0xc00a8ce000, 0x19c036, 0x0, 0x0, 0xc000048040, 0xc4044410e0, 0xc4044426f0)
    /go/pkg/mod/github.com/jhunters/bigqueue@v1.2.2/filequeue.go:522 +0x16b fp=0xc50d167d18 sp=0xc50d167ca0 pc=0x1312e8b
 github.com/jhunters/bigqueue.(*FileQueue).Dequeue(0xc00a8ce000, 0xc00a3c6000, 0x1e1c988, 0xc000048040, 0xc40442f7c0, 0xf, 0xc40441dc20)
    /go/pkg/mod/github.com/jhunters/bigqueue@v1.2.2/filequeue.go:408 +0x97 fp=0xc50d167d68 sp=0xc50d167d18 pc=0x1312c97

chandler-lr commented 2 years ago

I have also seen this issue. It appears to occur when AutoGC is set to 1 second (though I imagine it happens at any interval) and you are rapidly dequeuing records. Both Dequeue() and the AutoGC timer call into the acquireDB method, so the two can touch the same map at the same time. A viable workaround is to not use AutoGC and instead call Gc() once you are done dequeuing records.

golikov commented 2 years ago

That error is not limited to garbage collection (although it can occur because of it as well). It can also be reliably reproduced just by running enqueue and dequeue loops in separate goroutines:

// Producer: enqueue in a tight loop.
go func() {
    for {
        queue.Enqueue([]byte("queue record"))
    }
}()

// Consumer: dequeue concurrently with the producer.
go func() {
    for {
        queue.Dequeue()
    }
}()

There is a per-row lock when updating that map index, but there is no lock on the read here: https://github.com/jhunters/bigqueue/blob/v1.2.2/mmapfactory.go#L31 I think a race is possible at this point: a read of the map can run while another goroutine is writing to it, which is exactly what the runtime reports.

arkbriar commented 10 months ago

The issue persists because the locks are not properly acquired. I have a fixed version that removes the lockMap entirely. I don't think the lost per-index locking matters, since opening a DB is a low-frequency operation and its cost is amortized over the many operations that follow.

Commit: https://github.com/jhunters/bigqueue/commit/5491f915fec6d4e9f244647d5fb776d8759f6237

jhunters/bigqueue, issue #22