ocaml-multicore / saturn

Lock-free data structures for multicore OCaml
ISC License
187 stars 28 forks source link

Lock-free bag #29

Open bartoszmodelski opened 1 year ago

bartoszmodelski commented 1 year ago

Overview

It would be useful to develop another data structure: lock-free bag. Below describes what it is, notes potential uses and links a paper with design.

Bag

Bag (or multiset) is a data structure, which stores a collection of values. Elements are inserted or removed one at a time. In contrast with queue or stack, bag has no ordering. That is remove can return the youngest, oldest or any other item currently in the bag. In principle, lack of ordering eliminates contention points and should lead to better throughput and scalability.

Uses

Design

A lock-free algorithm for concurrent bags proposes a compelling approach. See figure 4 for performance comparison with some classical structures, e.g. Michael Scott queue.

Johan511 commented 1 year ago

Hi, I am interested in working on this issue and have been trying to implement the threadBlocks as described in the paper. One issue I have faced was with the Backoff module, any advice on how I can import it into my Bag.ml file.

Also I would like some advice on how to allot a threadBlock to each thread.

polytypic commented 1 year ago

I totally agree that a scalable lock-free bag would be a useful data structure to have.

Also I would like some advice on how to allot a threadBlock to each thread.

You probably want to use DLS or Domain Local Storage.

Better throughput for Domainslib than with stack or queue (for workloads without significant locality between tasks).

I'm curious to see the result. For performing tasks, however, I realised many years ago that stack like or LIFO ordering tends to be generally preferable. The reason isn't contention. The reason is that LIFO ordering results in a Depth-First Search like behaviour, while FIFO results in a Breadth-First Search like behaviour. The crucial difference is in memory and general resource usage. Deviations from the DFS behaviour tend to increase resource usage.

In a kind of build tool that performed large numbers of largely independent tasks in parallel I initially used a queue. However, that resulted in the program running out of memory. Memory usage grew proportional to the number of tasks that could proceed if given CPU time. I changed the program to use a stack. Memory usage then stayed roughly (constant or) proportional to the number of CPU cores.