roadrunner-server / roadrunner

🤯 High-performance PHP application server, process manager written in Go and powered with plugins
https://docs.roadrunner.dev
MIT License

[🐛 BUG]: BoltDB driver is broken after RR stops #1969

Open Kaspiman opened 1 month ago

Kaspiman commented 1 month ago

No duplicates 🥲.

What happened?

Stopping RR while a task is being processed can break the BoltDB driver. Tasks are lost and the ACTIVE column shows negative values.

Version (rr --version)

rr version 2024.1.2 (build time: 2024-05-16T19:48:53+0000, go1.22.3), OS: linux, arch: amd64

How to reproduce the issue?

See the README.md file in the demo repository:

https://github.com/Kaspiman/roadrunner-problem-demo/tree/boltdb-problem

Relevant log output

/var/www/html # rr workers -c rr.yaml
Workers of [jobs]:
+---------+-----------+---------+---------+---------+--------------------+
|   PID   |  STATUS   |  EXECS  | MEMORY  |  CPU%   |      CREATED       |
+---------+-----------+---------+---------+---------+--------------------+
|    1226 | ready     |       0 | 16 MB   |    0.15 | 11 seconds ago     |
|    1227 | ready     |       1 | 17 MB   |    0.15 | 11 seconds ago     |
|    1228 | ready     |       1 | 16 MB   |    0.15 | 11 seconds ago     |
|    1229 | ready     |       1 | 17 MB   |    0.08 | 11 seconds ago     |
+---------+-----------+---------+---------+---------+--------------------+
Jobs of [jobs]:
+--------+----------+--------+-------+--------+---------+----------+
| STATUS | PIPELINE | DRIVER | QUEUE | ACTIVE | DELAYED | RESERVED |
+--------+----------+--------+-------+--------+---------+----------+
| READY  | boltdb   | boltdb | push  | -1     | 0       | 0        |
+--------+----------+--------+-------+--------+---------+----------+

rustatian commented 1 month ago

Hey @Kaspiman 👋 Keep in mind that boltdb is a local development driver. It should not be used outside a local dev environment and does not guarantee message delivery, because it is based on KV storage.

Kaspiman commented 1 month ago

Eh, I thought this was a reliable driver and a reliable database. It would be cool to have such a local queue server.

For example, one can store messages in BoltDB instead of sending them directly to RabbitMQ, then use a simple script to transfer the messages from BoltDB to RabbitMQ. In case of network problems or an incident with RabbitMQ, tasks on a separate server will not be lost.

Anyway, the driver breaks easily and cannot be fixed by restarting the server. You need to delete the db file.

rustatian commented 1 month ago

> Eh, I thought this was a reliable driver and a reliable database. It would be cool to have such a local queue server.

You may use the in-memory driver.

> For example, one can store messages in BoltDB instead of sending them directly to RabbitMQ.

Adding another BoltDB step does not solve an actual problem here, only a hypothetical case off the top of your head. It will surely increase complexity and add another concern: synchronization. And it does not solve the problem of an incident with RabbitMQ.

BoltDB is a good NoSQL store, but it is not a queue. Generally, unless you are deliberately trying to break it, it works well enough for local development, which is what it was designed for.