dragonflydb / dragonfly

A modern replacement for Redis and Memcached
https://www.dragonflydb.io/

Cannot set unix permissions --unixsocket #1415

Closed dgaastra closed 1 year ago

dgaastra commented 1 year ago

Describe the bug: the socket created with --unixsocket needs to be accessible to users other than "dfly:dfly".

To Reproduce: start the server with --unixsocket /var/run/dragonfly/dragonfly-server.sock

Expected behavior

/run/dragonfly$ ls -la ../redis/
total 4
drwxr-sr-x  2 redis redis  80 Jun 15 12:58 .
drwxr-xr-x 26 root  root  740 Jun 15 13:59 ..
-rw-rw----  1 redis redis   4 Jun 15 12:58 redis-server.pid
srwxrw-rw-  1 redis redis   0 Jun 15 12:58 redis-server.sock

Screenshots

/run/dragonfly$ ls -la
total 4
drwxr-sr-x  2 dfly dfly  80 Jun 15 14:11 .
drwxr-xr-x 26 root root 740 Jun 15 14:11 ..
-rw-r-----  1 dfly dfly   4 Jun 15 14:11 dragonfly.pid
srwxrwx---  1 dfly dfly   0 Jun 15 14:11 dragonfly-server.sock

Environment: Debian 11.7

No documentation found on "--unixsocket"

Thanks for any help

admtech commented 1 year ago

Add the web server user to the dfly group (or vice versa); then it can read the socket too.

romange commented 1 year ago

@admtech what's so special about the web server user? It seems the unix socket must have "rw" permission for others.

admtech commented 1 year ago

It was not specifically about the web server user. You can simply add whichever server or service accesses the socket (web server, PHP-FPM, Node, a script, etc.) to the appropriate group; then you don't need to set the "rw" permissions for others (which currently does not work anyway). Of course it would be nicer to have a config option for this :-) A sketch of that workaround is shown below.
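A minimal sketch of the group-based workaround, assuming the connecting service runs as the hypothetical user "www-data" (e.g. nginx or PHP-FPM); substitute the real service user and service name for your stack:

# add the service user to the dfly group so it can use the group-writable socket
sudo usermod -aG dfly www-data

# the service only picks up the new group membership after it is restarted
sudo systemctl restart php-fpm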

dgaastra commented 1 year ago

@romange Yes; it would be great if "others" could have "rw", as can be set in Redis with "unixsocketperm 766". On our servers, multiple websites have their own user IDs and need segregated access to Redis, or to DragonFlyDB if we can get it to work :-) @admtech The segregated users all need access, hence 'other'. Auth is completed via an ACL.
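For reference, the Redis setup being compared against is configured in redis.conf roughly like this (the socket path is illustrative); at the time of this issue Dragonfly exposed no equivalent permission setting for --unixsocket:

unixsocket /var/run/redis/redis-server.sock
unixsocketperm 766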

romange commented 1 year ago

Can you tell me what kind of workload you guys run? Why run Dragonfly locally?

dgaastra commented 1 year ago

We are WP TOP Hosting performance freaks and help clients get 100% on everything in Google PageSpeed, even when they use many slow plugins. A lot of people in Germany are very stubborn, you know... Our VMs are huge and typically have everything all-in-one for easy destruction, migration, and re-building.

dgaastra commented 1 year ago

So we require a clean way to have segregated access to Redis, or to DragonFlyDB if we can get it to work :-)

dgaastra commented 1 year ago

The idea that DragonFly could be up to 25x faster could be very appealing to our WooCommerce clients.

admtech commented 1 year ago

We are a big German forum for IT pros and developers and use Dragonfly as our cache system. We don't need network connections to the Dragonfly servers; unix sockets are more efficient. We love the pure speed of Dragonfly, even when loading or saving the existing cache. This was a main reason for switching from Redis to Dragonfly. Now we are building a new realtime chat system with Dragonfly, but without a unix socket ;-)

romange commented 1 year ago

@admtech Great to hear!

  1. would you want to talk to us and explain your use-cases/APIs for your realtime chat system?
  2. have you observed any problems or inconveniences of running Dragonfly so far? (besides UDS permissions)
  3. Where do you keep your cache? Specifically do you ever upload it to cloud (S3) storage or just keep it on disk?

romange commented 1 year ago

@dgaastra are you and @admtech from the same company? :) btw, the UDS fix will be ready in the next version.

dgaastra commented 1 year ago

Hey Romange, no, we are an independent German WordPress tuning and hosting shop, where we claim to offer the fastest WordPress websites in Germany. We do this with very complex hosting stacks and rather fast servers next to Google's. For now, we have resolved the multi-user DragonFlyDB question with systemd '@' template unit files, where each user space gets its own DragonFlyDB server (a sketch is below). However, this added quite a toll to our stack, and, ideally, we would like DragonFlyDB to operate like MariaDB, where every user has their own database to access. Redis' ACL model with prepended key prefixes is also less than desirable for us.
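For context, such a per-user setup with '@' template units could look roughly like the hypothetical sketch below; the binary path, the --dir flag, and the directory layout are assumptions, not the poster's actual unit file:

# /etc/systemd/system/dragonfly@.service (hypothetical template)
[Unit]
Description=Dragonfly instance for site %i
After=network.target

[Service]
User=%i
Group=%i
RuntimeDirectory=dragonfly-%i
StateDirectory=dragonfly-%i
ExecStart=/usr/local/bin/dragonfly --unixsocket /run/dragonfly-%i/dragonfly.sock --dir /var/lib/dragonfly-%i
Restart=on-failure

[Install]
WantedBy=multi-user.target

Each site user then gets its own instance, e.g. systemctl enable --now dragonfly@site1.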

romange commented 1 year ago

You need namespaces to allow multi-tenancy. Can we set up a meeting sometime around the beginning of July to understand how you would expect that to work, including the security and privacy implications?

dgaastra commented 1 year ago

Sure; which timezone are you in? I am not in Vancouver/Cupertino anymore; now I am in Munich.

romange commented 1 year ago

I am in Munich timezone as well 😄

dgaastra commented 1 year ago

I am in Grünwald :-)

romange commented 1 year ago

what's your email? I will send a meeting request for July 4th.

dgaastra commented 1 year ago

https://www.bio-logical-it.com/en/about-us/contact/

Sure, that would be great.

admtech commented 1 year ago

@admtech Great to hear!

  1. would you want to talk to us and explain your use-cases/APIs for your realtime chat system?

We use https://openswoole.com/ with WebSockets for the PHP backend and the Predis library for accessing the Dragonfly database. We have been developing in PHP for 30 years and love this language. The chat web client was developed in JavaScript. The whole system runs in a cloud in a German data centre. The API between the WebSocket server and the JavaScript client uses JSON.
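As an illustration of that setup (not the forum's actual code), the Predis side of such a backend, talking to Dragonfly over a unix socket, could look roughly like this; the socket path and key names are invented:

<?php
require __DIR__ . '/vendor/autoload.php';

// connect over the unix socket instead of TCP
$client = new Predis\Client([
    'scheme' => 'unix',
    'path'   => '/run/dragonfly/dragonfly-server.sock',
]);

// store a chat message as JSON and fan it out to the WebSocket workers
$message = json_encode(['user' => 42, 'room' => 1, 'text' => 'hello', 'ts' => time()]);
$client->rpush('chat:room:1:history', [$message]);
$client->publish('chat:room:1:events', $message);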

  2. have you observed any problems or inconveniences of running Dragonfly so far? (besides UDS permissions)

No; we have been developing with Redis for years, and since DragonflyDB is compatible with JSON, Hash and Sorted Sets, it was very easy. Originally the first prototype was developed with Redis, but during tests with several thousand simultaneous chat messages we had massive problems, up to the point of Redis crashing. We couldn't find the exact cause, but it looked like the additional Redis modules caused the crash; the modules seem to make Redis unstable. In addition, there is the very slow loading and saving when Redis is updated or restarted with 4 million records.

The second prototype of the chat system was then realised with Dragonfly and there were no problems with the same code. Previous tests with up to 2500 messages per second ran without problems.

Dragonfly DB seems to be much more efficient at memory usage and saving. As a test, out of fun and curiosity, we imported all our records into Dragonfly. The approx. 4 million records take approx. 4-5 GB under Redis (cache, content, users, comments, etc.) but only 2.5 GB under Dragonfly, and they are loaded or saved in less than 10-15 seconds. So we switched. For the time being, our site uses the Dragonfly DB only for the cache part; it is planned to move our detail pages over as well.

  3. Where do you keep your cache? Specifically do you ever upload it to cloud (S3) storage or just keep it on disk?

We have a website and cache all detail and profile pages. Our system is already running on a cloud stack in a data centre.

We save the cache to a file (in Dragonfly DB format) every hour so that it is not lost in the event of a reboot or update. In general, the cache has TTL settings so that it is self-regulating. A complete loss would not be a problem; it would rebuild itself.
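One hedged way to do such an hourly save, assuming redis-cli is installed and reusing the socket path from this thread (newer Dragonfly versions may also offer server-side snapshot scheduling flags, which would make the cron entry unnecessary):

# crontab entry: trigger a background snapshot at the top of every hour
0 * * * * redis-cli -s /run/dragonfly/dragonfly-server.sock BGSAVE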

So far we have not used replication because it was not important enough for a cache system. However, if we also use the Dragonfly DB for content, we will enable replication.

In our case, the cache works like this: when an article is loaded for the first time, its structure is loaded from an ArangoDB as a JSON array and stored as a hash. Until now there was always a conversion from array to hash. This is no longer necessary because we can store the array one-to-one as a JSON array in the Dragonfly DB, which makes handling the cache easier for us, as we can simply transfer the array to Dragonfly and back.

Since ArangoDB isn't known for its atomic counters (and doesn't really run well in a cloud), almost all counters are managed, counted and stored by Redis, or now by the Dragonfly DB, and are then synchronised with the ArangoDB every few hours.

The goal is to get rid of the ArangoDB at some point. At the moment it provides the search index and geo-search (for our job board). Only when the Dragonfly DB also supports a search index can we switch completely (but we have no time pressure, as the system has been running well for years). I have already seen progress on implementing FT.CREATE and FT.SEARCH in Dragonfly DB (https://github.com/dragonflydb/dragonfly/discussions/1457).
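A small sketch of that pattern with Predis, assuming the deployed Dragonfly version supports the JSON commands; the key names, TTL, and document shape are invented for illustration:

<?php
require __DIR__ . '/vendor/autoload.php';

$client = new Predis\Client(['scheme' => 'unix', 'path' => '/run/dragonfly/dragonfly-server.sock']);

// cache the article structure one-to-one as a JSON document (no array-to-hash conversion)
$article = ['id' => 123, 'title' => 'Example', 'tags' => ['php', 'cache']];
$client->executeRaw(['JSON.SET', 'article:123', '$', json_encode($article)]);
$client->expire('article:123', 3600);   // self-regulating cache via TTL

// atomic counters live in Dragonfly and are synced back to ArangoDB later
$client->incr('article:123:views');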

romange commented 1 year ago

@admtech if what you cache is not PII and if you are open to it, we could use your use case to establish a benchmarking suite for us. For that to work, we would need a snapshot of your data plus a monitor log of your traffic for a duration of several minutes, e.g.: redis-cli MONITOR > df_cmds.log

admtech commented 1 year ago

if what you cache is not PII

Unfortunately, this includes PII (IP addresses, user IDs, online user info, etc.), so unfortunately it doesn't work at the moment. But I am thinking about it.