dragonflydb / dragonfly

A modern replacement for Redis and Memcached
https://www.dragonflydb.io/

Failed to open the aclfile: Read-only file system #1855

Closed Mirabis closed 1 year ago

Mirabis commented 1 year ago

Describe the bug
Unable to use the aclfile option. Cannot save to or load from the file.

To Reproduce
Steps to reproduce the behavior:

  1. Start dragonfly with --aclfile=/etc/dragonfly/users.acl
  2. Connect to redis CLI with default and nopass
  3. ACL SETUSER username on >password +@all
  4. ACL LIST (it shows)
  5. ACL SAVE

Expected behavior
ACL SAVE should result in an updated users.acl file.

Screenshots

==> /var/log/dragonfly/dragonfly.INFO <==
I20230913 18:11:34.589900  2822 dfly_main.cc:726] Starting dragonfly df-v1.10.0-1da29a57f8f9fd21d566cb7eda40a26a13382f77
I20230913 18:11:34.590410  2822 dfly_main.cc:789] Max memory limit is: 11.50GiB
I20230913 18:11:34.596390  2824 uring_proactor.cc:157] IORing with 1024 entries, allocated 102720 bytes, cq_entries is 2048
I20230913 18:11:34.600957  2822 proactor_pool.cc:147] Running 8 io threads
W20230913 18:11:34.695436  2822 acl_family.cc:355] Error materializing acl file
I20230913 18:11:34.696393  2822 snapshot_storage.cc:96] Data directory is "/var/lib/dragonfly"
I20230913 18:11:34.696931  2822 server_family.cc:511] Loading /var/lib/dragonfly/dragonfly_dump-summary.dfs
W20230913 18:11:34.698768  2827 server_family.cc:339] Invalid cron expression: stoul
I20230913 18:11:34.711759  2826 server_family.cc:560] Load finished, num keys read: 0
I20230913 18:11:34.723317  2828 listener_interface.cc:97] sock[21] AcceptServer - listening on port 6379
E20230913 18:12:04.650123  2828 acl_family.cc:306] Failed to open the aclfile: Read-only file system

Environment (please complete the following information):

Reproducible Code Snippet

ACL SETUSER buguser on >Password +@all
ACL LIST
ACL SAVE

Additional context
I have symlinked users.acl to the Redis ACL file, and Redis was able to use it.

kostasrim commented 1 year ago

Hi @Mirabis, thank you for the bug report. It appears that this is a disk/file-system error ("Read-only file system"), not a permission error. I have the following questions for you:

  1. Have you run both redis and dragonfly on the same container with the same --acl argument? If so in what order? Did you first run DF and then Redis or the other way around?
  2. What's your disk usage? Can you verify that it is not full?
  3. Have you tried restarting the machine? (From a quick look on Google, I have seen a couple of Read-only file system errors with Proxmox LXC.)
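
To make the distinction concrete: "Read-only file system" is the message for errno EROFS, whereas an ordinary permission failure is EACCES ("Permission denied"), and a full disk is ENOSPC. A minimal Python sketch of that mapping follows. The helper names are hypothetical, and this is not how Dragonfly itself reports errors.

```python
import errno
import os


def describe_open_error(err: int) -> str:
    """Classify an errno from a failed open() call (illustrative helper)."""
    if err == errno.EROFS:
        # The target is on a mount that is read-only from this process's view.
        return "read-only file system"
    if err in (errno.EACCES, errno.EPERM):
        # Ownership or mode bits deny the requested access.
        return "permission problem"
    if err == errno.ENOSPC:
        # No space left on the device.
        return "disk full"
    # Fall back to the platform's own message for anything else.
    return os.strerror(err)


def try_open_for_write(path: str) -> str:
    """Open path for appending (creating it if missing) and report the outcome."""
    try:
        with open(path, "a"):
            return "ok"
    except OSError as exc:
        return describe_open_error(exc.errno)
```

Calling try_open_for_write on the ACL file as the same user and in the same environment as the server would show which of these cases applies.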

chakaz commented 1 year ago

Hi there @Mirabis, friendly ping for the questions in https://github.com/dragonflydb/dragonfly/issues/1855#issuecomment-1718924740

Mirabis commented 1 year ago

  1. Both Redis and Dragonfly are being run on the same LXC (Debian based), using the same ACLs. The order doesn't matter.
  2. 416 MB out of 16 GB is used in the LXC. Overall disk usage on the host is 40 GB out of 2 TB.
  3. I have tried restarting both the LXC and host.

I'm guessing Dragonfly has an issue with the "unprivileged" container mode, but I currently lack the time to do full debugging.

I can start Redis with an empty ACL file or just the default user, run ACL SETUSER twice, then ACL LIST (it shows the users), then ACL SAVE: works fine.
I can start Dragonfly with an empty ACL file or just the default user, run ACL SETUSER twice, then ACL LIST (it shows the users), then ACL SAVE: permission error.

For now I'm running Redis as my use-case has very limited performance requirements.

kostasrim commented 1 year ago

@Mirabis I will try to reproduce and ping.

kostasrim commented 1 year ago

Hi @Mirabis, this was rather hard to reproduce because DF is not tested on LXC and it has some issues there (at least on the containers I created), which are beyond the scope of this issue. As this is not a priority, I would like you to run the following commands on your container:

  1. For redis:
    pgrep redis-server
    ps u process_number_from_above
  2. For dragonfly do the same
  3. For the file you try to write with redis and df do:
    ls -l path/to/acl/file/file

What is the output of the above commands?

Mirabis commented 1 year ago

Output below:

root@redis:~# pgrep redis-server 
171
root@redis:~# ps u 171
USER         PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
redis        171  0.1  0.0  63304  7840 ?        Ssl  Oct03   1:56 /usr/bin/redis-server *:6379
root@redis:~# systemctl stop redis
root@redis:~# systemctl start dragonfly.service 
root@redis:~# systemctl status dragonfly.service -l
* dragonfly.service - Modern and fast key-value store
     Loaded: loaded (/lib/systemd/system/dragonfly.service; disabled; preset: enabled)
     Active: active (running) since Wed 2023-10-04 18:20:56 CEST; 7s ago
   Main PID: 716 (dragonfly)
      Tasks: 3 (limit: 153931)
     Memory: 33.8M
        CPU: 175ms
     CGroup: /system.slice/dragonfly.service
             `-716 /usr/bin/dragonfly --flagfile=/etc/dragonfly/dragonfly.conf

Oct 04 18:20:56 redis systemd[1]: Started dragonfly.service - Modern and fast key-value store.
Oct 04 18:20:57 redis dragonfly[716]: * Logs will be written to the first available of the following paths:
Oct 04 18:20:57 redis dragonfly[716]: /var/log/dragonfly/dragonfly.*
Oct 04 18:20:57 redis dragonfly[716]: * For the available flags type dragonfly [--help | --helpfull]
Oct 04 18:20:57 redis dragonfly[716]: * Documentation can be found at: https://www.dragonflydb.io/docs
root@redis:~# pgrep dragonfly 
716
root@redis:~# ps u 716
USER         PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
dfly         716  0.0  0.2 149188 34400 ?        Ssl  18:20   0:00 /usr/bin/dragonfly --flagfile=/etc/dragonfly/dragonfly.conf
root@redis:~# systemctl stop dragonfly.service 
root@redis:~# systemctl start redis
root@redis:~# cd /etc/dragonfly/
root@redis:/etc/dragonfly# ls -l users.acl 
lrwxrwxrwx 1 root dfly 20 Sep 19 09:50 users.acl -> /etc/redis/users.acl
root@redis:/etc/dragonfly# ls -l /etc/redis/users.acl 
-rw-r----- 1 redis redis 241 Sep 19 09:49 /etc/redis/users.acl

Do note that I previously tried using a copy of the file instead of a symlink, but to no avail, even after chmod 644 or 664.

kostasrim commented 1 year ago

Hi @Mirabis thank you for your reply. You got a permission error, which makes sense:

root@redis:/etc/dragonfly# ls -l users.acl 
lrwxrwxrwx 1 root dfly 20 Sep 19 09:50 users.acl -> /etc/redis/users.acl
root@redis:/etc/dragonfly# ls -l /etc/redis/users.acl 
-rw-r----- 1 redis redis 241 Sep 19 09:49 /etc/redis/users.acl

users.acl is symlinked to /etc/redis/users.acl. On Linux, the permissions of a symlink itself are ignored, so the lrwxrwxrwx in front of users.acl is meaningless. The permissions on /etc/redis/users.acl are clear: the first triplet belongs to the owner, so the redis user can read and write the file. That is why Redis can access it: when you run Redis, the process user is redis (see your output). But when you try to access the same file as a different user, dfly, you get a permission error. I am fairly certain that dfly belongs to the redis group, which is why it can read the file but not write to it (the second triplet is r--).
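
The triplet logic can be sketched in a few lines of Python. This is a simplified model of the classic mode-bit check only: the function name and the numeric user and group IDs are made up for illustration, and it deliberately ignores POSIX ACLs, capabilities, directory permissions, mount options, and any service sandboxing, which are separate layers.

```python
import stat


def can_write(mode: int, owner_uid: int, owner_gid: int,
              uid: int, gids: set) -> bool:
    """Simplified POSIX write check: exactly one triplet applies."""
    if uid == 0:
        return True  # root bypasses the mode bits
    if uid == owner_uid:
        return bool(mode & stat.S_IWUSR)  # first triplet: owner
    if owner_gid in gids:
        return bool(mode & stat.S_IWGRP)  # second triplet: group
    return bool(mode & stat.S_IWOTH)      # third triplet: everyone else


# -rw-r----- redis redis, as in the listing above (numeric IDs are hypothetical)
MODE = 0o100640
REDIS_UID, REDIS_GID = 110, 115
DFLY_UID, DFLY_GID = 111, 116

print(stat.filemode(MODE))
print(can_write(MODE, REDIS_UID, REDIS_GID, REDIS_UID, {REDIS_GID}))
print(can_write(MODE, REDIS_UID, REDIS_GID, DFLY_UID, {DFLY_GID, REDIS_GID}))
```

With mode 640, a dfly user that is a member of the redis group falls under the second triplet and gets read access only. With mode 664 the same call would return True.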

I suggest you fix your file permissions. I am closing this issue since it's not DF related.

If you stumble on any other problem, please let us know!

Mirabis commented 1 year ago

There was a "do note I even tried without symlink" in my comment, but one moment, I'll do that for you.

kostasrim commented 1 year ago

It's not a symlink issue; look at the file permissions. Sure, you might have created the file, but if the file is still not owned by the dfly user, the problem will persist.

Mirabis commented 1 year ago

I understand where you are coming from, but I tried explaining that I even tried a separate file. I did so again, see below:

First, dfly does not seem to be part of redis group - so I added it now.

root@redis:/etc/dragonfly# groups dfly 
dfly : dfly
root@redis:/etc/dragonfly# usermod -aG redis dfly
root@redis:/etc/dragonfly# groups dfly 
dfly : dfly redis

Second, trying a separate file, not symlinked but owned by dfly (debug.acl is a copy of users.acl)

root@redis:/etc/dragonfly# cat dragonfly.conf | grep "aclfile"
--aclfile=/etc/dragonfly/debug.acl
root@redis:/etc/dragonfly# ls -l
total 10
-rw-r--r-- 1 dfly dfly 241 Oct  5 10:09 debug.acl
-rw-r----- 1 dfly dfly 304 Oct  5 10:09 dragonfly.conf
lrwxrwxrwx 1 root dfly  20 Sep 19 09:50 users.acl -> /etc/redis/users.acl
root@redis:/etc/dragonfly# cat debug.acl 
user default on nopass ~* &* +@all
user firefly on #2a4f19734471f363af10{REDACTED}47e273e6bfc89ae resetchannels +@all
user tester on #d1bc8406412d8e99be{REDACTED}44a998afbb899caee155ca77e881 resetchannels +@all

Third, I start dragonfly

root@redis:/etc/dragonfly# systemctl status dragonfly.service -l
* dragonfly.service - Modern and fast key-value store
     Loaded: loaded (/lib/systemd/system/dragonfly.service; disabled; preset: enabled)
     Active: active (running) since Thu 2023-10-05 10:13:24 CEST; 2min 14s ago
   Main PID: 581 (dragonfly)
      Tasks: 3 (limit: 153931)
     Memory: 34.1M
        CPU: 205ms
     CGroup: /system.slice/dragonfly.service
             `-581 /usr/bin/dragonfly --flagfile=/etc/dragonfly/dragonfly.conf

Oct 05 10:13:24 redis systemd[1]: Started dragonfly.service - Modern and fast key-value store.
Oct 05 10:13:24 redis dragonfly[581]: * Logs will be written to the first available of the following paths:
Oct 05 10:13:24 redis dragonfly[581]: /var/log/dragonfly/dragonfly.*
Oct 05 10:13:24 redis dragonfly[581]: * For the available flags type dragonfly [--help | --helpfull]
Oct 05 10:13:24 redis dragonfly[581]: * Documentation can be found at: https://www.dragonflydb.io/docs

Fourth, I connect to it to see ACL (which is empty)

ACL LIST
1) "user default on nopass +@ALL"
> ACL SETUSER firefly on >ExamplePassword +@all 
"OK"

> ACL SETUSER tester on >ExamplePassword +@all
"OK"

> ACL SAVE
"ERR Failed to open the aclfile: Read-only file system"

So it doesn't seem to matter if I have a file owned by dfly:dfly, redis:redis, root:redis, symlink yes/no in this case. Normally it would, but it seems to be blocked from writing and reading the file by something else. Can we re-open the issue?

Just to make sure I also tried the following (and ACL SAVE afterwards)

root@redis:/etc/dragonfly# groups redis
redis : redis
root@redis:/etc/dragonfly# usermod -aG dfly redis
root@redis:/etc/dragonfly# ls -l
total 10
-rw-r--r-- 1 dfly dfly 241 Oct  5 10:09 debug.acl
-rw-r----- 1 dfly dfly 304 Oct  5 10:09 dragonfly.conf
lrwxrwxrwx 1 root dfly  20 Sep 19 09:50 users.acl -> /etc/redis/users.acl
root@redis:/etc/dragonfly# chmod 664 debug.acl 
root@redis:/etc/dragonfly# ls -l
total 10
-rw-rw-r-- 1 dfly dfly 241 Oct  5 10:09 debug.acl
-rw-r----- 1 dfly dfly 304 Oct  5 10:09 dragonfly.conf
lrwxrwxrwx 1 root dfly  20 Sep 19 09:50 users.acl -> /etc/redis/users.acl

kostasrim commented 1 year ago

Hi @Mirabis

I understand where you are coming from but I tried explaining that I even tried a separate file

A separate file doesn't necessarily mean a file owned by a different user with the right permission triplets.

Thank you for providing more details. So:

We do not yet support the resetchannels keyword, so I assume the file is never even loaded in the first place. This is normal: DF would reject the file because it can't parse it. If you look at the logs, or if you start DF with --logtostderr, you should get the error "Error materializing acl file". Can you try it out and let me know?

Also, can you try three more things:

  1. Who owns /etc/ ? ls -l and copy paste the line with /etc and its permissions
  2. Can you give full permissions to the file including the Other triplet and retry?
  3. Out of curiosity try creating the file in /var/log/dragonfly/dragonfly and try to load/save it from there

Mirabis commented 1 year ago

root@redis:/etc/dragonfly# ls -l /etc
...
drwxrwsr-x 2 dfly  dfly       5 Oct  5 10:51 dragonfly
drwxrws--- 2 redis redis      4 Oct  5 10:23 redis
...
root@redis:/etc/dragonfly# cd /
root@redis:/# ls -l
...
drwxr-xr-x  67 root   root    145 Oct  5 10:42 etc

changed permissions

root@redis:/etc/dragonfly# chmod 777 debug.acl 
root@redis:/etc/dragonfly# ls
debug.acl  dragonfly.conf  users.acl
root@redis:/etc/dragonfly# ls -l
total 10
-rwxrwxrwx 1 dfly dfly 213 Oct  5 10:51 debug.acl
-rw-r----- 1 dfly dfly 317 Oct  5 10:51 dragonfly.conf
lrwxrwxrwx 1 root dfly  20 Sep 19 09:50 users.acl -> /etc/redis/users.acl
root@redis:/etc/dragonfly# systemctl restart dragonfly.service 
root@redis:/etc/dragonfly# cat debug.acl 
user default on nopass ~* &* +@all
user firefly on #2a4f19734471f3{REDACTED}4c07047e273e6bfc89ae +@all
user tester on #d1bc8406412d8e9{REDACTED}98afbb899caee155ca77e881 +@all

Do note, whenever I specified ACL SAVE it automatically saved with resetchannels. Manually removed it now.

  1. Out of curiosity try creating the file in /var/log/dragonfly/dragonfly and try to load/save it from there
    
    root@redis:/etc/dragonfly# systemctl stop dragonfly.service 
    root@redis:/etc/dragonfly# mv debug.acl /var/log/dragonfly/          
    root@redis:/etc/dragonfly# nano dragonfly.conf 
    root@redis:/etc/dragonfly# ls -l /var/log/dragonfly/debug.acl 
    -rwxrwxrwx 1 dfly dfly 213 Oct  5 10:51 /var/log/dragonfly/debug.acl
    root@redis:/etc/dragonfly# systemctl start dragonfly.service 
    root@redis:/etc/dragonfly# tail -f /var/log/dragonfly/dragonfly.*
    ==> /var/log/dragonfly/dragonfly.redis.dfly.log.WARNING.20231005-102433.661 <==
    W20231005 10:24:33.838045   663 rdb_load.cc:2274] RDB has duplicated key 'laravel_database_firefly:ae03448b409a0be7' in DB 2
    W20231005 10:24:33.838068   663 rdb_load.cc:2274] RDB has duplicated key 'laravel_database_firefly:7786de2558837bde' in DB 2
    W20231005 10:24:33.838090   663 rdb_load.cc:2274] RDB has duplicated key 'laravel_database_firefly:cf26f719b9e0584b' in DB 2
    W20231005 10:24:33.838112   663 rdb_load.cc:2274] RDB has duplicated key 'laravel_database_firefly:d06f2b8f40b7fb28' in DB 2
    W20231005 10:24:33.838135   663 rdb_load.cc:2274] RDB has duplicated key 'laravel_database_firefly:ff-config-last_rt_job' in DB 2
    W20231005 10:24:33.838156   663 rdb_load.cc:2274] RDB has duplicated key 'laravel_database_firefly:f0d303e2c193b2df' in DB 2
    W20231005 10:24:33.838178   663 rdb_load.cc:2274] RDB has duplicated key 'laravel_database_firefly:c2e7c2aa4a0d9bb4' in DB 2
    W20231005 10:24:33.838199   663 rdb_load.cc:2274] RDB has duplicated key 'laravel_database_firefly:c6a6b7ac123411d1' in DB 2
    W20231005 10:24:33.838222   663 rdb_load.cc:2274] RDB has duplicated key 'laravel_database_firefly:4bc4b8152688e39e' in DB 2
    E20231005 10:24:50.711480   663 acl_family.cc:306] Failed to open the aclfile: Read-only file system

Mirabis commented 1 year ago

Apologies, after you prompted me to add "--logtostderr", the log file didn't update. New run, with no error in the log:

root@redis:/etc/dragonfly# tail -f /var/log/dragonfly/dragonfly.*

==> dragonfly.INFO <==
W20231005 11:01:21.235801  1034 rdb_load.cc:2274] RDB has duplicated key 'laravel_database_firefly:9f16737e5bbad0fe' in DB 2
W20231005 11:01:21.235826  1034 rdb_load.cc:2274] RDB has duplicated key 'laravel_database_firefly:df3851d407bd3687' in DB 2
W20231005 11:01:21.235848  1034 rdb_load.cc:2274] RDB has duplicated key 'laravel_database_firefly:0dea74537b5e4c40' in DB 2
W20231005 11:01:21.235872  1034 rdb_load.cc:2274] RDB has duplicated key 'laravel_database_firefly:58ddd9f997eedf10' in DB 2
W20231005 11:01:21.235894  1034 rdb_load.cc:2274] RDB has duplicated key 'laravel_database_firefly:d437dc16df90835b' in DB 2
W20231005 11:01:21.235917  1034 rdb_load.cc:2274] RDB has duplicated key 'laravel_database_firefly:da9ea8845a382df6' in DB 2
W20231005 11:01:21.235939  1034 rdb_load.cc:2274] RDB has duplicated key 'laravel_database_firefly:47e4910e922b9ffc' in DB 2
W20231005 11:01:21.235962  1034 rdb_load.cc:2274] RDB has duplicated key 'laravel_database_firefly:014d378f2a9c4ee9' in DB 2
W20231005 11:01:21.235985  1034 rdb_load.cc:2274] RDB has duplicated key 'laravel_database_firefly:c2d2b6f5756c78fb' in DB 2
W20231005 11:01:21.236008  1034 rdb_loa
==> dragonfly.WARNING <==
W20231005 11:01:21.236008  1034 rdb_load.cc:2274] RDB has duplicated key 'laravel_database_firefly:8456db1c27446090' in DB 2
W20231005 11:01:21.236038  1034 rdb_load.cc:2274] RDB has duplicated key 'laravel_database_firefly:15757d8dee6b9a98' in DB 2
W20231005 11:01:21.236061  1034 rdb_load.cc:2274] RDB has duplicated key 'laravel_database_firefly:864cb64d3362ad8f' in DB 2
W20231005 11:01:21.236084  1034 rdb_load.cc:2274] RDB has duplicated key 'laravel_database_firefly:1c30ce655d97972a' in DB 2
W20231005 11:01:21.236106  1034 rdb_load.cc:2274] RDB has duplicated key 'laravel_database_firefly:f270181e29f246a6' in DB 2
W20231005 11:01:21.236129  1034 rdb_load.cc:2274] RDB has duplicated key 'laravel_database_firefly:abdef4884e29646e' in DB 2
W20231005 11:01:21.236155  1034 rdb_load.cc:2274] RDB has duplicated key 'laravel_database_firefly:f0d303e2c193b2df' in DB 2
W20231005 11:01:21.236178  1034 rdb_load.cc:2274] RDB has duplicated key 'laravel_database_firefly:29d55a2d8a515e6a' in DB 2
W20231005 11:01:21.236202  1034 rdb_load.cc:2274] RDB has duplicated key 'laravel_database_firefly:f04562d9cb1a3a71' in DB 2
W20231005 11:01:21.236223  1034 rdb_load.cc:2274] RDB has duplicated key 'laravel_database_firefly:4bc4b8152688e39e' in DB 2

Redis commands (it did not read the existing file, but it allowed me to save now):

> ACL LIST
1) "user default on nopass +@ALL"

> ACL SETUSER firefly on >ExamplePassword +@all
"OK"

> ACL SETUSER tester on >ExamplePassword +@all
"OK"

> ACL LIST
1) "user default on nopass +@ALL"
2) "user tester on a1cdf9b86be063b +@ALL"
3) "user firefly on a1cdf9b86be063b +@ALL"

> ACL SAVE
"OK"

file is as follows:

root@redis:/var/log/dragonfly# cat debug.acl   
ACL SETUSER default ON nopass +@ALL
ACL SETUSER tester ON >a1cdf9b86be063bced6916eb4bb31d600f23424270bd6dd9d780195d93449d52 +@ALL
ACL SETUSER firefly ON >a1cdf9b86be063bced6916eb4bb31d600f23424270bd6dd9d780195d93449d52 +@ALL

If I now stop and start dragonfly and run redis commands I get

1) "user default on nopass +@ALL"
2) "user tester on a1cdf9b86be063b +@ALL"
3) "user firefly on a1cdf9b86be063b +@ALL"

So apparently it can write/load in /var/log/dragonfly. Also, judging by the resulting file the acl files are not drop-in replacements and probably shouldn't be shared.

kostasrim commented 1 year ago

Do note, whenever I specified ACL SAVE it automatically saved with resetchannels. Manually removed it now.

We are not fully compatible with Redis ACLs. For example, we haven't implemented ACLs on keys or on pub/sub channels, so the symbols ~* and &* in an ACL file won't work with DF (and DF will fail to load the file).
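
As a rough pre-flight check, an ACL file written by Redis could be scanned for the tokens mentioned in this thread (resetchannels, key patterns starting with ~, channel patterns starting with &). A minimal Python sketch follows. The function name is hypothetical, and the list of flagged tokens reflects only what is stated in this thread, not a complete or current description of what DF accepts.

```python
def unsupported_tokens(acl_line: str) -> list:
    """Return the tokens on one ACL line that this thread reports DF cannot parse."""
    flagged = []
    for token in acl_line.split():
        # "~pattern" restricts keys, "&pattern" restricts pub/sub channels.
        if token == "resetchannels" or token.startswith(("~", "&")):
            flagged.append(token)
    return flagged


for line in (
    "user default on nopass ~* &* +@all",
    "user firefly on #abc123 resetchannels +@all",
):
    print(line, "->", unsupported_tokens(line))
```

Password hashes (tokens starting with #) and command categories (tokens starting with +@) are left alone, so a line that yields an empty list contains none of the tokens named here.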

So apparently it can write/load in /var/log/dragonfly.

Yes, which verifies my assumption that the permissions on the directories and files are messed up. I don't know how, but you should look carefully at the paths and figure it out. This is not a DF issue; the "Read-only file system" message was pretty clear about this.

So apparently it can write/load in /var/log/dragonfly. Also, judging by the resulting file the acl files are not drop-in replacements and probably shouldn't be shared.

Not yet. For now we only implement a subset of ACLs, which we specify in our doc page. When we implement the full feature, these two files should be shareable and compatible.

I am closing this as this is not a DF issue.

Thank you for providing the information so quickly, and I'm glad we managed to go over it :)

romange commented 1 year ago

Actually, it is related to Dragonfly. @Mirabis installs a dragonfly binary using our debian package (based on https://github.com/dragonflydb/dragonfly/issues/1855#issuecomment-1747243440).

Our Debian package ships a systemd service file, https://github.com/dragonflydb/dragonfly/blob/main/tools/packaging/debian/dragonfly.service, that specifies a security perimeter for the dragonfly process. Specifically, it does not allow writing into the "/etc/" folder.
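
Because a restriction like this applies to the service's own view of the file system, inspecting modes with ls from a root shell does not reveal it. One rough way to observe it is to attempt a write from inside the same service context. A minimal Python sketch follows. probe_write is a hypothetical helper, not part of Dragonfly, and it is only meaningful when executed under the same unit restrictions as the server.

```python
import errno
import os
import tempfile


def probe_write(directory: str) -> str:
    """Create and remove a temp file in directory; return 'ok' or the errno name."""
    try:
        fd, path = tempfile.mkstemp(dir=directory)
    except OSError as exc:
        # For example 'EROFS' for a read-only mount or 'EACCES' for a mode-bit denial.
        return errno.errorcode.get(exc.errno, str(exc.errno))
    os.close(fd)
    os.remove(path)
    return "ok"


for candidate in ("/etc/dragonfly", "/var/log/dragonfly", "/var/lib/dragonfly"):
    print(candidate, "->", probe_write(candidate))
```

Run from an interactive root shell, the same probe may well report ok for /etc/dragonfly, which would be consistent with the permission checks earlier in this thread looking fine while ACL SAVE still failed.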

kostasrim commented 1 year ago

@romange Is that something we would like to change? @Mirabis I guess the best solution for now is to not save the file in the /etc folder.