NVIDIA / aistore

AIStore: scalable storage for AI applications
https://aistore.nvidia.com
MIT License

single docker deploy: has no disks #93

Closed shinexia closed 2 years ago

shinexia commented 2 years ago

Following docs: deploy/prod/docker/single/README.md

$ docker run \
    -p 51080:51080 \
    -v $(mktemp -d):/ais/disk0 \
    aistore/cluster-minimal:latest
E 01:52:32.813230 vinit.go:42 FATAL ERROR: t[QIfDzhxP]: mp[/ais/disk0, fs=/dev/mapper/ubuntuvg-root] has no disks
FATAL ERROR: t[QIfDzhxP]: mp[/ais/disk0, fs=/dev/mapper/ubuntuvg-root] has no disks
VirrageS commented 2 years ago

Do you have physical disks associated with your machine, especially under /tmp (assuming you are using Linux)? If that's not the problem, could you post the output of lsblk?

The problem seems to be that the directory created with $(mktemp -d) is not associated with a physical disk, which is currently a requirement for running productionized AIStore.

shinexia commented 2 years ago

/tmp is on a mirrored LVM partition, so it is associated with TWO physical disks, not just one.

$ lsblk
....
sdd                              8:48   0 223.6G  0 disk 
├─sdd1                           8:49   0   512M  0 part /boot/efi
└─sdd2                           8:50   0 223.1G  0 part 
  ├─ubuntuvg-root_rmeta_1      253:7    0     4M  0 lvm  
  │ └─ubuntuvg-root            253:11   0 220.5G  0 lvm  /
  └─ubuntuvg-root_rimage_1     253:9    0 220.5G  0 lvm  
    └─ubuntuvg-root            253:11   0 220.5G  0 lvm  /
sdf                              8:80   0 238.5G  0 disk 
├─sdf1                           8:81   0   512M  0 part 
└─sdf2                           8:82   0   238G  0 part 
  ├─ubuntuvg-root_rmeta_0      253:3    0     4M  0 lvm  
  │ └─ubuntuvg-root            253:11   0 220.5G  0 lvm  /
  └─ubuntuvg-root_rimage_0     253:5    0 220.5G  0 lvm  
    └─ubuntuvg-root            253:11   0 220.5G  0 lvm  /
...
shinexia commented 2 years ago

Another PC, whose root dir is not on a mirrored LVM partition, has the same problem.

$ docker run \
    -p 51080:51080 \
    -v $(mktemp -d):/ais/disk0 \
    aistore/cluster-minimal:latest
E 10:35:17.183805 vinit.go:42 FATAL ERROR: t[wmrjJKcy]: mp[/ais/disk0, fs=/dev/mapper/ubuntu-root] has no disks
FATAL ERROR: t[wmrjJKcy]: mp[/ais/disk0, fs=/dev/mapper/ubuntu-root] has no disks

$ lsblk
NAME            MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
loop0             7:0    0  55.5M  1 loop /snap/core18/2284
loop1             7:1    0  55.5M  1 loop /snap/core18/2344
loop2             7:2    0  61.9M  1 loop /snap/core20/1361
loop3             7:3    0  43.6M  1 loop /snap/snapd/15177
loop4             7:4    0  67.2M  1 loop /snap/lxd/21835
loop5             7:5    0  67.9M  1 loop /snap/lxd/22526
loop6             7:6    0  43.6M  1 loop /snap/snapd/14978
loop7             7:7    0  61.9M  1 loop /snap/core20/1376
nvme0n1         259:0    0   477G  0 disk 
├─nvme0n1p1     259:1    0   512M  0 part /boot/efi
└─nvme0n1p2     259:2    0 476.4G  0 part 
  └─ubuntu-root 253:0    0 476.4G  0 lvm  /
VirrageS commented 2 years ago

> which root dir is not mirror lvm partition, has the same problem.

I'm confused, as lsblk shows "ubuntu-root 253:0 0 476.4G 0 lvm /", which is an LVM partition.

VirrageS commented 2 years ago

Would you mind sharing the output of lsblk -Jt?

shinexia commented 2 years ago

About LVM: https://www.redhat.com/sysadmin/lvm-vs-partitioning

$ lsblk -Jt
{
   "blockdevices": [
      {"name":"loop0", "alignment":0, "min-io":512, "opt-io":0, "phy-sec":512, "log-sec":512, "rota":false, "sched":"mq-deadline", "rq-size":256, "ra":128, "wsame":"0B"},
      {"name":"loop1", "alignment":0, "min-io":512, "opt-io":0, "phy-sec":512, "log-sec":512, "rota":false, "sched":"mq-deadline", "rq-size":256, "ra":128, "wsame":"0B"},
      {"name":"loop2", "alignment":0, "min-io":512, "opt-io":0, "phy-sec":512, "log-sec":512, "rota":false, "sched":"mq-deadline", "rq-size":256, "ra":128, "wsame":"0B"},
      {"name":"loop3", "alignment":0, "min-io":512, "opt-io":0, "phy-sec":512, "log-sec":512, "rota":false, "sched":"mq-deadline", "rq-size":256, "ra":128, "wsame":"0B"},
      {"name":"loop4", "alignment":0, "min-io":512, "opt-io":0, "phy-sec":512, "log-sec":512, "rota":false, "sched":"mq-deadline", "rq-size":256, "ra":128, "wsame":"0B"},
      {"name":"loop5", "alignment":0, "min-io":512, "opt-io":0, "phy-sec":512, "log-sec":512, "rota":false, "sched":"mq-deadline", "rq-size":256, "ra":128, "wsame":"0B"},
      {"name":"loop6", "alignment":0, "min-io":512, "opt-io":0, "phy-sec":512, "log-sec":512, "rota":false, "sched":"mq-deadline", "rq-size":256, "ra":128, "wsame":"0B"},
      {"name":"loop7", "alignment":0, "min-io":512, "opt-io":0, "phy-sec":512, "log-sec":512, "rota":false, "sched":"mq-deadline", "rq-size":256, "ra":128, "wsame":"0B"},
      {"name":"nvme0n1", "alignment":0, "min-io":512, "opt-io":512, "phy-sec":512, "log-sec":512, "rota":false, "sched":"none", "rq-size":1023, "ra":128, "wsame":"0B",
         "children": [
            {"name":"nvme0n1p1", "alignment":0, "min-io":512, "opt-io":512, "phy-sec":512, "log-sec":512, "rota":false, "sched":"none", "rq-size":1023, "ra":128, "wsame":"0B"},
            {"name":"nvme0n1p2", "alignment":0, "min-io":512, "opt-io":512, "phy-sec":512, "log-sec":512, "rota":false, "sched":"none", "rq-size":1023, "ra":128, "wsame":"0B",
               "children": [
                  {"name":"ubuntu-root", "alignment":0, "min-io":512, "opt-io":512, "phy-sec":512, "log-sec":512, "rota":false, "sched":null, "rq-size":128, "ra":128, "wsame":"0B"}
               ]
            }
         ]
      }
   ]
}
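
For anyone reading along, here is a rough sketch of how a device-mapper volume such as ubuntu-root could be traced back to the disk(s) underneath it by walking the lsblk JSON tree. This is only an illustration, not how AIStore resolves disks internally; the helper name backing_disks and the trimmed SAMPLE data are made up for this example, and it assumes the "blockdevices" / "children" layout shown in the output above.

```python
import json


def backing_disks(lsblk_tree, target):
    """Return the names of top-level block devices that have `target` beneath them.

    `lsblk_tree` is the parsed output of `lsblk -J` (or `lsblk -Jt`).
    A volume on a mirrored LVM appears under several parents, so the
    result may contain more than one disk.
    """
    def contains(node):
        # True if this node is the target or the target is among its descendants.
        if node.get("name") == target:
            return True
        return any(contains(child) for child in node.get("children", []))

    return [dev["name"] for dev in lsblk_tree.get("blockdevices", []) if contains(dev)]


# Hypothetical, trimmed-down version of the lsblk -Jt output above
# (topology fields such as "min-io" are omitted for brevity).
SAMPLE = json.loads("""
{"blockdevices": [
  {"name": "loop0"},
  {"name": "nvme0n1", "children": [
    {"name": "nvme0n1p1"},
    {"name": "nvme0n1p2", "children": [{"name": "ubuntu-root"}]}
  ]}
]}
""")

print(backing_disks(SAMPLE, "ubuntu-root"))  # ['nvme0n1']
```

On a real machine the tree could be loaded with json.loads on the text printed by lsblk -J. An empty result would mean the volume was not found under any top-level device in the tree.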
VirrageS commented 2 years ago

New aistore/cluster-minimal:latest image has been pushed out with the fix. Let us know if the issue still persists.

shinexia commented 2 years ago

thanks, now it's running.

VirrageS commented 2 years ago

Thank you! I will close this one. Let us know if you have further issues/questions :)

TopTea1 commented 1 year ago

Hi @VirrageS, I'm commenting on this issue because I have the same problem with the latest Docker image. Can you help me with that? Thanks.