NVIDIA / aistore

AIStore: scalable storage for AI applications
https://aistore.nvidia.com

Attach remote ais cluster, but the PRESENT state of the bucket is "no" #186

Closed liuzhiyuan562 closed 1 month ago

liuzhiyuan562 commented 1 month ago

Describe the issue

I run two AIS clusters on two machines (both local playground deployments), and I attached one as a remote-ais cluster from the other. You can see the details in the picture below. (I'm a beginner and couldn't find the solution in the documentation; please answer this question if you have time. Thank you very much.)

Image

Links

https://aistore.nvidia.com/docs/cli/cluster#attach-remote-cluster
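
For reference, attaching a remote cluster per the linked doc looks roughly like this (a sketch only; the alias and endpoint below are the ones that show up later in this thread):

# attach the second cluster under an alias, then verify
ais cluster remote-attach test2=http://192.168.1.117:8080
ais show remote-cluster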

liuzhiyuan562 commented 1 month ago

Please help~~ And I have another question: if I don't have AIStore on a computer, how can I get an object from AIStore on another computer? (I use curl -s -L -X GET 'http://192.168.1.125:8080/v1/objects/test/README.md' -o README.md, but it fails.)

Image

gaikwadabhishek commented 1 month ago

Hey @liuzhiyuan562

I am not sure what's wrong here. It might be a local deployment issue.

Can you do these things:

For the screenshot in the first comment, try ais ls --all on the bucket that you are trying to get the object from.

For the screenshot in the second comment, try running the curl without -s and tell us what the error is.

Can you also run ais log show <target-name> on the target where you are seeing this issue? And please send a screenshot of ais show cluster from both clusters. (Rough versions of all of these are sketched below.)
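
A sketch of those checks, reusing the bucket name, endpoint, and node names that appear elsewhere in this thread (substitute your own):

# list everything in the bucket
ais ls ais://test --all

# repeat the curl without -s (and with -v) so the actual error is visible
curl -v -L -X GET 'http://192.168.1.125:8080/v1/objects/test/README.md' -o README.md

# target log and cluster map, on each cluster
ais log show t[olSt8081]
ais show cluster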

liuzhiyuan562 commented 1 month ago

For the first comment:

Image

For the second comment: I found that the curl succeeds if AIStore is running on that computer, but if not, it fails.

Image

liuzhiyuan562 commented 1 month ago

zy@SMT:~/go_projects/src/aistore$ ais show remote-cluster
UUID        URL                         Alias   Primary   Smap   Targets   Uptime
ohzKVpVwH   http://192.168.1.117:8080   test2             v4     1         55m34.073806762s

zy@SMT:~/go_projects/src/aistore$ AIS_ENDPOINT=http://192.168.1.117:8080 ais bucket ls
NAME         PRESENT
ais://test   yes
Total: [AIS bucket: 1] ========

zy@SMT:~/go_projects/src/aistore$ AIS_ENDPOINT=http://192.168.1.117:8080 ais object put LICENSE ais://test
E 10:32:15.995583 ErrBckNotFound: bucket "ais://test" does not exist: PUT /v1/objects/test/LICENSE (t[olSt8081]: htrun.go:1350 <- target.go:840 <- target.go:620])
Error: ErrBckNotFound: bucket "ais://test" does not exist

zy@SMT:~/go_projects/src/aistore$ ais log show p[Ezop8080] t[olSt8081]
zy@SMT:~/go_projects/src/aistore$ ais log show t[olSt8081] Started up at 2024/09/05 11:48:39, host SMT, go1.23.0 for linux/amd64 I 11:48:39.116552 config:1881 log.dir: "/tmp/ais/1/log"; l4.proto: tcp; pub port: 8081; verbosity: 3 I 11:48:39.116559 config:1883 config: "/root/.ais1/.ais.conf"; stats_time: 10s; authentication: false; backends: [gcp aws] I 11:48:39.116574 daemon:296 Version 3.24.rc3.1fe3c80be, build 2024-09-05T11:48:36+0800, CPUs(32, runtime=32) I 11:48:39.116591 k8s:41 non-Kubernetes deployment (init k8s-client returned: 'unable to load in-cluster configuration') I 11:48:39.116999 htrun:319 6 local unicast IPs: I 11:48:39.117003 htrun:321 IP: 127.0.0.1 (MTU 65536) I 11:48:39.117004 htrun:321 IP: 192.168.1.21 (MTU 1500) I 11:48:39.117006 htrun:321 IP: 192.168.49.1 (MTU 1500) I 11:48:39.117007 htrun:321 IP: 172.17.0.1 (MTU 1500) I 11:48:39.117008 htrun:321 IP: 172.18.0.1 (MTU 1500) I 11:48:39.117009 htrun:321 IP: 10.10.10.12 (MTU 1434) W 11:48:39.117025 utils:240 given multiple choice, selecting the first IP: 127.0.0.1 (MTU 65536) I 11:48:39.117027 htrun:355 PUBLIC (user) access: {127.0.0.1 8081 http://127.0.0.1:8081 127.0.0.1:8081} W 11:48:39.117033 utils:240 given multiple choice, selecting the first IP: 127.0.0.1 (MTU 65536) I 11:48:39.117034 htrun:370 INTRA-CONTROL access: {127.0.0.1 9081 http://127.0.0.1:9081 127.0.0.1:9081} W 11:48:39.117036 utils:240 given multiple choice, selecting the first IP: 127.0.0.1 (MTU 65536) I 11:48:39.117037 htrun:383 INTRA-DATA access: {127.0.0.1 10081 http://127.0.0.1:10081 127.0.0.1:10081} I 11:48:39.117208 target:207 memory: rOst8081.gmm[(used 34GiB, free 91GiB, buffcache 26GiB, actfree 117GiB), pressure 'low', (min-free 2GiB, low-wm 52GiB] I 11:48:39.119999 dutils_linux:113 [/dev/nvme0n1p2]: [nvme0n1:512] I 11:48:39.120063 dutils_linux:113 [/dev/nvme0n1p2]: [nvme0n1:512] I 11:48:39.120105 dutils_linux:113 [/dev/nvme0n1p2]: [nvme0n1:512] I 11:48:39.120150 dutils_linux:113 [/dev/nvme0n1p2]: [nvme0n1:512] I 11:48:39.120160 vinit:91 VMD v1(rOst8081, [/home/smt/ais/mp1/1 /home/smt/ais/mp2/1 /home/smt/ais/mp3/1 /home/smt/ais/mp4/1]) I 11:48:39.120416 daemon:255 Node t[rOst8081], Version 3.24.rc3.1fe3c80be, build 2024-09-05T11:48:36+0800, CPUs(32, runtime=32)

I 11:48:40.120649 collect:51 Intra-cluster networking: fasthttp client I 11:48:40.120655 collect:52 Starting stream-collector I 11:48:40.121366 bucketmeta:365 loaded BMD v4 I 11:48:40.121381 etlmeta:219 initializing new EtlMD v0(0) I 11:48:40.121599 target:339 t[rOst8081]: loaded Smap v8[stXO8dcOj, p[aeKp8080], t=1, p=1] I 11:48:40.124635 htrun:1854 t[rOst8081]: primary responded Ok via http://127.0.0.1:9080 W 11:48:40.124775 gcp:83 unauthenticated client I 11:48:40.156771 target:176 t[rOst8081] backends: [gcp aws] I 11:48:41.125624 htrun:2039 t[rOst8081] via primary health: cluster startup Ok, Smap v8[stXO8dcOj, p[aeKp8080], t=1, p=1] I 11:48:41.125630 target:433 t[rOst8081] is ready I 11:48:41.321116 common:412 Starting targetstats I 11:48:41.321135 common_prom:71 Using Prometheus I 11:48:41.621607 kalive:124 Starting talive I 11:48:51.322026 {state.flags:6} I 11:48:51.322029 /home/smt/ais/mp3/1: used 7%, avail 1.69TiB I 11:48:53.122541 htrun:1710 msync Rx: new Smap v9[stXO8dcOj, p[aeKp8080], t=1, p=1] (have v8, aism[] <-- p[aeKp8080]) I 11:58:51.322426 /home/smt/ais/mp1/1: used 7%, avail 1.69TiB I 12:20:51.322191 nvme0n1: 0B/s, 0B, 4MiB/s, 9KiB, 10% I 12:39:11.322151 /home/smt/ais/mp3/1: used 7%, avail 1.69TiB I 12:48:41.321734 common:452 05 Sep 24 12:48 CST ============= I 13:19:41.322183 /home/smt/ais/mp1/1: used 7%, avail 1.69TiB I 13:39:51.322407 /home/smt/ais/mp3/1: used 7%, avail 1.69TiB I 13:48:41.321971 common:452 05 Sep 24 13:48 CST ============= I 14:20:11.322171 /home/smt/ais/mp1/1: used 7%, avail 1.69TiB I 14:30:21.321796 /home/smt/ais/mp1/1: used 7%, avail 1.69TiB I 14:48:41.322331 common:452 05 Sep 24 14:48 CST ============= I 15:48:51.322305 common:452 05 Sep 24 15:48 CST ============= I 16:49:01.321565 common:452 05 Sep 24 16:49 CST ============= I 17:49:11.321410 common:452 05 Sep 24 17:49 CST ============= I 18:22:21.322156 /home/smt/ais/mp4/1: used 7%, avail 1.69TiB I 18:49:11.321929 common:452 05 Sep 24 18:49 CST ============= I 19:02:41.322195 /home/smt/ais/mp4/1: used 7%, avail 1.69TiB I 19:49:11.321959 common:452 05 Sep 24 19:49 CST ============= I 20:23:11.322466 /home/smt/ais/mp1/1: used 7%, avail 1.69TiB I 20:33:21.322307 /home/smt/ais/mp1/1: used 7%, avail 1.69TiB I 20:49:21.321687 common:452 05 Sep 24 20:49 CST ============= I 20:53:31.322187 /home/smt/ais/mp2/1: used 7%, avail 1.69TiB I 21:49:31.322426 common:452 05 Sep 24 21:49 CST ============= I 22:49:41.322086 common:452 05 Sep 24 22:49 CST ============= I 23:14:41.321884 /home/smt/ais/mp1/1: used 7%, avail 1.69TiB I 23:45:11.321707 /home/smt/ais/mp1/1: used 7%, avail 1.69TiB I 23:49:51.321592 common:452 05 Sep 24 23:49 CST ============= I 00:15:21.322139 /home/smt/ais/mp2/1: used 7%, avail 1.69TiB I 00:21:41.321809 nvme0n1: 0B/s, 0B, 2MiB/s, 178KiB, 14% I 00:25:31.321897 /home/smt/ais/mp3/1: used 7%, avail 1.69TiB I 00:49:51.321762 common:452 06 Sep 24 00:49 CST ============= I 01:46:11.322143 /home/smt/ais/mp1/1: used 7%, avail 1.69TiB I 01:49:51.322374 common:452 06 Sep 24 01:49 CST ============= I 02:06:31.322192 /home/smt/ais/mp1/1: used 7%, avail 1.69TiB I 02:50:01.322216 common:452 06 Sep 24 02:50 CST ============= I 03:50:11.322265 common:452 06 Sep 24 03:50 CST ============= I 03:57:41.321936 /home/smt/ais/mp1/1: used 7%, avail 1.69TiB I 04:07:51.321430 /home/smt/ais/mp1/1: used 7%, avail 1.69TiB I 04:37:51.322015 /home/smt/ais/mp2/1: used 7%, avail 1.69TiB I 04:50:21.322235 common:452 06 Sep 24 04:50 CST ============= I 05:50:31.322292 common:452 06 Sep 24 05:50 CST ============= I 
06:28:51.322110 /home/smt/ais/mp4/1: used 7%, avail 1.69TiB I 06:50:31.322309 common:452 06 Sep 24 06:50 CST ============= I 06:59:11.321459 /home/smt/ais/mp4/1: used 7%, avail 1.69TiB I 07:50:41.321578 common:452 06 Sep 24 07:50 CST ============= I 08:20:01.322121 /home/smt/ais/mp1/1: used 7%, avail 1.69TiB I 08:50:51.321997 common:452 06 Sep 24 08:50 CST ============= I 09:00:41.322327 /home/smt/ais/mp2/1: used 7%, avail 1.69TiB I 09:20:51.321754 nvme0n1: 0B/s, 0B, 4MiB/s, 9KiB, 10% I 09:41:42.710962 tgtcp:1113 msync Rx: new Conf v4[stXO8dcOj] (have v3, aism[attach] <-- p[aeKp8080]) I 09:41:42.711153 ais:125 t[rOst8081]: apply "attach" [test => [http://192.168.1.125:8080]] Conf v4 W 09:41:52.712040 ais:268 remote cluster failing to reach "test" via http://192.168.1.125:8080: timeoutError: context deadline exceeded (Client.Timeout exceeded while awaiting headers) I 09:50:51.322126 common:452 06 Sep 24 09:50 CST ============= I 09:51:21.322246 /home/smt/ais/mp1/1: used 7%, avail 1.69TiB I 10:20:42.036571 tgtcp:1113 msync Rx: new Conf v5[stXO8dcOj] (have v4, aism[attach] <-- p[aeKp8080]) I 10:20:42.036793 ais:125 t[rOst8081]: apply "attach" [test => [http://192.168.1.125:8080]]; [test1 => [http://192.168.1.125:51080]] Conf v5 W 10:20:52.037091 ais:268 remote cluster failing to reach "test" via http://192.168.1.125:8080: timeoutError: context deadline exceeded (Client.Timeout exceeded while awaiting headers) W 10:21:02.037898 ais:268 remote cluster failing to reach "test1" via http://192.168.1.125:51080: timeoutError: context deadline exceeded (Client.Timeout exceeded while awaiting headers) I 10:32:01.322253 /home/smt/ais/mp1/1: used 7%, avail 1.69TiB I 10:37:15.534118 kalive:613 Sending "suspend" on the control channel W 10:37:15.534136 tgtcp:1405 Stopping t[rOst8081]: shutdown I 10:37:15.534142 htrun:573 Shutting down HTTP I 10:37:15.534167 lcache:138 terminating --> I 10:37:16.037884 common_prom:335 Stopping targetstats, err: I 10:37:16.037896 kalive:607 Stopping talive, err: I 10:37:16.037908 collect:61 Stopping stream-collector err: I 10:37:16.037922 daemon:320 Terminated OK

On the other AIStore cluster:

zy@zy-ThinkPad-X1-Carbon-Gen-11:~/go/src/github.com/NVIDIA/aistore$ ais log show p[vjhp8080] t[yFtt8081]
zy@zy-ThinkPad-X1-Carbon-Gen-11:~/go/src/github.com/NVIDIA/aistore$ ais log show t[yFtt8081] Started up at 2024/09/09 09:35:32, host zy-ThinkPad, go1.22.3 for linux/amd64 I 09:35:32.253810 config:1881 log.dir: "/tmp/ais/1/log"; l4.proto: tcp; pub port: 8081; verbosity: 3 I 09:35:32.253821 config:1883 config: "/home/zy/.ais1/ais.json"; stats_time: 10s; authentication: false; backends: [aws gcp] I 09:35:32.253856 daemon:298 Version 3.24.rc3.b108c4202, build 2024-09-09T09:35:28+0800, CPUs(16, runtime=16) I 09:35:32.253889 k8s:41 non-Kubernetes deployment (init k8s-client returned: 'unable to load in-cluster configuration') I 09:35:32.255397 htrun:323 7 local unicast IPs: I 09:35:32.255416 htrun:325 IP: 127.0.0.1 (MTU 65536) I 09:35:32.255421 htrun:325 IP: 192.168.1.16 (MTU 1500) I 09:35:32.255423 htrun:325 IP: 10.0.3.1 (MTU 1500) I 09:35:32.255428 htrun:325 IP: 172.17.0.1 (MTU 1500) I 09:35:32.255432 htrun:325 IP: 172.18.0.1 (MTU 1500) I 09:35:32.255435 htrun:325 IP: 10.10.10.148 (MTU 1434) I 09:35:32.255439 htrun:325 IP: 192.168.49.1 (MTU 1500) W 09:35:32.255464 utils:240 given multiple choice, selecting the first IP: 127.0.0.1 (MTU 65536) I 09:35:32.255470 htrun:359 PUBLIC (user) access: {127.0.0.1 8081 http://127.0.0.1:8081 127.0.0.1:8081} W 09:35:32.255483 utils:240 given multiple choice, selecting the first IP: 127.0.0.1 (MTU 65536) I 09:35:32.255489 htrun:374 INTRA-CONTROL access: {127.0.0.1 9081 http://127.0.0.1:9081 127.0.0.1:9081} W 09:35:32.255494 utils:240 given multiple choice, selecting the first IP: 127.0.0.1 (MTU 65536) I 09:35:32.255499 htrun:387 INTRA-DATA access: {127.0.0.1 10081 http://127.0.0.1:10081 127.0.0.1:10081} I 09:35:32.255533 target:291 t[yFtt8081]: ID randomly generated I 09:35:32.255820 target:207 memory: yFtt8081.gmm[(used 16GiB, free 15GiB, buffcache 9GiB, actfree 22GiB), pressure 'low', (min-free 2GiB, low-wm 10GiB] I 09:35:32.262848 dutils_linux:113 [/dev/nvme0n1p6]: [nvme0n1:512] I 09:35:32.263073 dutils_linux:113 [/dev/nvme0n1p6]: [nvme0n1:512] I 09:35:32.263212 dutils_linux:113 [/dev/nvme0n1p6]: [nvme0n1:512] I 09:35:32.263338 dutils_linux:113 [/dev/nvme0n1p6]: [nvme0n1:512] W 09:35:32.263356 vinit:57 t[yFtt8081]: creating new VMD from [/tmp/ais/mp3/1 /tmp/ais/mp4/1 /tmp/ais/mp1/1 /tmp/ais/mp2/1] config W 09:35:32.265570 vinit:63 t[yFtt8081]: VMD v1(yFtt8081, [/tmp/ais/mp4/1 /tmp/ais/mp1/1 /tmp/ais/mp2/1 /tmp/ais/mp3/1]) initialized I 09:35:32.266531 daemon:255 Node t[yFtt8081], Version 3.24.rc3.b108c4202, build 2024-09-09T09:35:28+0800, CPUs(16, runtime=16)

I 09:35:33.267142 collect:51 Intra-cluster networking: fasthttp client I 09:35:33.267174 collect:52 Starting stream-collector W 09:35:33.268120 bucketmeta:374 initializing new BMD v0 I 09:35:33.268208 etlmeta:219 initializing new EtlMD v0(0) I 09:35:33.278586 htrun:1852 t[yFtt8081]: primary responded Ok via http://localhost:8080 W 09:35:33.278879 gcp:83 unauthenticated client I 09:35:33.309988 target:176 t[yFtt8081] backends: [gcp aws] I 09:35:46.256028 htrun:1708 msync Rx: new Smap v2[ohzKVpVwH, p[vjhp8080], t=1, p=1] (have v0, aism[] <-- p[vjhp8080]) I 09:35:46.265650 tgtcp:1112 msync Rx: new Conf v1[ohzKVpVwH] (have v0, aism[] <-- p[vjhp8080]) I 09:35:46.265944 htrun:1708 msync Rx: new Smap v4[ohzKVpVwH, p[vjhp8080], t=1, p=1] (have v2, aism[] <-- p[vjhp8080]) I 09:35:46.266212 tgtcp:705 msync Rx: new BMD v1[ohzKVpVwH (no buckets)] (have v0, aism[] <-- p[vjhp8080]) I 09:35:47.284129 htrun:2048 t[yFtt8081] via primary health: cluster startup Ok, Smap v4[ohzKVpVwH, p[vjhp8080], t=1, p=1] I 09:35:47.284150 target:433 t[yFtt8081] is ready I 09:35:47.367515 common:412 Starting targetstats I 09:35:47.367565 common_prom:71 Using Prometheus I 09:35:47.667302 kalive:125 Starting talive I 09:35:57.368404 {state.flags:6} I 09:35:57.368415 /tmp/ais/mp4/1: used 31%, avail 401.32GiB I 09:52:37.368157 {state.flags:6,disk.nvme0n1.read.bps:52838,disk.nvme0n1.util:51,disk.nvme0n1.avg.rsize:4403,disk.nvme0n1.avg.wsize:11090,disk.nvme0n1.write.bps:235110} I 09:52:37.368171 nvme0n1: 52KiB/s, 4KiB, 230KiB/s, 11KiB, 51% I 09:52:47.368659 {state.flags:6,disk.nvme0n1.util:42,disk.nvme0n1.avg.rsize:4642,disk.nvme0n1.avg.wsize:10256,disk.nvme0n1.write.bps:256410,disk.nvme0n1.read.bps:20890} I 09:52:47.368678 nvme0n1: 20KiB/s, 5KiB, 250KiB/s, 10KiB, 42% I 09:53:07.368738 {state.flags:6} I 09:59:05.783454 tgtcp:1112 msync Rx: new Conf v2[ohzKVpVwH] (have v1, aism[attach] <-- p[vjhp8080]) I 09:59:05.784039 ais:125 t[yFtt8081]: apply "attach" [test => [http://192.168.1.21:8080]] Conf v2 I 09:59:05.800048 ais:341 remote cluster (http://192.168.1.21:8080, "test", "imqVswmjI", Smap v5) added W 09:59:21.773804 tgtcp:947 t[yFtt8081]: local BMD v1 > v0 aism[list] E 09:59:21.773943 empty project ID: cannot list GCP buckets with no authentication: GET /v1/buckets (called by p[vjhp8080]) (t[yFtt8081]: htrun.go:1350 <- tgtbck.go:200 <- tgtbck.go:71 <- target.go:586]) W 09:59:39.790999 tgtcp:947 t[yFtt8081]: local BMD v1 > v0 aism[list] I 10:01:45.078730 tgtcp:725 msync Rx: new BMD v2[ohzKVpVwH, buckets: ais(1), cloud(0)] (have v1, aism[create-bck[jGwqFPEkH]] <-- p[vjhp8080]) I 10:01:45.078750 txn:251 txn-create-bck[jGwqFPEkH]-ais://test done] I 10:02:07.368463 {put.bps:1,put.n:1,put.ns:2230233,put.ns.total:2230233,put.redir.ns:9694605,put.size:19,state.flags:6} I 10:02:17.368961 {lst.n:1,lst.ns:596304,put.n:1,put.ns.total:2230233,put.size:19,state.flags:6} I 10:02:17.368983 running: list[o7LSF4Vwb] I 10:02:37.368916 {lst.n:1,put.n:1,put.ns.total:2230233,put.size:19,state.flags:6} I 10:09:17.368189 {get.bps:1,get.n:1,get.ns:196804,get.ns.total:196804,get.redir.ns:25975620,get.size:19,kalive.ns:1456651,lst.n:1,put.n:1,put.ns.total:2230233,put.size:19,state.flags:6} I 10:10:17.367824 {get.n:1,get.ns.total:196804,get.size:19,lst.n:1,put.n:1,put.ns.total:2230233,put.size:19,state.flags:6,disk.nvme0n1.read.bps:3075277,disk.nvme0n1.util:11,disk.nvme0n1.avg.rsize:94624,disk.nvme0n1.avg.wsize:9216,disk.nvme0n1.write.bps:269107} I 10:10:17.367831 nvme0n1: 3MiB/s, 92KiB, 263KiB/s, 9KiB, 11% I 10:11:17.368488 
{get.n:1,get.ns.total:196804,get.size:19,lst.n:1,put.n:1,put.ns.total:2230233,put.size:19,state.flags:6} I 10:14:27.368086 {get.bps:1,get.n:2,get.ns:86868,get.ns.total:283672,get.redir.ns:524533,get.size:38,kalive.ns:1648016,lst.n:1,put.n:1,put.ns.total:2230233,put.size:19,state.flags:6} I 10:14:37.368612 {get.n:2,get.ns.total:283672,get.size:38,lst.n:1,put.n:1,put.ns.total:2230233,put.size:19,state.flags:6} W 10:23:09.873462 tgtcp:947 t[yFtt8081]: local BMD v2 > v0 aism[list] E 10:23:09.873596 empty project ID: cannot list GCP buckets with no authentication: GET /v1/buckets (called by p[vjhp8080]) (t[yFtt8081]: htrun.go:1350 <- tgtbck.go:200 <- tgtbck.go:71 <- target.go:586]) W 10:23:25.271357 tgtcp:947 t[yFtt8081]: local BMD v2 > v0 aism[list] W 10:23:29.158033 tgtcp:947 t[yFtt8081]: local BMD v2 > v0 aism[list] E 10:23:29.158126 empty project ID: cannot list GCP buckets with no authentication: GET /v1/buckets (called by p[vjhp8080]) (t[yFtt8081]: htrun.go:1350 <- tgtbck.go:200 <- tgtbck.go:71 <- target.go:586]) W 10:23:43.042520 tgtcp:947 t[yFtt8081]: local BMD v2 > v0 aism[list] I 10:26:27.368545 /tmp/ais/mp4/1: used 31%, avail 401.31GiB
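
A note on syntax that may matter here: a bucket that lives on an attached remote cluster is normally addressed through the local cluster's @<alias-or-UUID> namespace, rather than by pointing AIS_ENDPOINT at the remote cluster directly. A sketch, assuming the test2 alias and the ais://test bucket shown above:

# from the SMT machine, list and read the remote bucket through the attachment
ais ls ais://@test2/test
ais object get ais://@test2/test/README.md README.md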

liuzhiyuan562 commented 1 month ago

For ais show cluster on both clusters:

Image

liuzhiyuan562 commented 1 month ago

You can control my computer via SunloginClient if you need to.

gaikwadabhishek commented 1 month ago

Hey @liuzhiyuan562, I am not sure what you are doing. In the first screenshot, you ran make kill, which terminates the cluster, before you did the curl.

Have you tested the communication between the two nodes? Host a plain HTTP webserver and try to curl it from the other machine. I think there is a networking problem.

Unfortunately, we don't debug local deployments. We are not able to triage the issue based on the information that has been provided so far.

I would also recommend that, if you have multiple machines, you set up Kubernetes across them and run the k8s deployment of AIStore. It includes ais-operator, which controls the lifecycle of AIS and related operations.

liuzhiyuan562 commented 1 month ago

Thank you! What I mean is: why does curling the AIStore on another machine fail when I shut down the AIStore on my local machine? The second question is: when both machines are running AIStore and are remote-attached to each other, why can't they see each other's buckets as online? I will also try k8s. Thank you very much.

gaikwadabhishek commented 1 month ago

The second question is: when both machines are running AIStore and are remote-attached to each other, why can't they see each other's buckets as online?

I suspect network problems. Try the following:

# on first machine
python -m http.server 8080
# on second machine
curl <ip-addr-of-machine-1>:8080

Check if this works. I think there is a firewall in your local network that is blocking the connection.
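
If the plain webserver test passes but the clusters still cannot see each other, a similar check directly against the AIS endpoints can narrow it down. A sketch, using the placeholder style above and AIS's /v1/health endpoint:

# from machine 2, hit the AIS proxy on machine 1
curl -v http://<ip-addr-of-machine-1>:8080/v1/health
# and the target's public port (8081 in this local playground)
curl -v http://<ip-addr-of-machine-1>:8081/v1/health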

liuzhiyuan562 commented 1 month ago

Image

When I shut down AIStore on one machine, run the Python HTTP server, and then run curl from the other machine, it succeeds.

liuzhiyuan562 commented 1 month ago

Image

Then I started AIStore again and checked the other machine's AIStore bucket; it was still shown as offline.

alex-aizman commented 1 month ago

from the log:

given multiple choice, selecting the first IP: 127.0.0.1 (MTU 65536)
I 11:48:39.117027 htrun:355 PUBLIC (user) access: {127.0.0.1 8081 http://127.0.0.1:8081/ 127.0.0.1:8081}

Not an issue. Closing.
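
In other words, the node advertises 127.0.0.1 as its public address, so it is reachable only from its own machine; that would also explain why the earlier PUT against the remote endpoint was answered by the local target t[olSt8081]. One way to see this from the other machine is to run the curl without -s and -L and inspect the redirect (a sketch, endpoint taken from the earlier curl):

# the proxy redirects object requests to a target; check the Location header
curl -i 'http://192.168.1.125:8080/v1/objects/test/README.md'
# if Location points at 127.0.0.1:8081, the target is unreachable from other machines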

liuzhiyuan562 commented 1 month ago

But why is the PRESENT status "no"?

liuzhiyuan562 commented 1 month ago

Image

The issue is here: why is this not an issue?