gluster / glusterd2

[DEPRECATED] Glusterd2 is the distributed management framework to be used for GlusterFS.
GNU General Public License v2.0

Brick process didn't come up on the node post gluster node reboot #1492

Open PrasadDesala opened 5 years ago

PrasadDesala commented 5 years ago

Observed behavior

With a single PVC (brick-mux not enabled), reboot gluster-node-1. After the reboot, the brick process on gluster-node-1 is not running.

[root@gluster-kube1-0 /]# glustercli volume status
Volume : pvc-e78ffdcf-1ee1-11e9-b021-525400e2de2d
+--------------------------------------+-------------------------------+-----------------------------------------------------------------------------------------+--------+-------+------+
|               BRICK ID               |             HOST              |                                          PATH                                           | ONLINE | PORT  | PID  |
+--------------------------------------+-------------------------------+-----------------------------------------------------------------------------------------+--------+-------+------+
| 9b0246ac-274e-4ed4-822d-bdaf53f91f93 | gluster-kube3-0.glusterd2.gcs | /var/run/glusterd2/bricks/pvc-e78ffdcf-1ee1-11e9-b021-525400e2de2d/subvol1/brick1/brick | true   | 32844 | 5199 |
| 4a891bc0-d7f9-4ba7-a1b0-4d18fee9d6d5 | gluster-kube2-0.glusterd2.gcs | /var/run/glusterd2/bricks/pvc-e78ffdcf-1ee1-11e9-b021-525400e2de2d/subvol1/brick2/brick | true   | 45057 | 1848 |
| 9b6f1829-76ad-43f0-9f86-2db80ae5b367 | gluster-kube1-0.glusterd2.gcs | /var/run/glusterd2/bricks/pvc-e78ffdcf-1ee1-11e9-b021-525400e2de2d/subvol1/brick3/brick | false  |     0 |    0 |
+--------------------------------------+-------------------------------+-----------------------------------------------------------------------------------------+--------+-------+------+
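For anyone scripting this check, the table above can be scanned for bricks whose ONLINE column is not `true`. This is only a minimal sketch: it assumes the ASCII table layout shown in this report, and the helper names `parse_status_table` and `offline_bricks` are hypothetical, not part of glusterd2 or glustercli.

```python
# Sketch only: parses the ASCII table layout shown in this issue.
# The function names below are hypothetical helpers, not a glusterd2 API.

SAMPLE = """\
Volume : pvc-e78ffdcf-1ee1-11e9-b021-525400e2de2d
+----------+------+------+--------+------+-----+
| BRICK ID | HOST | PATH | ONLINE | PORT | PID |
+----------+------+------+--------+------+-----+
| 9b0246ac-274e-4ed4-822d-bdaf53f91f93 | gluster-kube3-0.glusterd2.gcs | /var/run/glusterd2/bricks/pvc-e78ffdcf-1ee1-11e9-b021-525400e2de2d/subvol1/brick1/brick | true | 32844 | 5199 |
| 4a891bc0-d7f9-4ba7-a1b0-4d18fee9d6d5 | gluster-kube2-0.glusterd2.gcs | /var/run/glusterd2/bricks/pvc-e78ffdcf-1ee1-11e9-b021-525400e2de2d/subvol1/brick2/brick | true | 45057 | 1848 |
| 9b6f1829-76ad-43f0-9f86-2db80ae5b367 | gluster-kube1-0.glusterd2.gcs | /var/run/glusterd2/bricks/pvc-e78ffdcf-1ee1-11e9-b021-525400e2de2d/subvol1/brick3/brick | false | 0 | 0 |
+----------+------+------+--------+------+-----+
"""


def parse_status_table(text):
    """Turn the '|'-delimited rows of the table into a list of dicts."""
    header = None
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip '+---+' borders, the 'Volume :' line, shell prompts
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            header = cells  # first '|' row holds the column names
        else:
            rows.append(dict(zip(header, cells)))
    return rows


def offline_bricks(rows):
    """Return the rows whose ONLINE column is anything other than 'true'."""
    return [row for row in rows if row.get("ONLINE", "").lower() != "true"]


if __name__ == "__main__":
    for brick in offline_bricks(parse_status_table(SAMPLE)):
        print(brick["HOST"], brick["PATH"])
```

On the output in this report, the only row returned is the brick3 entry on gluster-kube1-0, which also shows port 0 and PID 0.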

The following messages appear continuously in the glusterd2 logs:

time="2019-01-23 07:51:42.185010" level=info msg="client disconnected" address="10.233.65.5:1017" server=sunrpc source="[server.go:109:sunrpc.(*SunRPC).pruneConn]"
time="2019-01-23 07:51:42.356707" level=info msg="client connected" address="10.233.66.7:1004" server=sunrpc source="[server.go:148:sunrpc.(*SunRPC).acceptLoop]" transport=tcp
time="2019-01-23 07:51:42.358309" level=error msg="registry.SearchByBrickPath() failed for brick" brick=/var/run/glusterd2/bricks/pvc-e78ffdcf-1ee1-11e9-b021-525400e2de2d/subvol1/brick3/brick error="SearchByBrickPath: port for brick /var/run/glusterd2/bricks/pvc-e78ffdcf-1ee1-11e9-b021-525400e2de2d/subvol1/brick3/brick not found" source="[rpc_prog.go:104:pmap.(*GfPortmap).PortByBrick]"
time="2019-01-23 07:51:42.359247" level=info msg="client disconnected" address="10.233.66.7:1004" server=sunrpc source="[server.go:109:sunrpc.(*SunRPC).pruneConn]"
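To see which brick paths keep failing the portmap lookup, the `key=value` log lines above can be split into fields and filtered. The sketch below assumes the log format shown here (bare or double-quoted values); `parse_fields` and `failed_brick_lookups` are hypothetical helper names, and escape sequences inside quoted values are not decoded.

```python
import re
from collections import Counter

# Matches key=value pairs where the value is double-quoted or a bare token.
FIELD_RE = re.compile(r'(\w+)=("(?:[^"\\]|\\.)*"|\S+)')

# Two lines copied from the log excerpt in this report.
SAMPLE_LINES = [
    'time="2019-01-23 07:51:42.356707" level=info msg="client connected" address="10.233.66.7:1004" server=sunrpc source="[server.go:148:sunrpc.(*SunRPC).acceptLoop]" transport=tcp',
    'time="2019-01-23 07:51:42.358309" level=error msg="registry.SearchByBrickPath() failed for brick" brick=/var/run/glusterd2/bricks/pvc-e78ffdcf-1ee1-11e9-b021-525400e2de2d/subvol1/brick3/brick error="SearchByBrickPath: port for brick /var/run/glusterd2/bricks/pvc-e78ffdcf-1ee1-11e9-b021-525400e2de2d/subvol1/brick3/brick not found" source="[rpc_prog.go:104:pmap.(*GfPortmap).PortByBrick]"',
]


def parse_fields(line):
    """Split one log line into a dict of its key=value fields."""
    fields = {}
    for key, value in FIELD_RE.findall(line):
        if value.startswith('"') and value.endswith('"') and len(value) >= 2:
            value = value[1:-1]  # drop surrounding quotes; escapes left as-is
        fields[key] = value
    return fields


def failed_brick_lookups(lines):
    """Count error-level SearchByBrickPath failures per brick path."""
    counts = Counter()
    for line in lines:
        fields = parse_fields(line)
        if fields.get("level") == "error" and "SearchByBrickPath" in fields.get("msg", ""):
            counts[fields.get("brick", "<unknown>")] += 1
    return counts


if __name__ == "__main__":
    for brick, count in failed_brick_lookups(SAMPLE_LINES).items():
        print(count, brick)
```

Run against a full log file, a steadily growing count for one brick path would match the "continuously seen" behavior described here.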
Peer status from the same node:

[root@gluster-kube1-0 /]# glustercli peer status
+--------------------------------------+-----------------+-------------------------------------+-------------------------------------+--------+------+
|                  ID                  |      NAME       |          CLIENT ADDRESSES           |           PEER ADDRESSES            | ONLINE | PID  |
+--------------------------------------+-----------------+-------------------------------------+-------------------------------------+--------+------+
| 07615b77-0be0-4cf7-bfb2-448968404891 | gluster-kube3-0 | gluster-kube3-0.glusterd2.gcs:24007 | gluster-kube3-0.glusterd2.gcs:24008 | yes    | 5092 |
| 5116ce37-c13d-48af-bb32-33f64aa1858d | gluster-kube2-0 | gluster-kube2-0.glusterd2.gcs:24007 | gluster-kube2-0.glusterd2.gcs:24008 | yes    | 1689 |
| e7733b6e-8cb1-41d8-be45-bd639016be06 | gluster-kube1-0 | gluster-kube1-0.glusterd2.gcs:24007 | gluster-kube1-0.glusterd2.gcs:24008 | yes    |   32 |
+--------------------------------------+-----------------+-------------------------------------+-------------------------------------+--------+------+

Expected/desired behavior

The brick process should be running on the node after the gluster node reboots.

Details on how to reproduce (minimal and precise)

1) Create a 3-node GCS setup using vagrant.
2) Create a PVC (brick-mux is not enabled).
3) Reboot gluster-node-1 and check glustercli volume status on the other gluster nodes.
4) "systemctl enable glusterd2.service" was set on the gluster node, but for some reason the glusterd2 process did not come up automatically, so reboot the node again.
5) This time the glusterd2 service starts automatically. Check glustercli volume status.
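The final check in these steps can be automated by polling after the reboot instead of checking once. This is a rough sketch under stated assumptions: `wait_until` is a hypothetical generic helper, and `no_brick_offline` is a deliberately crude check that runs the `glustercli volume status` command from this report and looks for a `false` cell in the table layout shown above.

```python
import subprocess
import time


def wait_until(check, timeout=120.0, interval=5.0,
               sleep=time.sleep, clock=time.monotonic):
    """Call check() repeatedly until it returns True or the timeout expires."""
    deadline = clock() + timeout
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


def no_brick_offline():
    """Crude check (assumption): no table cell in the status output reads 'false'.

    Requires glustercli on PATH; relies on the table layout shown in this issue.
    """
    result = subprocess.run(
        ["glustercli", "volume", "status"],
        capture_output=True, text=True, check=False,
    )
    if result.returncode != 0:
        return False  # glusterd2 may not be up yet after the reboot
    cells = [c.strip() for line in result.stdout.splitlines()
             if line.startswith("|") for c in line.split("|")]
    return "false" not in cells


if __name__ == "__main__":
    ok = wait_until(no_brick_offline, timeout=300.0, interval=10.0)
    print("all bricks online" if ok else "a brick is still offline")
```

The timeout and interval values are arbitrary placeholders. The `sleep` and `clock` parameters exist only so the loop can be exercised without real delays.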

Information about the environment:

oshankkumar commented 5 years ago

Please use triple backticks (```) when pasting CLI output.

atinmu commented 5 years ago

Have we made progress on this? It is currently marked as a GCS 1.0 blocker.