voedger / kb

Knowledge base

howto: Gluster #43

Open maxim-ge opened 5 months ago

maxim-ge commented 5 months ago

Install GlusterFS

Installing GlusterFS involves setting up multiple nodes to work together as a distributed file system. Below are the general steps to install and configure GlusterFS on a few nodes. This guide assumes you are using a Linux-based operating system.

1. Pre-Installation Setup

Before installation, make sure to perform the following steps on all nodes:

  1. Set Hostnames:

    • Assign a unique hostname to each node and ensure that the hostnames are resolvable. You can set hostnames using the hostnamectl set-hostname your-hostname command.
    • Update /etc/hosts file on each node for proper name resolution.
  2. Configure Networking:

    • Ensure that all nodes can communicate with each other over the network.
    • Configure your firewall and security settings to allow GlusterFS ports. GlusterFS requires ports 24007/tcp and 24008/tcp for the management daemon, and other ports in the range of 49152-49251/tcp for each brick.
  3. Update System and Install Required Packages:

    sudo apt-get update && sudo apt-get upgrade -y  # For Debian/Ubuntu
    sudo yum update -y                              # For RHEL/CentOS
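
The networking steps above can be sketched as follows. The script only prints what would be applied, so it can be reviewed before running; the 192.168.1.x addresses, node names, and the use of ufw are assumptions — adapt them to your network and firewall:

```shell
#!/bin/sh
# Dry-run sketch of the name-resolution and firewall steps.
# Hostnames and addresses below are placeholders.

# Entries to append to /etc/hosts on every node:
print_hosts_entries() {
    cat <<'EOF'
192.168.1.11 node1
192.168.1.12 node2
192.168.1.13 node3
EOF
}

# Firewall rules for the management daemon and the brick port range:
print_firewall_cmds() {
    echo "sudo ufw allow 24007:24008/tcp"
    echo "sudo ufw allow 49152:49251/tcp"
}

print_hosts_entries
print_firewall_cmds
```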

2. Install GlusterFS

Perform the following steps on all nodes:

  1. Add GlusterFS Repository (For Ubuntu/Debian systems, adapt accordingly for RHEL/CentOS):

    sudo add-apt-repository ppa:gluster/glusterfs-x.y  # x.y is the version number
  2. Install GlusterFS Server:

    sudo apt-get update
    sudo apt-get install -y glusterfs-server
  3. Start and Enable GlusterFS Service:

    sudo systemctl start glusterd
    sudo systemctl enable glusterd

3. Configuring GlusterFS

  1. Peer Probe:

    • From one of the nodes (let's call it the primary node), probe the other nodes to add them to the trusted storage pool.
      sudo gluster peer probe node2
      sudo gluster peer probe node3
      # and so on for other nodes...
  2. Verify Peers:

    • Check the status of the peer nodes.
      sudo gluster peer status
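
With more than a couple of peers, the probe step is easy to script. This sketch only prints the commands for review rather than running them (node2 and node3 are the placeholder hostnames used above):

```shell
#!/bin/sh
# Print one "peer probe" command per node to add to the trusted pool.
# Run the printed commands from the primary node; hostnames are placeholders.
print_probe_cmds() {
    for n in node2 node3; do
        echo "sudo gluster peer probe $n"
    done
}

print_probe_cmds
```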

4. Create a GlusterFS Volume

  1. Create a Directory on All Nodes:

    • This directory will be used as a brick for the GlusterFS volume.
      sudo mkdir -p /data/brick1
  2. Create the Volume:

    • Create a GlusterFS volume from the primary node. The following example creates a replicated volume:
      sudo gluster volume create test-volume replica 3 node1:/data/brick1 node2:/data/brick1 node3:/data/brick1
  3. Start the Volume:

    sudo gluster volume start test-volume
  4. Check Volume Info:

    • Verify the volume information and status.
      sudo gluster volume info
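
Because the brick list in `volume create` grows with the node count, it can help to assemble the command from a node list. This sketch prints the resulting command instead of executing it (the volume name, hostnames, and brick path are the examples used above):

```shell
#!/bin/sh
# Assemble the replicated-volume create command from a node list, so adding
# or renaming nodes is a one-line change. All names are placeholders.
build_create_cmd() {
    nodes="node1 node2 node3"
    cmd="sudo gluster volume create test-volume replica 3"
    for n in $nodes; do
        cmd="$cmd $n:/data/brick1"
    done
    echo "$cmd"
}

build_create_cmd   # print the command for review; run it on the primary node
```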

5. Client Setup

  1. Install GlusterFS Client:

    • Install the client package on each node that needs to access the GlusterFS volume.
      sudo apt-get install -y glusterfs-client
  2. Mount the GlusterFS Volume:

    • Create a mount point and mount the GlusterFS volume.
      sudo mkdir -p /mnt/glusterfs
      sudo mount -t glusterfs node1:/test-volume /mnt/glusterfs
  3. Automount on Boot:

    • Optionally, add the volume to /etc/fstab for automatic mounting on boot.
      node1:/test-volume /mnt/glusterfs glusterfs defaults,_netdev 0 0
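
One caveat with the fstab entry above: if node1 is unreachable at boot, the mount fails even though the volume is still served by the other nodes. Recent GlusterFS releases accept fallback volfile servers via a mount option (the option name has varied across versions; `backup-volfile-servers` is the newer spelling):

```
# /etc/fstab -- fall back to node2/node3 for the volfile if node1 is down
node1:/test-volume /mnt/glusterfs glusterfs defaults,_netdev,backup-volfile-servers=node2:node3 0 0
```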

Remember to replace node1, node2, node3, etc., with the actual hostnames or IP addresses of your nodes, and /data/brick1, /mnt/glusterfs with your actual directory paths.

After these steps, your GlusterFS distributed file system should be up and running. Make sure to test it properly and also consider setting up backup and monitoring as per your operational requirements.

maxim-ge commented 5 months ago

Replace a failed node

Replacing a failed node in a GlusterFS cluster involves several steps to ensure that the data is rebalanced and replicated correctly across the remaining nodes. Here's how you can replace a failed node:

1. Remove the Failed Node

First, if the failed node is still part of the cluster but unresponsive, you need to detach it:

  1. Stop GlusterFS Service on Failed Node (if possible):

    sudo systemctl stop glusterd
  2. Remove the Peer from the Cluster:

    • On a healthy node, remove the failed node from the cluster:
      sudo gluster peer detach failed_node_hostname

      Replace failed_node_hostname with the actual hostname or IP address of the failed node.

2. Set Up the New Node

Prepare the new node (replacement node) to be added to the cluster:

  1. Install GlusterFS:

    • Follow the same installation steps used for the other nodes in the cluster (installing GlusterFS, starting the service, etc.).
  2. Configure Networking:

    • Ensure that the new node can communicate with all other nodes in the cluster over the required ports.

3. Add the New Node to the Cluster

  1. Add the New Node:

    • From one of the existing nodes in the cluster, probe the new node to add it to the cluster:
      sudo gluster peer probe new_node_hostname

      Replace new_node_hostname with the actual hostname or IP address of the new node.

  2. Verify Peers:

    • Check the status of the peer nodes to ensure the new node is successfully added:
      sudo gluster peer status

4. Replace Bricks and Rebalance

For each volume that the failed node was a part of, you need to replace the failed bricks (from the failed node) with new bricks on the new node and then rebalance the volume.

  1. Create New Bricks on the New Node:

    • Create the necessary directories on the new node for the bricks.
  2. Replace Bricks:

    • For each brick on the failed node, use the gluster volume replace-brick command to replace it with a new brick on the new node:
      sudo gluster volume replace-brick vol_name failed_node:/old_brick_path new_node:/new_brick_path commit force

      Replace vol_name, failed_node, old_brick_path, new_node, and new_brick_path with the appropriate volume name, hostnames, and brick paths.

  3. Start Rebalancing:

    • After replacing all the bricks, start the rebalance process:
      sudo gluster volume rebalance vol_name start

      Monitor the rebalance process with:

      sudo gluster volume rebalance vol_name status
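
When the failed node held several bricks (or bricks in several volumes), the replace-and-rebalance step is repetitive. This sketch prints one replace-brick command per brick, plus the rebalance commands, so the whole sequence can be reviewed before running it (test-volume, failed-node, new-node, and the brick path are placeholders):

```shell
#!/bin/sh
# Print the replace-brick command for each brick the failed node held,
# followed by the rebalance commands. All names are placeholders.
print_replace_cmds() {
    vol="test-volume"
    for brick in /data/brick1; do
        echo "sudo gluster volume replace-brick $vol failed-node:$brick new-node:$brick commit force"
    done
    echo "sudo gluster volume rebalance $vol start"
    echo "sudo gluster volume rebalance $vol status"
}

print_replace_cmds
```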

5. Verify the Operation

  1. Check Volume Info:

    • After the rebalance operation, check the volume information:
      sudo gluster volume info vol_name
  2. Test the New Configuration:

    • Ensure that the data is accessible and that the new node is functioning correctly within the cluster.

Remember, it's crucial to monitor the cluster's health and the rebalance operation closely. If you have many volumes or a large amount of data, these operations can take a significant amount of time and might impact the performance of your GlusterFS cluster temporarily.

maxim-ge commented 5 months ago

Distribute, Replicate, and Distributed-Replicate

GlusterFS allows you to configure your storage volumes in several ways to meet different requirements for redundancy, performance, and storage capacity. The terms Distribute, Replicate, and Distributed-Replicate refer to different types of volume configurations in GlusterFS:

1. Distribute (Distributed Volumes)

Distributed volumes maximize capacity and improve performance through parallelism: each file is stored whole on one of the bricks in the volume. They provide no redundancy, so if a brick fails, the files on that brick become unavailable.

2. Replicate (Replicated Volumes)

Replicated volumes provide redundancy and high availability. In this configuration, the same data is copied to every brick in the volume, so the usable capacity equals that of a single brick.

3. Distributed-Replicate (Distributed Replicated Volumes)

Distributed replicated volumes combine the features of both distributed and replicated volumes. They provide both data distribution across nodes (for performance) and data replication (for redundancy).
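
The three layouts differ only in the `volume create` invocation. This sketch prints an example of each so the brick arithmetic is visible (volume names and hosts are placeholders; run the chosen command with sudo on the primary node):

```shell
#!/bin/sh
# Print an example create command for each volume layout. Placeholders only.
print_layout_cmds() {
    # Distributed: files spread across bricks, no redundancy.
    echo "gluster volume create dist-vol node1:/data/brick1 node2:/data/brick1"
    # Replicated: every file stored on all three bricks.
    echo "gluster volume create repl-vol replica 3 node1:/data/brick1 node2:/data/brick1 node3:/data/brick1"
    # Distributed-replicated: four bricks form two replica-2 pairs,
    # and files are distributed across the pairs.
    echo "gluster volume create distrepl-vol replica 2 node1:/data/brick1 node2:/data/brick1 node3:/data/brick1 node4:/data/brick1"
}

print_layout_cmds
```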

Key Points to Remember

When designing your GlusterFS storage, consider your specific needs for redundancy, performance, and storage capacity, and choose the configuration that best meets those needs.

maxim-ge commented 5 months ago

Links