canonical / microceph

MicroCeph is snap-deployed Ceph with built-in clustering
https://snapcraft.io/microceph
GNU Affero General Public License v3.0

Increased disk space on my primary system disk on a node, and now it looks like the snap services cannot start #281

Open djcollis opened 9 months ago

djcollis commented 9 months ago

I have a MicroCeph question. I set up a 3-node cluster and got them all linked up running microk8s and MicroCeph. I then started adding all my deployments and began to get a health warning about low drive space. After some investigation, it turned out the primary disk that runs kube and the system stuff was low. I doubled the size of this disk (using my supplier's dashboard) and rebooted the node. Now the space is there and the warning has gone. Microk8s is healthy and sees all nodes up. I can see all the nodes in microceph status, however the node I added the disk space on is now out of quorum. I tried running 'sudo systemctl restart ceph-mon@' but I got a unit not found error.

(screenshot attached)

snap service status:

snap.microceph.mon.service - Service for snap application microceph.mon
     Loaded: loaded (/etc/systemd/system/snap.microceph.mon.service; enabled; vendor preset: enabled)
     Active: failed (Result: exit-code) since Wed 2024-01-03 13:00:24 UTC; 8min ago
    Process: 1457999 ExecStart=/usr/bin/snap run microceph.mon (code=exited, status=1/FAILURE)
   Main PID: 1457999 (code=exited, status=1/FAILURE)
        CPU: 53ms

Jan 03 13:00:24 milan7763-14616 systemd[1]: snap.microceph.mon.service: Scheduled restart job, restart counter is at 5.
Jan 03 13:00:24 milan7763-14616 systemd[1]: Stopped Service for snap application microceph.mon.
Jan 03 13:00:24 milan7763-14616 systemd[1]: snap.microceph.mon.service: Start request repeated too quickly.
Jan 03 13:00:24 milan7763-14616 systemd[1]: snap.microceph.mon.service: Failed with result 'exit-code'.
Jan 03 13:00:24 milan7763-14616 systemd[1]: Failed to start Service for snap application microceph.mon.

Attached logs: journalctl -u snap microceph logs.txt, microceph common logs.txt

Could this be a networking issue to the primary node?
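
For anyone hitting the same symptom: the snap-managed services are not the classic ceph-mon@ units, so the usual systemctl names will not exist. A minimal sketch of inspecting and restarting them with standard snapd/systemd tooling (unit and service names taken from the status output above):

snap services microceph
sudo snap restart microceph.mon
journalctl -u snap.microceph.mon.service --since "1 hour ago"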

djcollis commented 9 months ago

I did a check on the config and the microk8s nodes, and the IPs look correct, so it looks like the nodes can see each other:

root@milan7763-14616:/var/snap/microceph/common/logs# sudo microceph cluster config list
+---+-----------------------------+-------------+
| # | KEY                         | VALUE       |
+---+-----------------------------+-------------+
| 0 | cluster_network             | 10.5.0.0/16 |
+---+-----------------------------+-------------+
| 1 | osd_pool_default_crush_rule | 2           |
+---+-----------------------------+-------------+
root@milan7763-14616:/var/snap/microceph/common/logs# microk8s status
microk8s is running
high-availability: yes
  datastore master nodes: 10.5.0.3:19001 10.5.0.1:19001 10.5.0.2:19001
  datastore standby nodes: none

I can ping all of the node IPs from the bad node.

UtkarshBhatthere commented 9 months ago
Jan 03 13:00:24 milan7763-14616 microceph.mon[1457946]: 2024-01-03T13:00:24.228+0000 7f2f46afcd40 -1 monitor data directory at '/var/lib/ceph/mon/ceph-milan7763-14616' does not exist: have you run 'mkfs'?

I was suspecting this, and I reckon a snap refresh would do the trick. Could you please try (assuming you are on the quincy track) sudo snap refresh microceph --channel quincy/edge and confirm whether this fixes the quorum issue for your cluster?
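
As a quick sanity check before refreshing, plain snapd commands (generic snap tooling, not specific to MicroCeph) can confirm which track and channel each node is currently following:

snap list microceph
snap info microceph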

djcollis commented 9 months ago

Hi there Utkarsh, it looks like the same issue here after the restart.

(screenshot attached)

That file is still missing in action.

djcollis commented 8 months ago

Hi there, any news on this? I am happy to provide more data or do a screen share if you would like. Let me know how I can help.

djcollis commented 8 months ago

Update: I have updated the cluster to the reef/stable release. I also added another node. It looks connected, but I cannot add any disks on it. I captured some logs from the mon service: https://pastebin.com/fgB3sxuC

Here are the errors from trying to add the new node's disk and from fetching the config:

microceph cluster config list
Error: failed to fetch cluster config: Failed to run: ceph config dump -f json-pretty: exit status 1 (Error initializing cluster client: ObjectNotFound('RADOS object not found (error calling conf_read_file)')), Key:

root@milan7763-15006:~# sudo microceph disk add /dev/vda --wipe
Error: failed adding new disk: Failed to generate OSD keyring: Failed to run: ceph auth get-or-create osd.4 mgr allow profile osd mon allow profile osd osd allow * -o /var/snap/microceph/common/data/osd/ceph-4/keyring: exit status 1 (Error initializing cluster client: ObjectNotFound('RADOS object not found (error calling conf_read_file)'))

On this node, however, if I run microceph status I do get all 4 nodes (just without the disk on the new one):

MicroCeph deployment summary:

Milan7763-14616 (102.214.11.189)
  Services: mds, mgr, mon, osd
  Disks: 1
Milan7763-14619 (102.214.11.226)
  Services: mds, mgr, mon, osd
  Disks: 1
Milan7763-14623 (102.214.10.37)
  Services: mds, mgr, mon, osd
  Disks: 1
milan7763-15006 (102.214.10.187)
  Services:
  Disks: 0

Then on the existing nodes, when I try to fetch the config I get this:

microceph cluster config list
Error: failed to fetch cluster config: Get "http://control.socket/1.0/configs": context deadline exceeded, Key:

This is consistent across all 3 of the existing nodes.

sudo microceph.ceph -s:

New node:

Error initializing cluster client: ObjectNotFound('RADOS object not found (error calling conf_read_file)')

This must be related to the node not finding the cluster config.

Existing nodes are timing out:

2024-01-17T07:57:54.694+0000 7fc3b090e640 0 monclient(hunting): authenticate timed out after 300

UtkarshBhatthere commented 8 months ago

Aah, I get the confusion around cluster configs now.

ObjectNotFound('RADOS object not found (error calling conf_read_file)')

This error implies that the Ceph cluster client is unable to find the ceph.conf file.

Can you share the output of the following command? (It SHOULD contain the following files.)

# ls /var/snap/microceph/current/conf
ceph.client.admin.keyring  ceph.conf  ceph.keyring  metadata.yaml
djcollis commented 8 months ago

OK, no problem. It looks like we are missing the admin keyring:

ceph.conf ceph.keyring metadata.yaml

(screenshot attached)

The new node only has:

metadata.yaml

I had a look at the ceph.conf files on the existing nodes, and all the node public IPs are correct and I can ping them from all the nodes. I did create an internal network with local IPs to use. I see this is not being used here, which is not a problem, but when I set up the 3 nodes originally, using the internal IPs in the Ceph cluster config seemed to be the best way to get the cluster to spin up.

UtkarshBhatthere commented 8 months ago

Hmm, MicroCeph uses the host's network subnet as the default public network (it can be configured using the --public-network flag at cluster bootstrap). Ceph services NEED access to the configured public network; hence, the fourth node should also be attached to the same network.
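
As a sketch of the point above, this is roughly what an explicit bootstrap looks like on the first node of a fresh cluster. The flag is the one mentioned here; the 10.5.0.0/16 subnet is only an assumption based on the internal network described earlier in the thread:

# run on the first node of a new cluster; later nodes must be able
# to reach this subnet before they join
sudo microceph cluster bootstrap --public-network 10.5.0.0/16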

For the missing admin keyring, you can manually create it using the ceph.keyring file; here is mine for reference:

root@ubuntu-ceph-VLWH-MQIA:/var/snap/microceph/current/conf# cat ceph.keyring 
# Generated by MicroCeph, DO NOT EDIT.
[client.admin]
    key = AQBbNaZl9ikmGxAAFqCXe0nsXMUotJT/vNOt7Q==
root@ubuntu-ceph-VLWH-MQIA:/var/snap/microceph/current/conf# cat ceph.client.admin.keyring 
[client.admin]
    key = AQBbNaZl9ikmGxAAFqCXe0nsXMUotJT/vNOt7Q==
    caps mds = "allow *"
    caps mgr = "allow *"
    caps mon = "allow *"
    caps osd = "allow *"
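
As a minimal sketch of the above (paths and caps copied from the reference files; the awk extraction of the key is an assumption, so verify the resulting file by eye), the admin keyring could be regenerated from ceph.keyring like this:

# pull the key out of the MicroCeph-generated ceph.keyring
KEY=$(awk '/key = / {print $3}' /var/snap/microceph/current/conf/ceph.keyring)
# write ceph.client.admin.keyring alongside it with full admin caps
cat > /var/snap/microceph/current/conf/ceph.client.admin.keyring <<EOF
[client.admin]
    key = ${KEY}
    caps mds = "allow *"
    caps mgr = "allow *"
    caps mon = "allow *"
    caps osd = "allow *"
EOF
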
UtkarshBhatthere commented 8 months ago

You may also query the MicroCeph internal db for the same:

sudo microceph cluster sql "select * from config"

This will provide you with configs like the public_network, admin keyrings, etc.
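
A couple of narrower queries against the same table can also be handy, for example to pull out just the admin key or the fsid (the key names here are assumptions based on the config table contents shown further down in this thread):

sudo microceph cluster sql "select value from config where key = 'keyring.client.admin'"
sudo microceph cluster sql "select value from config where key = 'fsid'"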

djcollis commented 8 months ago

OK, thanks. I have added the missing files on each node, including the new node. If I try to run ceph -s I get the following errors:

root@milan7763-14619:/var/snap/microceph/current/conf# ceph -s
2024-01-17T09:02:20.379+0000 7fd5711df640 -1 auth: error parsing file /etc/ceph/ceph.client.admin.keyring: error setting modifier for [client.admin] type=key val=key = AQBYdHVlzubcLhAAF4iHqrJGWc4/1znIaARBXw==: Malformed input
2024-01-17T09:02:20.379+0000 7fd5711df640 -1 auth: failed to load /etc/ceph/ceph.client.admin.keyring: (5) Input/output error
2024-01-17T09:02:20.379+0000 7fd5711df640 -1 auth: error parsing file /etc/ceph/ceph.client.admin.keyring: error setting modifier for [client.admin] type=key val=key = AQBYdHVlzubcLhAAF4iHqrJGWc4/1znIaARBXw==: Malformed input
2024-01-17T09:02:20.379+0000 7fd5711df640 -1 auth: failed to load /etc/ceph/ceph.client.admin.keyring: (5) Input/output error
2024-01-17T09:02:20.379+0000 7fd5711df640 -1 auth: error parsing file /etc/ceph/ceph.client.admin.keyring: error setting modifier for [client.admin] type=key val=key = AQBYdHVlzubcLhAAF4iHqrJGWc4/1znIaARBXw==: Malformed input
2024-01-17T09:02:20.379+0000 7fd5711df640 -1 auth: failed to load /etc/ceph/ceph.client.admin.keyring: (5) Input/output error
2024-01-17T09:02:20.379+0000 7fd5711df640 -1 monclient: keyring not found
[errno 5] RADOS I/O error (error connecting to the cluster)
root@milan7763-14619:/var/snap/microceph/current/conf# ls
ceph.client.admin.keyring  ceph.conf  ceph.keyring  metadata.yaml
root@milan7763-14619:/var/snap/microceph/current/conf# cat ceph.client.admin.keyring
[client.admin]
    key = key = AQBYdHVlzubcLhAAF4iHqrJGWc4/1znIaARBXw==
    caps mds = "allow *"
    caps mgr = "allow *"
    caps mon = "allow *"
    caps osd = "allow *"

I see the /etc/ceph/ folder does not exist:

root@milan7763-14616:~# cd /etc/ceph/
-bash: cd: /etc/ceph/: No such file or directory

Are there a lot of places that the admin keyring file needs to be?

djcollis commented 8 months ago

+----+----------------------+------------------------------------------+
| id | key                  | value                                    |
+----+----------------------+------------------------------------------+
| 1  | fsid                 | 0f7a4eea-7f71-4480-b2d0-cf34432e8ac8     |
| 2  | keyring.client.admin | AQBYdHVlzubcLhAAF4iHqrJGWc4/1znIaARBXw== |
+----+----------------------+------------------------------------------+

Does this mean there is no public network? Or is this fsid supposed to be linking to that network config?

UtkarshBhatthere commented 8 months ago

Are there a lot of places that the admin keyring file needs to be?

I can see that the keyring file is malformed, i.e. it has key = key =. Can you please try again with this fixed? Also, no, the keyring file only needs to be maintained in the snap conf directory (/var/snap/microceph/current/conf).
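
As a small sketch, the duplicated prefix could also be stripped in place with sed (path from this thread; the -i.bak keeps a backup copy of the original file):

sudo sed -i.bak 's/key = key =/key =/' /var/snap/microceph/current/conf/ceph.client.admin.keyring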

UtkarshBhatthere commented 8 months ago

Does this mean there is no public network? or is this fsid supposed to be linking to that network config?

You are probably running an older revision; the public_network config was added in recent feature work. Are you using edge or stable?

djcollis commented 8 months ago

OK, it looks like that key = key = was only on the one node; I have sorted that out. I have also increased the other two nodes' disks, so now there should be no space issues at all.

When I originally installed I used edge, but I refreshed all the snap versions to reef/stable this morning.

UtkarshBhatthere commented 8 months ago

Great, and how does your deployment look now? Any improvements?

djcollis commented 8 months ago

Hi there,

Not yet. It still isn't able to start the snap services, and it also looks like the config is gone. Even so, when I run microceph status all 4 nodes are there and connected.

If I try to run sudo microceph.ceph -s I get an authentication timeout.

I cannot pull the config for the cluster, which I think may be causing issues? Something is still not right. All the volumes are down in my kube cluster, so the whole setup is still down, but I think it is all related. I am wondering if I need to start from the beginning and build a new cluster? I just find it strange that microceph status works but ceph status hits a bunch of auth issues. I guess it is looking for a bit more detail.

I do need to read more of the Ceph documentation though; at the moment I am sort of fumbling my way around.

UtkarshBhatthere commented 8 months ago

Well, microceph status would work because MicroCeph uses information persisted in its distributed control database (dqlite, the SQL db I suggested you query); it is not generated from ceph status itself. If you need to recreate the ceph.conf file, this is how it looks:

root@ubuntu-ceph-VLWH-MQIA:~# cat /var/snap/microceph/current/conf/ceph.conf 
# # Generated by MicroCeph, DO NOT EDIT.
[global]
run dir = /var/snap/microceph/x1/run
fsid = dee2d362-f156-4120-9110-61d2eabb65e6
mon host = 10.80.87.72
public_network = 10.80.87.72/24
auth allow insecure global id reclaim = false
ms bind ipv4 = true
ms bind ipv6 = false

It is strange, though; MicroCeph should recreate the conf file if it goes missing.
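
As a sketch only, recreating it by hand on one of the affected nodes could look like the following. The fsid matches the dqlite output earlier in the thread, but the mon host, public_network and the x1 snap revision in the run dir are assumptions that must be adjusted to the actual deployment:

cat > /var/snap/microceph/current/conf/ceph.conf <<'EOF'
# Generated by MicroCeph, DO NOT EDIT.
[global]
# x1 must match the installed snap revision
run dir = /var/snap/microceph/x1/run
# fsid as reported by: sudo microceph cluster sql "select * from config"
fsid = 0f7a4eea-7f71-4480-b2d0-cf34432e8ac8
# assumption: replace with your own mon node address(es) and the subnet
# that was actually used when the cluster was bootstrapped
mon host = 102.214.11.189,102.214.11.226,102.214.10.37
public_network = 102.214.0.0/16
auth allow insecure global id reclaim = false
ms bind ipv4 = true
ms bind ipv6 = false
EOF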

UtkarshBhatthere commented 8 months ago

Redeploying the MicroCeph cluster would be a way around this. If you decide to, I would suggest using the edge channels (reef/edge or quincy/edge). Also, pre-planning the disk sizes on the nodes, the public network, etc. would greatly stabilise the operation.
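
If you do go the redeploy route, a rough sketch of the per-node steps with standard snap commands (destructive for the existing cluster state, so only once you have given up on recovery):

# remove the snap and its stored state on each node
sudo snap remove microceph --purge
# reinstall from the suggested edge channel
sudo snap install microceph --channel reef/edge
# then bootstrap on the first node and add/join the remaining nodes as before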

djcollis commented 8 months ago

OK, indeed. This is the learning curve I am on. The app was a lot heavier on storage than I anticipated. The disk for the actual storage is certainly sized adequately; however, the disk for all the kube operating stuff I originally had at 25 GB and needed to increase to 50 GB, which seems to have caused a lot of issues.

As for the release version, yesterday it was recommended to use reef/stable. I will refresh all nodes to reef/edge and see if that helps kick-start these connections.

Maybe I got the networking side wrong. My node external IPs are:

102.214.10.187 102.214.11.189 102.214.11.226 102.214.10.37

and I created an internal network with IPs 10.0.5.1, 10.0.5.2, 10.0.5.3 and 10.0.5.4.

I can ping all IPs from all nodes. My microk8s cluster was linked using the 10.0.5.0/16 network and I had this in the Ceph config, but I see the ceph.conf files all use the external IPs.

I am not sure if this is a problem or not?

djcollis commented 8 months ago

I just had a look at the size requirements for Ceph, and I wonder if this is why things are not auto-healing. I am running 4 nodes, each with 4 CPUs and 8 GB of RAM, a 50 GB primary disk, and a 75 GB secondary disk which is assigned to the MicroCeph cluster.

After reading the requirements, this sounds super low, although the application we are running only needs about 30 GB of storage. Is this a case of the hardware not being fit for the purpose?

After seeing the rook-ceph add-on in microk8s, I thought this might be worth using, as microk8s is light enough to run even on a Raspberry Pi. We have a document integration tool which requires MySQL, MongoDB and Redis loaded in the kube cluster, and I was looking for a distributed storage option for our clusters. Do you think I am looking in the right place with MicroCeph, or is this application too small for this type of storage solution?

djcollis commented 8 months ago

Last question on this: if I uninstall MicroCeph and do a reinstall to start the process again, and I add the disks without the --wipe flag, will it be able to pick up the data that was left from the previous installation?