Closed willzhang closed 1 year ago
It seems DRBD did not load after the reboot. Can you show the logs of the drbd-module-loader container?
kubectl -n piraeus-datastore logs -f node1 -c drbd-module-loader
root@node1:~# kubectl -n piraeus-datastore logs -f node1 -c drbd-module-loader
DRBD module is already loaded
DRBD version loaded:
version: 8.4.11 (api:1/proto:86-101)
srcversion: 98E710E58B3041F3046305B
node2 and node3, on the other hand, print much longer logs:
root@node1:~# kubectl -n piraeus-datastore logs -f node2 -c drbd-module-loader
Need a git checkout to regenerate drbd/.drbd_git_revision
make[1]: Entering directory '/tmp/pkg/drbd-9.2.3/drbd'
Calling toplevel makefile of kernel source tree, which I believe is in
KDIR=/lib/modules/5.15.0-78-generic/build
make -C /lib/modules/5.15.0-78-generic/build M=/tmp/pkg/drbd-9.2.3/drbd modules
warning: the compiler differs from the one used to build the kernel
The kernel was built by: gcc (Ubuntu 11.3.0-1ubuntu1~22.04.1) 11.3.0
You are using: gcc (Ubuntu 11.3.0-1ubuntu1~22.04) 11.3.0
COMPAT __vmalloc_has_2_params
COMPAT add_disk_returns_int
COMPAT before_4_13_kernel_read
COMPAT bio_alloc_has_4_params
COMPAT blkdev_issue_discard_takes_flags
COMPAT blkdev_issue_zeroout_discard
COMPAT can_include_vermagic_h
COMPAT dax_direct_access_takes_mode
COMPAT fs_dax_get_by_bdev_takes_start_off
COMPAT fs_dax_get_by_bdev_takes_start_off_and_holder
COMPAT genl_policy_in_ops
COMPAT have_BIO_MAX_VECS
COMPAT have_CRYPTO_TFM_NEED_KEY
COMPAT have_GENHD_FL_NO_PART
COMPAT have_SHASH_DESC_ON_STACK
COMPAT have_WB_congested_enum
COMPAT have_allow_kernel_signal
COMPAT have_bdev_discard_granularity
COMPAT have_bdev_max_discard_sectors
COMPAT have_bdev_nr_sectors
COMPAT have_bdevname
COMPAT have_bdgrab
COMPAT have_bdi_congested
COMPAT have_bdi_congested_fn
COMPAT have_bio_alloc_clone
COMPAT have_bio_bi_bdev
COMPAT have_bio_bi_error
COMPAT have_bio_bi_opf
COMPAT have_bio_bi_status
COMPAT have_bio_clone_fast
COMPAT have_bio_op_shift
COMPAT have_bio_set_dev
COMPAT have_bio_set_op_attrs
COMPAT have_bio_split_to_limits
COMPAT have_bio_start_io_acct
COMPAT have_bioset_init
COMPAT have_bioset_need_bvecs
COMPAT have_blk_alloc_disk
COMPAT have_blk_alloc_queue_rh
COMPAT have_blk_check_plugged
COMPAT have_blk_cleanup_disk
COMPAT have_blk_qc_t_make_request
COMPAT have_blk_qc_t_submit_bio
COMPAT have_blk_queue_flag_set
COMPAT have_blk_queue_make_request
COMPAT have_blk_queue_max_write_same_sectors
COMPAT have_blk_queue_merge_bvec
COMPAT have_blk_queue_split_bio
COMPAT have_blk_queue_split_q_bio
COMPAT have_blk_queue_split_q_bio_bioset
COMPAT have_blk_queue_update_readahead
COMPAT have_blk_queue_write_cache
COMPAT have_bvec_kmap_local
COMPAT have_d_inode
COMPAT have_disk_update_readahead
COMPAT have_fallthrough
COMPAT have_fs_dax_get_by_bdev
COMPAT have_generic_start_io_acct_q_rw_sect_part
COMPAT have_generic_start_io_acct_rw_sect_part
COMPAT have_get_random_u32
COMPAT have_get_random_u32_below
COMPAT have_hd_struct
COMPAT have_ib_cq_init_attr
COMPAT have_ib_get_dma_mr
COMPAT have_idr_is_empty
COMPAT have_inode_lock
COMPAT have_ktime_to_timespec64
COMPAT have_kvfree
COMPAT have_kvfree_rcu
COMPAT have_list_is_first
COMPAT have_list_next_entry
COMPAT have_max_send_recv_sge
COMPAT have_nla_nest_start_noflag
COMPAT have_nla_parse_deprecated
COMPAT have_nla_put_64bit
COMPAT have_nla_strscpy
COMPAT have_part_stat_h
COMPAT have_part_stat_read_accum
COMPAT have_pointer_backing_dev_info
COMPAT have_proc_create_single
COMPAT have_queue_flag_discard
COMPAT have_queue_flag_stable_writes
COMPAT have_rb_declare_callbacks_max
COMPAT have_refcount_inc
COMPAT have_req_hardbarrier
COMPAT have_req_noidle
COMPAT have_req_nounmap
COMPAT have_req_op_write
COMPAT have_req_op_write_zeroes
COMPAT have_req_write
COMPAT have_revalidate_disk_size
COMPAT have_sched_set_fifo
COMPAT have_sched_signal_h
COMPAT have_security_netlink_recv
COMPAT have_sendpage_ok
COMPAT have_set_capacity_and_notify
COMPAT have_shash_desc_zero
COMPAT have_simple_positive
COMPAT have_sock_set_keepalive
COMPAT have_strscpy
COMPAT have_struct_bvec_iter
COMPAT have_struct_size
COMPAT have_submit_bio_noacct
COMPAT have_tcp_sock_set_cork
COMPAT have_tcp_sock_set_keepcnt
COMPAT have_tcp_sock_set_keepidle
COMPAT have_tcp_sock_set_nodelay
COMPAT have_tcp_sock_set_quickack
COMPAT have_time64_to_tm
COMPAT have_timer_setup
COMPAT have_void_make_request
COMPAT have_void_submit_bio
COMPAT ib_alloc_pd_has_2_params
COMPAT ib_device_has_ops
COMPAT ib_post_send_const_params
COMPAT ib_query_device_has_3_params
COMPAT need_drbd_wrappers
COMPAT need_make_request_recursion
COMPAT need_skb_abort_seq_read
COMPAT part_stat_read_takes_block_device
COMPAT queue_limits_has_discard_zeroes_data
COMPAT rdma_create_id_has_net_ns
COMPAT rdma_reject_has_reason_arg
COMPAT sk_data_ready_has_1_param
COMPAT sock_create_kern_has_netns_parameter
COMPAT sock_ops_returns_addr_len
COMPAT struct_gendisk_has_backing_dev_info
UPD /tmp/pkg/drbd-9.2.3/drbd/compat.5.15.99.h
UPD /tmp/pkg/drbd-9.2.3/drbd/compat.h
make[4]: 'drbd-kernel-compat/cocci_cache/618a16740b5c8d49f5fd464218ac2850/compat.patch' is up to date.
PATCH
patching file ./drbd_int.h
patching file drbd_transport_tcp.c
patching file drbd_state.c
patching file drbd_req.c
patching file drbd_receiver.c
patching file drbd_nl.c
patching file drbd_main.c
patching file drbd_debugfs.c
patching file drbd_dax_pmem.c
patching file drbd_bitmap.c
patching file drbd_actlog.c
CC [M] /tmp/pkg/drbd-9.2.3/drbd/drbd_dax_pmem.o
CC [M] /tmp/pkg/drbd-9.2.3/drbd/drbd_debugfs.o
CC [M] /tmp/pkg/drbd-9.2.3/drbd/drbd_bitmap.o
CC [M] /tmp/pkg/drbd-9.2.3/drbd/drbd_proc.o
CC [M] /tmp/pkg/drbd-9.2.3/drbd/drbd_sender.o
CC [M] /tmp/pkg/drbd-9.2.3/drbd/drbd_receiver.o
CC [M] /tmp/pkg/drbd-9.2.3/drbd/drbd_req.o
CC [M] /tmp/pkg/drbd-9.2.3/drbd/drbd_actlog.o
CC [M] /tmp/pkg/drbd-9.2.3/drbd/drbd_main.o
CC [M] /tmp/pkg/drbd-9.2.3/drbd/drbd_strings.o
CC [M] /tmp/pkg/drbd-9.2.3/drbd/drbd_nl.o
CC [M] /tmp/pkg/drbd-9.2.3/drbd/drbd_interval.o
CC [M] /tmp/pkg/drbd-9.2.3/drbd/drbd_state.o
CC [M] /tmp/pkg/drbd-9.2.3/drbd/drbd-kernel-compat/drbd_wrappers.o
CC [M] /tmp/pkg/drbd-9.2.3/drbd/drbd_nla.o
CC [M] /tmp/pkg/drbd-9.2.3/drbd/drbd_transport.o
GEN /tmp/pkg/drbd-9.2.3/drbd/drbd_buildtag.c
CC [M] /tmp/pkg/drbd-9.2.3/drbd/drbd_buildtag.o
LD [M] /tmp/pkg/drbd-9.2.3/drbd/drbd.o
CC [M] /tmp/pkg/drbd-9.2.3/drbd/drbd_transport_tcp.o
CC [M] /tmp/pkg/drbd-9.2.3/drbd/drbd_transport_rdma.o
MODPOST /tmp/pkg/drbd-9.2.3/drbd/Module.symvers
CC [M] /tmp/pkg/drbd-9.2.3/drbd/drbd.mod.o
LD [M] /tmp/pkg/drbd-9.2.3/drbd/drbd.ko
BTF [M] /tmp/pkg/drbd-9.2.3/drbd/drbd.ko
Skipping BTF generation for /tmp/pkg/drbd-9.2.3/drbd/drbd.ko due to unavailability of vmlinux
CC [M] /tmp/pkg/drbd-9.2.3/drbd/drbd_transport_rdma.mod.o
LD [M] /tmp/pkg/drbd-9.2.3/drbd/drbd_transport_rdma.ko
BTF [M] /tmp/pkg/drbd-9.2.3/drbd/drbd_transport_rdma.ko
Skipping BTF generation for /tmp/pkg/drbd-9.2.3/drbd/drbd_transport_rdma.ko due to unavailability of vmlinux
CC [M] /tmp/pkg/drbd-9.2.3/drbd/drbd_transport_tcp.mod.o
LD [M] /tmp/pkg/drbd-9.2.3/drbd/drbd_transport_tcp.ko
Skipping BTF generation for /tmp/pkg/drbd-9.2.3/drbd/drbd_transport_tcp.ko due to unavailability of vmlinux
BTF [M] /tmp/pkg/drbd-9.2.3/drbd/drbd_transport_tcp.ko
mv .drbd_kernelrelease.new .drbd_kernelrelease
Memorizing module configuration ... done.
make[1]: Leaving directory '/tmp/pkg/drbd-9.2.3/drbd'
Module build was successful.
=======================================================================
With DRBD module version 8.4.5, we split out the management tools
into their own repository at https://github.com/LINBIT/drbd-utils
(tarball at http://links.linbit.com/drbd-download)
That started out as "drbd-utils version 8.9.0",
has a different release cycle,
and provides compatible drbdadm, drbdsetup and drbdmeta tools
for DRBD module versions 8.3, 8.4 and 9.
Again: to manage DRBD 9 kernel modules and above,
you want drbd-utils >= 9.3 from above url.
=======================================================================
DRBD version loaded:
version: 9.2.3 (api:2/proto:86-122)
GIT-hash: c142ca1280c41aee1330b980544ef276330ff6ef build by @node2, 2023-08-10 06:14:21
Transports (api:18): tcp (9.2.3) rdma (9.2.3)
The 8.4.11 module loaded on node1 is an unsupported version, which should not be loaded at boot time. You might need to run systemctl disable drbd.service on the host.
Then run rmmod drbd on node1 and delete the node1 Pod so that it restarts.
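The three steps above can be sketched as a small script. The commands are the ones named in this thread; the `run` wrapper, the `NODE` variable and the `DRY_RUN` switch are illustrative assumptions added so that the commands can be reviewed before anything is changed. It assumes kubectl is usable from the affected host, as in the session shown here.

```shell
#!/bin/sh
# Hypothetical helper: NODE and DRY_RUN are assumptions for illustration.
NODE="${NODE:-node1}"
DRY_RUN="${DRY_RUN:-1}"   # default: only print the commands; set DRY_RUN=0 to execute

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

recover_node() {
    # Stop the host from starting the distribution's DRBD service at boot.
    run systemctl disable drbd.service
    # Unload the old module (this fails if DRBD devices are still in use).
    run rmmod drbd
    # Delete the satellite Pod so that drbd-module-loader runs again.
    run kubectl -n piraeus-datastore delete pod "$NODE"
}
```

Calling `recover_node` with the default `DRY_RUN=1` only prints the three commands; run it with `DRY_RUN=0` as root on the affected node to apply them.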
Yes, I did that, and all pods are now up and running. But why did it use version 8.4.11? Is this a bug?
root@node1:~# kubectl -n piraeus-datastore logs -f node1 -c drbd-module-loader
Need a git checkout to regenerate drbd/.drbd_git_revision
make[1]: Entering directory '/tmp/pkg/drbd-9.2.3/drbd'
Calling toplevel makefile of kernel source tree, which I believe is in
KDIR=/lib/modules/5.15.0-78-generic/build
make -C /lib/modules/5.15.0-78-generic/build M=/tmp/pkg/drbd-9.2.3/drbd modules
warning: the compiler differs from the one used to build the kernel
The kernel was built by: gcc (Ubuntu 11.3.0-1ubuntu1~22.04.1) 11.3.0
You are using: gcc (Ubuntu 11.3.0-1ubuntu1~22.04) 11.3.0
COMPAT __vmalloc_has_2_params
COMPAT add_disk_returns_int
COMPAT before_4_13_kernel_read
COMPAT bio_alloc_has_4_params
COMPAT blkdev_issue_discard_takes_flags
COMPAT blkdev_issue_zeroout_discard
COMPAT can_include_vermagic_h
COMPAT dax_direct_access_takes_mode
COMPAT fs_dax_get_by_bdev_takes_start_off
COMPAT fs_dax_get_by_bdev_takes_start_off_and_holder
COMPAT genl_policy_in_ops
COMPAT have_BIO_MAX_VECS
COMPAT have_CRYPTO_TFM_NEED_KEY
COMPAT have_GENHD_FL_NO_PART
COMPAT have_SHASH_DESC_ON_STACK
COMPAT have_WB_congested_enum
COMPAT have_allow_kernel_signal
COMPAT have_bdev_discard_granularity
COMPAT have_bdev_max_discard_sectors
COMPAT have_bdev_nr_sectors
COMPAT have_bdevname
COMPAT have_bdgrab
COMPAT have_bdi_congested
COMPAT have_bdi_congested_fn
COMPAT have_bio_alloc_clone
COMPAT have_bio_bi_bdev
COMPAT have_bio_bi_error
COMPAT have_bio_bi_opf
COMPAT have_bio_bi_status
COMPAT have_bio_clone_fast
COMPAT have_bio_op_shift
COMPAT have_bio_set_dev
COMPAT have_bio_set_op_attrs
COMPAT have_bio_split_to_limits
COMPAT have_bio_start_io_acct
COMPAT have_bioset_init
COMPAT have_bioset_need_bvecs
COMPAT have_blk_alloc_disk
COMPAT have_blk_alloc_queue_rh
COMPAT have_blk_check_plugged
COMPAT have_blk_cleanup_disk
COMPAT have_blk_qc_t_make_request
COMPAT have_blk_qc_t_submit_bio
COMPAT have_blk_queue_flag_set
COMPAT have_blk_queue_make_request
COMPAT have_blk_queue_max_write_same_sectors
COMPAT have_blk_queue_merge_bvec
COMPAT have_blk_queue_split_bio
COMPAT have_blk_queue_split_q_bio
COMPAT have_blk_queue_split_q_bio_bioset
COMPAT have_blk_queue_update_readahead
COMPAT have_blk_queue_write_cache
COMPAT have_bvec_kmap_local
COMPAT have_d_inode
COMPAT have_disk_update_readahead
COMPAT have_fallthrough
COMPAT have_fs_dax_get_by_bdev
COMPAT have_generic_start_io_acct_q_rw_sect_part
COMPAT have_generic_start_io_acct_rw_sect_part
COMPAT have_get_random_u32
COMPAT have_get_random_u32_below
COMPAT have_hd_struct
COMPAT have_ib_cq_init_attr
COMPAT have_ib_get_dma_mr
COMPAT have_idr_is_empty
COMPAT have_inode_lock
COMPAT have_ktime_to_timespec64
COMPAT have_kvfree
COMPAT have_kvfree_rcu
COMPAT have_list_is_first
COMPAT have_list_next_entry
COMPAT have_max_send_recv_sge
COMPAT have_nla_nest_start_noflag
COMPAT have_nla_parse_deprecated
COMPAT have_nla_put_64bit
COMPAT have_nla_strscpy
COMPAT have_part_stat_h
COMPAT have_part_stat_read_accum
COMPAT have_pointer_backing_dev_info
COMPAT have_proc_create_single
COMPAT have_queue_flag_discard
COMPAT have_queue_flag_stable_writes
COMPAT have_rb_declare_callbacks_max
COMPAT have_refcount_inc
COMPAT have_req_hardbarrier
COMPAT have_req_noidle
COMPAT have_req_nounmap
COMPAT have_req_op_write
COMPAT have_req_op_write_zeroes
COMPAT have_req_write
COMPAT have_revalidate_disk_size
COMPAT have_sched_set_fifo
COMPAT have_sched_signal_h
COMPAT have_security_netlink_recv
COMPAT have_sendpage_ok
COMPAT have_set_capacity_and_notify
COMPAT have_shash_desc_zero
COMPAT have_simple_positive
COMPAT have_sock_set_keepalive
COMPAT have_strscpy
COMPAT have_struct_bvec_iter
COMPAT have_struct_size
COMPAT have_submit_bio_noacct
COMPAT have_tcp_sock_set_cork
COMPAT have_tcp_sock_set_keepcnt
COMPAT have_tcp_sock_set_keepidle
COMPAT have_tcp_sock_set_nodelay
COMPAT have_tcp_sock_set_quickack
COMPAT have_time64_to_tm
COMPAT have_timer_setup
COMPAT have_void_make_request
COMPAT have_void_submit_bio
COMPAT ib_alloc_pd_has_2_params
COMPAT ib_device_has_ops
COMPAT ib_post_send_const_params
COMPAT ib_query_device_has_3_params
COMPAT need_drbd_wrappers
COMPAT need_make_request_recursion
COMPAT need_skb_abort_seq_read
COMPAT part_stat_read_takes_block_device
COMPAT queue_limits_has_discard_zeroes_data
COMPAT rdma_create_id_has_net_ns
COMPAT rdma_reject_has_reason_arg
COMPAT sk_data_ready_has_1_param
COMPAT sock_create_kern_has_netns_parameter
COMPAT sock_ops_returns_addr_len
COMPAT struct_gendisk_has_backing_dev_info
UPD /tmp/pkg/drbd-9.2.3/drbd/compat.5.15.99.h
UPD /tmp/pkg/drbd-9.2.3/drbd/compat.h
make[4]: 'drbd-kernel-compat/cocci_cache/618a16740b5c8d49f5fd464218ac2850/compat.patch' is up to date.
PATCH
patching file ./drbd_int.h
patching file drbd_transport_tcp.c
patching file drbd_state.c
patching file drbd_req.c
patching file drbd_receiver.c
patching file drbd_nl.c
patching file drbd_main.c
patching file drbd_debugfs.c
patching file drbd_dax_pmem.c
patching file drbd_bitmap.c
patching file drbd_actlog.c
CC [M] /tmp/pkg/drbd-9.2.3/drbd/drbd_dax_pmem.o
CC [M] /tmp/pkg/drbd-9.2.3/drbd/drbd_debugfs.o
CC [M] /tmp/pkg/drbd-9.2.3/drbd/drbd_bitmap.o
CC [M] /tmp/pkg/drbd-9.2.3/drbd/drbd_proc.o
CC [M] /tmp/pkg/drbd-9.2.3/drbd/drbd_sender.o
CC [M] /tmp/pkg/drbd-9.2.3/drbd/drbd_receiver.o
CC [M] /tmp/pkg/drbd-9.2.3/drbd/drbd_req.o
CC [M] /tmp/pkg/drbd-9.2.3/drbd/drbd_actlog.o
CC [M] /tmp/pkg/drbd-9.2.3/drbd/drbd_main.o
CC [M] /tmp/pkg/drbd-9.2.3/drbd/drbd_strings.o
CC [M] /tmp/pkg/drbd-9.2.3/drbd/drbd_nl.o
CC [M] /tmp/pkg/drbd-9.2.3/drbd/drbd_interval.o
CC [M] /tmp/pkg/drbd-9.2.3/drbd/drbd_state.o
CC [M] /tmp/pkg/drbd-9.2.3/drbd/drbd-kernel-compat/drbd_wrappers.o
CC [M] /tmp/pkg/drbd-9.2.3/drbd/drbd_nla.o
CC [M] /tmp/pkg/drbd-9.2.3/drbd/drbd_transport.o
GEN /tmp/pkg/drbd-9.2.3/drbd/drbd_buildtag.c
CC [M] /tmp/pkg/drbd-9.2.3/drbd/drbd_buildtag.o
LD [M] /tmp/pkg/drbd-9.2.3/drbd/drbd.o
CC [M] /tmp/pkg/drbd-9.2.3/drbd/drbd_transport_tcp.o
CC [M] /tmp/pkg/drbd-9.2.3/drbd/drbd_transport_rdma.o
MODPOST /tmp/pkg/drbd-9.2.3/drbd/Module.symvers
CC [M] /tmp/pkg/drbd-9.2.3/drbd/drbd.mod.o
LD [M] /tmp/pkg/drbd-9.2.3/drbd/drbd.ko
BTF [M] /tmp/pkg/drbd-9.2.3/drbd/drbd.ko
Skipping BTF generation for /tmp/pkg/drbd-9.2.3/drbd/drbd.ko due to unavailability of vmlinux
CC [M] /tmp/pkg/drbd-9.2.3/drbd/drbd_transport_rdma.mod.o
LD [M] /tmp/pkg/drbd-9.2.3/drbd/drbd_transport_rdma.ko
Skipping BTF generation for /tmp/pkg/drbd-9.2.3/drbd/drbd_transport_rdma.ko due to unavailability of vmlinux
BTF [M] /tmp/pkg/drbd-9.2.3/drbd/drbd_transport_rdma.ko
CC [M] /tmp/pkg/drbd-9.2.3/drbd/drbd_transport_tcp.mod.o
LD [M] /tmp/pkg/drbd-9.2.3/drbd/drbd_transport_tcp.ko
BTF [M] /tmp/pkg/drbd-9.2.3/drbd/drbd_transport_tcp.ko
Skipping BTF generation for /tmp/pkg/drbd-9.2.3/drbd/drbd_transport_tcp.ko due to unavailability of vmlinux
mv .drbd_kernelrelease.new .drbd_kernelrelease
Memorizing module configuration ... done.
make[1]: Leaving directory '/tmp/pkg/drbd-9.2.3/drbd'
Module build was successful.
=======================================================================
With DRBD module version 8.4.5, we split out the management tools
into their own repository at https://github.com/LINBIT/drbd-utils
(tarball at http://links.linbit.com/drbd-download)
That started out as "drbd-utils version 8.9.0",
has a different release cycle,
and provides compatible drbdadm, drbdsetup and drbdmeta tools
for DRBD module versions 8.3, 8.4 and 9.
Again: to manage DRBD 9 kernel modules and above,
you want drbd-utils >= 9.3 from above url.
=======================================================================
DRBD version loaded:
version: 9.2.3 (api:2/proto:86-122)
GIT-hash: c142ca1280c41aee1330b980544ef276330ff6ef build by @node1, 2023-08-10 06:25:01
Transports (api:18): tcp (9.2.3) rdma (9.2.3)
root@node1:~# kubectl -n piraeus-datastore get pods
NAME READY STATUS RESTARTS AGE
ha-controller-hkws4 1/1 Running 1 (14m ago) 70m
ha-controller-nd5p2 1/1 Running 12 (7m20s ago) 21m
ha-controller-trbvh 1/1 Running 1 (13m ago) 70m
linstor-controller-97cd7495c-k6kzb 1/1 Running 1 (14m ago) 70m
linstor-csi-controller-7f85967cd9-z7c56 7/7 Running 8 (14m ago) 61m
linstor-csi-node-78hz4 3/3 Running 3 (14m ago) 70m
linstor-csi-node-9dx8d 3/3 Running 3 (13m ago) 70m
linstor-csi-node-wcdgp 3/3 Running 6 (15m ago) 70m
node1 2/2 Running 0 2m48s
node2 2/2 Running 2 (14m ago) 70m
node3 2/2 Running 2 (13m ago) 70m
piraeus-datastore-controller-manager-6f6b8f48c4-lnzpp 2/2 Running 2 (13m ago) 70m
There is a version of DRBD that is included in your OS. But we need a newer version, which is what the drbd-module-loader container is for.
But if something during the boot sequence has already loaded the old version of DRBD, we will not automatically unload it to build the newer version. Most of the time it is the drbd.service that is causing the load, but you may want to check /etc/modules.d/* for anything that wants to load the old DRBD version.
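A rough way to check for this situation is sketched below. The function names are hypothetical. `is_drbd9` only parses a `version: X.Y.Z` line like the ones in the logs above, and `find_drbd_autoload` is read-only. Besides the path mentioned above, it also looks at `/etc/modules` and `/etc/modules-load.d`, which is an assumption about where a systemd-based host such as Ubuntu keeps its boot-time module list.

```shell
#!/bin/sh
# Succeed when the first "version: X.Y.Z" line on stdin reports DRBD 9 or newer.
is_drbd9() {
    major=$(sed -n 's/^version: \([0-9][0-9]*\)\..*/\1/p' | head -n 1)
    [ -n "$major" ] && [ "$major" -ge 9 ]
}

# List things that might load the distribution's DRBD module at boot.
# Read-only: nothing on the host is changed.
find_drbd_autoload() {
    # Prints "enabled" or "disabled" if the unit exists, nothing otherwise.
    systemctl is-enabled drbd.service 2>/dev/null
    # Names of module-list files that mention drbd (missing paths are ignored).
    grep -rsl drbd /etc/modules /etc/modules-load.d /etc/modules.d 2>/dev/null
    return 0
}
```

For example, `kubectl -n piraeus-datastore logs node1 -c drbd-module-loader | is_drbd9 || find_drbd_autoload` lists the candidates only when the loader reports a module older than 9.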
Thanks, I found that I had installed drbd-utils on node1:
root@node1:~# apt list --installed |grep drbd
WARNING: apt does not have a stable CLI interface. Use with caution in scripts.
drbd-utils/jammy,now 9.15.0-1build2 amd64 [installed]
node2 and node3 have neither drbd-utils nor the service:
root@node2:~# systemctl status drbd
Unit drbd.service could not be found.
So I removed it:
apt remove -y drbd-utils
three nodes
After rebooting node1, two pods are in CrashLoopBackOff and never return to normal.
pod node1 logs
pod ha-controller logs