LINBIT / drbd

LINBIT DRBD kernel module
https://docs.linbit.com/docs/users-guide-9.0/
GNU General Public License v2.0

modules dependency lost after injection #29

Closed ydcool closed 2 years ago

ydcool commented 2 years ago

After injecting the DRBD kernel module, the previous contents of modules.dep were lost; only the DRBD entries remain:

[root@kylinos ~]# cat /lib/modules/$(uname -r)/modules.dep
updates/drbd_transport_tcp.ko: updates/drbd.ko
updates/drbd.ko:
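
The symptom shown above can be checked for mechanically. The following is a minimal sketch, not part of the original report: the function name `count_non_drbd_entries` is hypothetical, and it assumes that "lost" means no line in modules.dep remains that does not mention drbd.

```shell
# Count the lines of a modules.dep file that do not mention "drbd".
# A result of 0 suggests that only the DRBD entries survived.
# (count_non_drbd_entries is a hypothetical helper name.)
count_non_drbd_entries() {
    local dep_file="$1"
    # grep -c prints the number of matching lines; with -v it counts the
    # lines that do NOT match. grep exits non-zero when the count is 0,
    # so "|| true" keeps the function usable under "set -e".
    grep -cv 'drbd' "$dep_file" || true
}

# Example call, using the path from this report:
# count_non_drbd_entries "/lib/modules/$(uname -r)/modules.dep"
```

On a healthy host the count should be in the hundreds or thousands; for the file shown above it would be 0.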

OS info:

[root@kylinos ~]# cat /etc/os-release 
NAME="Kylin Linux Advanced Server"
VERSION="V10 (Sword)"
ID="kylin"
VERSION_ID="V10"
PRETTY_NAME="Kylin Linux Advanced Server V10 (Sword)"
ANSI_COLOR="0;31"

Kernel version:

[root@kylinos ~]# uname -a 
Linux kylinos 4.19.90-25.9.v2101.ky10.aarch64 #1 SMP Wed Dec 1 17:24:28 CST 2021 aarch64 aarch64 aarch64 GNU/Linux

How to reproduce:

[root@kylinos ~]#  docker run  -v /sys:/sys -v /dev:/dev  -v /usr/src:/usr/src:ro  -v /lib/modules:/lib/modules   -e LB_HOW=compile  -e LB_INSTALL=yes  --privileged  --rm  -i piraeusdatastore/drbd9-bionic:v9.1.4

Injection logs:

```
[root@kylinos ~]# docker run -v /sys:/sys -v /dev:/dev -v /usr/src:/usr/src:ro -v /lib/modules:/lib/modules -e LB_HOW=compile -e LB_INSTALL=yes --privileged --rm -i piraeusdatastore/drbd9-bionic:v9.1.4
Need a git checkout to regenerate drbd/.drbd_git_revision
make[1]: Entering directory '/tmp/pkg/drbd-9.1.4/drbd'
Calling toplevel makefile of kernel source tree, which I believe is in
KDIR=/lib/modules/4.19.90-25.9.v2101.ky10.aarch64/build
make -C /lib/modules/4.19.90-25.9.v2101.ky10.aarch64/build M=/tmp/pkg/drbd-9.1.4/drbd modules
COMPAT __vmalloc_has_2_params
COMPAT alloc_workqueue_takes_fmt
COMPAT before_4_13_kernel_read
COMPAT blkdev_issue_zeroout_discard
COMPAT can_include_vermagic_h
COMPAT genl_policy_in_ops
COMPAT have_BIO_MAX_VECS
COMPAT have_CRYPTO_TFM_NEED_KEY
COMPAT have_SHASH_DESC_ON_STACK
COMPAT have_WB_congested_enum
COMPAT have_allow_kernel_signal
COMPAT have_bdi_cap_stable_writes
COMPAT have_bdi_congested_fn
COMPAT have_bio_bi_bdev
COMPAT have_bio_bi_error
COMPAT have_bio_bi_opf
COMPAT have_bio_bi_status
COMPAT have_bio_clone_fast
COMPAT have_bio_op_shift
COMPAT have_bio_set_dev
COMPAT have_bio_set_op_attrs
COMPAT have_bio_start_io_acct
COMPAT have_bioset_init
COMPAT have_bioset_need_bvecs
COMPAT have_blk_alloc_queue_rh
COMPAT have_blk_check_plugged
COMPAT have_blk_qc_t_make_request
COMPAT have_blk_queue_flag_set
COMPAT have_blk_queue_make_request
COMPAT have_blk_queue_merge_bvec
COMPAT have_blk_queue_plugged
COMPAT have_blk_queue_split_bio
COMPAT have_blk_queue_split_q_bio
COMPAT have_blk_queue_split_q_bio_bioset
COMPAT have_blk_queue_update_readahead
COMPAT have_blk_queue_write_cache
COMPAT have_d_inode
COMPAT have_fallthrough
COMPAT have_generic_start_io_acct_q_rw_sect_part
COMPAT have_generic_start_io_acct_rw_sect_part
COMPAT have_genl_family_parallel_ops
COMPAT have_hd_struct
COMPAT have_ib_cq_init_attr
COMPAT have_ib_get_dma_mr
COMPAT have_idr_is_empty
COMPAT have_inode_lock
COMPAT have_ktime_to_timespec64
COMPAT have_kvfree
COMPAT have_max_send_recv_sge
COMPAT have_nla_nest_start_noflag
COMPAT have_nla_parse_deprecated
COMPAT have_nla_put_64bit
COMPAT have_nla_strscpy
COMPAT have_part_stat_h
COMPAT have_part_stat_read_accum
COMPAT have_pointer_backing_dev_info
COMPAT have_proc_create_single
COMPAT have_queue_flag_stable_writes
COMPAT have_rb_declare_callbacks_max
COMPAT have_refcount_inc
COMPAT have_req_flush
COMPAT have_req_hardbarrier
COMPAT have_req_noidle
COMPAT have_req_nounmap
COMPAT have_req_op_write
COMPAT have_req_op_write_same
COMPAT have_req_op_write_zeroes
COMPAT have_req_prio
COMPAT have_req_write
COMPAT have_req_write_same
COMPAT have_revalidate_disk_size
COMPAT have_sched_set_fifo
COMPAT have_security_netlink_recv
COMPAT have_sendpage_ok
COMPAT have_set_capacity_and_notify
COMPAT have_shash_desc_zero
COMPAT have_simple_positive
COMPAT have_sock_set_keepalive
COMPAT have_struct_bvec_iter
COMPAT have_struct_kernel_param_ops
COMPAT have_struct_size
COMPAT have_submit_bio
COMPAT have_submit_bio_noacct
COMPAT have_tcp_sock_set_cork
COMPAT have_tcp_sock_set_nodelay
COMPAT have_tcp_sock_set_quickack
COMPAT have_time64_to_tm
COMPAT have_timer_setup
COMPAT have_void_make_request
COMPAT ib_alloc_pd_has_2_params
COMPAT ib_device_has_ops
COMPAT ib_post_send_const_params
COMPAT ib_query_device_has_3_params
COMPAT need_make_request_recursion
COMPAT part_stat_read_takes_block_device
COMPAT queue_limits_has_discard_zeroes_data
COMPAT rdma_create_id_has_net_ns
COMPAT sock_create_kern_has_five_parameters
COMPAT sock_ops_returns_addr_len
UPD /tmp/pkg/drbd-9.1.4/drbd/compat.4.19.90-25.9.v2101.ky10.aarch64.h
UPD /tmp/pkg/drbd-9.1.4/drbd/compat.h
make[4]: 'drbd-kernel-compat/cocci_cache/0227ebbe9035d69a25a6e7bee7eef61a/compat.patch' is up to date.
PATCH
patching file ./drbd_int.h
patching file drbd-headers/linux/genl_magic_struct.h
patching file drbd_receiver.c
patching file drbd_main.c
patching file drbd_nla.c
patching file drbd_nl.c
patching file drbd_transport_tcp.c
patching file drbd_bitmap.c
patching file drbd_interval.c
patching file drbd_debugfs.c
patching file drbd_req.c
patching file drbd_state.c
patching file drbd_sender.c
patching file drbd-headers/linux/genl_magic_func.h
Hunk #2 succeeded at 312 (offset -20 lines).
CC [M] /tmp/pkg/drbd-9.1.4/drbd/drbd_debugfs.o
CC [M] /tmp/pkg/drbd-9.1.4/drbd/drbd_bitmap.o
CC [M] /tmp/pkg/drbd-9.1.4/drbd/drbd_proc.o
CC [M] /tmp/pkg/drbd-9.1.4/drbd/drbd_sender.o
CC [M] /tmp/pkg/drbd-9.1.4/drbd/drbd_receiver.o
CC [M] /tmp/pkg/drbd-9.1.4/drbd/drbd_req.o
CC [M] /tmp/pkg/drbd-9.1.4/drbd/drbd_actlog.o
CC [M] /tmp/pkg/drbd-9.1.4/drbd/lru_cache.o
CC [M] /tmp/pkg/drbd-9.1.4/drbd/drbd_main.o
CC [M] /tmp/pkg/drbd-9.1.4/drbd/drbd_strings.o
CC [M] /tmp/pkg/drbd-9.1.4/drbd/drbd_nl.o
CC [M] /tmp/pkg/drbd-9.1.4/drbd/drbd_interval.o
CC [M] /tmp/pkg/drbd-9.1.4/drbd/drbd_state.o
CC [M] /tmp/pkg/drbd-9.1.4/drbd/drbd-kernel-compat/drbd_wrappers.o
CC [M] /tmp/pkg/drbd-9.1.4/drbd/drbd_nla.o
CC [M] /tmp/pkg/drbd-9.1.4/drbd/drbd_transport.o
GEN /tmp/pkg/drbd-9.1.4/drbd/drbd_buildtag.c
CC [M] /tmp/pkg/drbd-9.1.4/drbd/drbd_buildtag.o
LD [M] /tmp/pkg/drbd-9.1.4/drbd/drbd.o
CC [M] /tmp/pkg/drbd-9.1.4/drbd/drbd_transport_tcp.o
Building modules, stage 2.
MODPOST 2 modules
CC /tmp/pkg/drbd-9.1.4/drbd/drbd.mod.o
LD [M] /tmp/pkg/drbd-9.1.4/drbd/drbd.ko
CC /tmp/pkg/drbd-9.1.4/drbd/drbd_transport_tcp.mod.o
LD [M] /tmp/pkg/drbd-9.1.4/drbd/drbd_transport_tcp.ko
mv .drbd_kernelrelease.new .drbd_kernelrelease
Memorizing module configuration ... done.
make[1]: Leaving directory '/tmp/pkg/drbd-9.1.4/drbd'
Module build was successful.
=======================================================================
With DRBD module version 8.4.5, we split out the management tools
into their own repository at https://github.com/LINBIT/drbd-utils
(tarball at http://links.linbit.com/drbd-download)
That started out as "drbd-utils version 8.9.0",
has a different release cycle,
and provides compatible drbdadm, drbdsetup and drbdmeta tools
for DRBD module versions 8.3, 8.4 and 9.
Again: to manage DRBD 9 kernel modules and above,
you want drbd-utils >= 9.3 from above url.
=======================================================================
make -C drbd install
make[1]: Entering directory '/tmp/pkg/drbd-9.1.4/drbd'
install -d //lib/modules/4.19.90-25.9.v2101.ky10.aarch64/updates
set -e ; for ko in drbd.ko drbd_transport_tcp.ko; do \
install -m 644 $ko //lib/modules/4.19.90-25.9.v2101.ky10.aarch64/updates; \
done
/sbin/depmod -a || :
make[1]: Leaving directory '/tmp/pkg/drbd-9.1.4/drbd'
DRBD version loaded:
version: 9.1.4 (api:2/proto:110-121)
GIT-hash: e4de25c3a65811b0fa4733b1c2a000ee322f5cfa build by @c8d9156d50f7, 2022-02-28 01:25:35
Transports (api:17): tcp (9.1.4)
```
rck commented 2 years ago

Does it get restored when you run sudo depmod -a on the host again after the install?

ydcool commented 2 years ago

@rck No. I have to back up the dep file, restore it after the injection, and then run depmod -a; only then does it work.

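    # DrbdInjectImage must be set beforehand (this report uses
    # piraeusdatastore/drbd9-bionic:v9.1.4).
    depFile="/lib/modules/$(uname -r)/modules.dep"
    backFile="${depFile}.backup.$(date +%Y%m%d)"

    # Save the current modules.dep before the injector touches it.
    cp "$depFile" "$backFile"

    if ! docker run \
        -v /sys:/sys \
        -v /dev:/dev \
        -v /usr/src:/usr/src:ro \
        -v /lib/modules:/lib/modules \
        -e LB_HOW=compile \
        -e LB_INSTALL=yes \
        --privileged \
        --rm \
        -i "${DrbdInjectImage}"; then
        echo "drbd kernel module inject failed!"
        exit 1
    fi

    # If modules.dep only contains drbd entries, its previous contents
    # were lost, so recover them from the backup file.
    if [ "$(grep -cv drbd "$depFile")" -eq 0 ]; then
        echo "[WARN] modules.dep contents lost, we'll recover it with backup file..."
        # Append the new (drbd) entries that the backup does not have yet.
        # -F and -x compare whole lines literally instead of as a regex.
        while IFS= read -r line; do
            if ! grep -qxF -- "$line" "$backFile"; then
                echo "$line" >>"$backFile"
            fi
        done <"$depFile"
        # Put the merged file in place (-b keeps a copy of the old one).
        install -b -p "$backFile" "$depFile"
        depmod -a
    fi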
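
The merge step of the workaround above (append to the backup every line of the current modules.dep that the backup lacks) can also be sketched as a single awk pass. This is an illustrative alternative, not what the reporter ran: the function name `merge_dep_files` is hypothetical, and it assumes exact whole-line comparison is what is wanted.

```shell
# Print each distinct line once, in order of first appearance:
# all backup lines first, then any lines from the current file
# that the backup did not already contain.
# (merge_dep_files is a hypothetical helper name.)
merge_dep_files() {
    local backup_file="$1"
    local current_file="$2"
    # seen[$0]++ is 0 (false) the first time a line is read, so the
    # negation prints only the first occurrence of every line.
    awk '!seen[$0]++' "$backup_file" "$current_file"
}

# Example call with the variable names from the workaround:
# merge_dep_files "$backFile" "$depFile" > "${depFile}.merged"
```

Writing the result to a separate file first, as in the example call, avoids reading and overwriting modules.dep in the same pipeline.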
rck commented 2 years ago

Hm, weird. I have to look into this; it certainly should be fixed. FWIW, and not knowing your exact use case, the module injector mainly makes sense on container distributions with read-only file systems. On a normal distribution, make && make install from the tarball should work and would be easier.

ydcool commented 2 years ago

Our use case is to set up DRBD on various managed hosts, such as a newly created k8s cluster, without spending time on setting up a build environment for each OS and architecture, so this container injection is a lightweight solution for me.

ydcool commented 2 years ago

So far this issue has only occurred on aarch64 Kylin OS. I'm not sure whether it has something to do with that vendor, and I know very little about what changes the vendor has made to this CentOS-like OS ;-(

rck commented 2 years ago

I really cannot put my finger on it, but IIRC I once had to work with that distribution and wanted to run away, because it felt like a very bad clone where people don't know what they are actually copying around. Kernel macros? It was bad. Anyway, thanks for looking into it, but for now I consider it a bug in that distribution. If it shows up somewhere else, feel free to re-open this issue and I'm happy to debug it. For now I will close it.