longhorn / longhorn

Cloud-Native distributed storage built on and for Kubernetes
https://longhorn.io
Apache License 2.0

[QUESTION] Longhorn changes ctime of the files in volume after restart if SELinux is enabled #4157

Closed derekbit closed 2 years ago

derekbit commented 2 years ago

Question

Discussed in https://github.com/longhorn/longhorn/discussions/4130

Hi,

I am facing an issue where the ctime of the files in a Longhorn volume changes every time I start the pod.

stat command output:

  File: 'configuration.xml'
  Size: 16544           Blocks: 48         IO Block: 4096   regular file
Device: 810h/2064d      Inode: 12          Links: 1
Access: (0777/-rwxrwxrwx)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2022-06-17 15:29:56.000000000 +0800
Modify: 2022-06-15 17:08:49.000000000 +0800
Change: 2022-06-17 15:43:17.000000000 +0800
 Birth: -
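
In the output above, Change is two days newer than Modify, which suggests that something touched the file's metadata (owner, mode, or an extended attribute) rather than its contents. Below is a minimal Python sketch for checking which files show the same pattern across a pod restart. The function names `snapshot` and `metadata_only_changes` are illustrative and not part of Longhorn or any tool mentioned in this thread.

```python
import os


def snapshot(root):
    """Map each non-directory entry under root to its (mtime_ns, ctime_ns)."""
    result = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)  # do not follow symlinks
            result[os.path.relpath(path, root)] = (st.st_mtime_ns, st.st_ctime_ns)
    return result


def metadata_only_changes(before, after):
    """Return files whose ctime moved while their mtime stayed the same."""
    changed = []
    for rel, (mtime, ctime) in after.items():
        if rel not in before:
            continue  # file was created between the two snapshots
        old_mtime, old_ctime = before[rel]
        if mtime == old_mtime and ctime != old_ctime:
            changed.append(rel)
    return sorted(changed)
```

Take one snapshot of the mount path before stopping the pod and another after it starts again, then pass both to `metadata_only_changes`. If every file in the volume is listed, a recursive metadata operation is the likely cause.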

Any hints?

Thanks.

Environment

One-node cluster created by RKE2 on Red Hat 8.4 with SELinux enabled. Longhorn is installed by default, and the PVC is attached to /dev/longhorn/pvc-x--x-x-x-x-x-x-x--xxx

Additional context

https://github.com/longhorn/longhorn/discussions/4130

derekbit commented 2 years ago

The issue is caused by SELinux, as confirmed in the discussion https://github.com/longhorn/longhorn/discussions/4130. This ticket is to investigate whether there is a possible solution, from the Longhorn side, to the ctime change after a pod restart.

jwenjian commented 2 years ago

Additional info: I found a lot of audit messages in /var/log/messages:

Jun 21 15:27:30 HOSTNAME setroubleshoot[5928]: SELinux is preventing /csi-attacher from write access on the sock_file csi.sock. For complete SELinux messages run: sealert -l 3f84fe8c-0d59-4021-9990-297531242b7d
Jun 21 15:27:30 HOSTNAME setroubleshoot[5928]: SELinux is preventing /csi-attacher from write access on the sock_file csi.sock.#012#012*****  Plugin catchall_labels (83.8 confidence) suggests   *******************#012#012If you want to allow csi-attacher to have write access on the csi.sock sock_file#012Then you need to change the label on csi.sock#012Do#012# semanage fcontext -a -t FILE_TYPE 'csi.sock'#012where FILE_TYPE is one of the following: NetworkManager_var_run_t, abrt_var_run_t, aiccu_var_run_t, ajaxterm_var_run_t, alsa_var_run_t, antivirus_var_run_t, apcupsd_var_run_t, apmd_var_run_t, arpwatch_var_run_t, asterisk_var_run_t, audisp_var_run_t, auditd_var_run_t, automount_var_run_t, avahi_var_run_t, bacula_var_run_t, bcfg2_var_run_t, bitlbee_var_run_t, blkmapd_var_run_t, blktap_var_run_t, blueman_var_run_t, bluetooth_var_run_t, boltd_var_run_t, bootloader_var_run_t, brltty_var_run_t, bumblebee_var_run_t, cachefilesd_var_run_t, callweaver_var_run_t, canna_var_run_t, cardmgr_var_run_t, ccs_var_run_t, certmaster_var_run_t, certmonger_var_run_t, cgdcbxd_var_run_t, cgred_var_run_t, chronyd_var_run_t, cinder_var_run_t, clogd_var_run_t, cluster_var_run_t, clvmd_var_run_t, cmirrord_var_run_t, cockpit_var_run_t, collectd_var_run_t, comsat_var_run_t, condor_var_run_t, conman_var_run_t, conntrackd_var_run_t, consolekit_var_run_t, container_file_t, container_kvm_var_run_t, container_plugin_var_run_t, container_var_run_t, couchdb_var_run_t, courier_var_run_t, cpuplug_var_run_t, cpuspeed_var_run_t, cron_var_run_t, crond_var_run_t, ctdbd_var_run_t, cupsd_config_var_run_t, cupsd_lpd_var_run_t, cupsd_var_run_t, cvs_var_run_t, cyphesis_var_run_t, cyrus_var_run_t, dbskkd_var_run_t, dcc_var_run_t, dccd_var_run_t, dccifd_var_run_t, dccm_var_run_t, dcerpcd_var_run_t, ddclient_var_run_t, deltacloudd_var_run_t, devicekit_var_run_t, devlog_t, dhcpc_var_run_t, dhcpd_var_run_t, dictd_var_run_t, dirsrv_snmp_var_run_t, dirsrv_var_run_t, dkim_milter_data_t, dlm_controld_var_run_t, 
dnsmasq_var_run_t, dnssec_trigger_var_run_t, dovecot_var_run_t, drbd_var_run_t, dspam_var_run_t, entropyd_var_run_t, eventlogd_var_run_t, evtchnd_var_run_t, exim_var_run_t, fail2ban_var_run_t, fcoemon_var_run_t, fenced_var_run_t, fetchmail_var_run_t, fingerd_var_run_t, firewalld_var_run_t, foghorn_var_run_t, freeipmi_bmc_watchdog_var_run_t, freeipmi_ipmidetectd_var_run_t, freeipmi_ipmiseld_var_run_t, fsadm_var_run_t, fsdaemon_var_run_t, ftpd_var_run_t, fusefs_t, games_srv_var_run_t, gdomap_var_run_t, getty_var_run_t, gfs_controld_var_run_t, glance_var_run_t, glusterd_var_run_t, gpm_var_run_t, gpsd_var_run_t, greylist_milter_data_t, groupd_var_run_t, gssproxy_var_lib_t, gssproxy_var_run_t, haproxy_var_run_t, hostapd_var_run_t, httpd_var_run_t, hwloc_var_run_t, ibacm_var_run_t, icecast_var_run_t, ifconfig_var_run_t, inetd_child_var_run_t, inetd_var_run_t, init_var_run_t, initrc_var_run_t, innd_var_run_t, ipmievd_var_run_t, ipsec_mgmt_var_run_t, ipsec_var_run_t, iptables_var_lib_t, iptables_var_run_t, irqbalance_var_run_t, iscsi_var_run_t, isnsd_var_run_t, iwhd_var_run_t, jetty_var_run_t, kadmind_var_run_t, keepalived_var_run_t, keystone_var_run_t, kismet_var_run_t, klogd_var_run_t, kmod_var_run_t, krb5kdc_var_run_t, ksmtuned_var_run_t, l2tpd_var_run_t, lircd_var_run_t, lldpad_var_run_t, locate_var_run_t, logwatch_var_run_t, lpd_var_run_t, lsassd_var_run_t, lsmd_var_run_t, lttng_sessiond_var_run_t, lvm_var_run_t, lwiod_var_run_t, lwregd_var_run_t, lwsmd_var_run_t, mailman_var_run_t, mcelog_var_run_t, mdadm_var_run_t, memcached_var_run_t, minidlna_var_run_t, minissdpd_var_run_t, mirrormanager_var_run_t, mock_var_run_t, mon_statd_var_run_t, mongod_var_run_t, motion_var_run_t, mount_var_run_t, mpd_var_run_t, mrtg_var_run_t, mscan_var_run_t, munin_var_run_t, mysqld_var_run_t, mysqlmanagerd_var_run_t, naemon_var_run_t, nagios_var_run_t, named_var_run_t, netlogond_var_run_t, neutron_var_run_t, nfs_t, ninfod_run_t, nmbd_var_run_t, nova_var_run_t, nrpe_var_run_t, 
nscd_var_run_t, nsd_var_run_t, nslcd_var_run_t, ntop_var_run_t, ntpd_var_run_t, numad_var_run_t, nut_var_run_t, nx_server_var_run_t, oddjob_var_run_t, onload_fs_t, opafm_var_run_t, openct_var_run_t, opendnssec_var_run_t, openhpid_var_run_t, openshift_var_run_t, openvpn_var_run_t, openvswitch_var_run_t, openwsman_run_t, osad_var_run_t, pads_var_run_t, pam_var_console_t, pam_var_run_t, passenger_var_run_t, pcp_var_run_t, pcscd_var_run_t, pdns_var_run_t, pegasus_openlmi_storage_var_run_t, pegasus_var_run_t, pesign_var_run_t, piranha_fos_var_run_t, piranha_lvs_var_run_t, piranha_pulse_var_run_t, piranha_web_var_run_t, pkcs11proxyd_var_run_t, pkcs_slotd_var_run_t, pki_ra_var_run_t, pki_tomcat_var_run_t, pki_tps_var_run_t, plymouthd_var_run_t, policykit_var_run_t, polipo_pid_t, portmap_var_run_t, portreserve_var_run_t, postfix_var_run_t, postgresql_var_run_t, postgrey_var_run_t, pppd_var_run_t, pptp_var_run_t, prelude_audisp_var_run_t, prelude_lml_var_run_t, prelude_var_run_t, privoxy_var_run_t, prosody_var_run_t, psad_var_run_t, ptal_var_run_t, pulseaudio_var_run_t, puppet_var_run_t, pwauth_var_run_t, pyicqt_var_run_t, qdiskd_var_run_t, qemu_var_run_t, qpidd_var_run_t, quota_nld_var_run_t, rabbitmq_var_run_t, radiusd_var_run_t, radvd_var_run_t, readahead_var_run_t, redis_var_run_t, regex_milter_data_t, restorecond_var_run_t, rhev_agentd_var_run_t, rhnsd_var_run_t, rhsmcertd_var_run_t, ricci_modcluster_var_run_t, ricci_var_run_t, rlogind_var_run_t, rngd_var_run_t, roundup_var_run_t, rpcbind_var_run_t, rpcd_var_run_t, rpm_var_run_t, rrdcached_var_run_t, rsync_var_run_t, rtas_errd_var_run_t, sanlock_var_run_t, saslauthd_var_run_t, sbd_var_run_t, sblim_var_run_t, screen_var_run_t, sendmail_var_run_t, sensord_var_run_t, setrans_var_run_t, setroubleshoot_var_run_t, slapd_var_run_t, slpd_var_run_t, smbd_var_run_t, smokeping_var_run_t, smsd_var_run_t, snmpd_var_run_t, snort_var_run_t, sosreport_var_run_t, soundd_var_run_t, spamass_milter_data_t, spamd_var_run_t, spc_var_run_t, 
squid_var_run_t, srvsvcd_var_run_t, sshd_var_run_t, sslh_var_run_t, sssd_public_t, sssd_var_lib_t, sssd_var_run_t, stapserver_var_run_t, stratisd_var_run_t, stunnel_var_run_t, svirt_home_t, svnserve_var_run_t, swat_var_run_t, swift_var_run_t, syslogd_var_run_t, system_cronjob_var_run_t, system_dbusd_var_run_t, systemd_bootchart_var_run_t, systemd_importd_var_run_t, systemd_logind_inhibit_var_run_t, systemd_logind_sessions_t, systemd_logind_var_run_t, systemd_machined_var_run_t, systemd_networkd_var_run_t, systemd_passwd_var_run_t, systemd_resolved_var_run_t, systemd_timedated_var_run_t, tangd_cache_t, telnetd_var_run_t, tftpd_var_run_t, tgtd_var_run_t, thin_aeolus_configserver_var_run_t, thin_var_run_t, timemaster_var_run_t, tlp_var_run_t, tomcat_var_run_t, tor_var_run_t, tuned_var_run_t, udev_var_run_t, uml_switch_var_run_t, usbmuxd_var_run_t, useradd_var_run_t, uucpd_var_run_t, uuidd_var_run_t, var_run_t, varnishd_var_run_t, varnishlog_var_run_t, vdagent_var_run_t, vhostmd_var_run_t, virt_lxc_var_run_t, virt_qemu_ga_var_run_t, virt_var_run_t, virtlogd_var_run_t, vmware_host_pid_t, vmware_pid_t, vnstatd_var_run_t, vpnc_var_run_t, watchdog_var_run_t, wdmd_var_run_t, winbind_var_run_t, xdm_var_run_t, xenconsoled_var_run_t, xend_var_run_t, xenstored_var_run_t, xserver_var_run_t, ypbind_var_run_t, yppasswdd_var_run_t, ypserv_var_run_t, ypxfr_var_run_t, zabbix_var_run_t, zarafa_deliver_var_run_t, zarafa_gateway_var_run_t, zarafa_ical_var_run_t, zarafa_indexer_var_run_t, zarafa_monitor_var_run_t, zarafa_server_var_run_t, zarafa_spooler_var_run_t, zebra_var_run_t, zoneminder_var_run_t.#012Then execute:#012restorecon -v 'csi.sock'#012#012#012*****  Plugin catchall (17.1 confidence) suggests   **************************#012#012If you believe that csi-attacher should be allowed write access on the csi.sock sock_file by default.#012Then you should report this as a bug.#012You can generate a local policy module to allow this access.#012Do#012allow this access for now by 
executing:#012# ausearch -c 'csi-attacher' --raw | audit2allow -M my-csiattacher#012# semodule -X 300 -i my-csiattacher.pp#012
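
The `#012` sequences in the message above look like syslog's octal escaping of newline characters (012 is octal for line feed). That is an assumption about how this host's syslog daemon is configured. Under that assumption, a small Python sketch like the following makes such a line readable again. The helper name `decode_syslog_escapes` is made up for illustration.

```python
import re

# "#" followed by exactly three octal digits, e.g. "#012"
_ESCAPE = re.compile(r"#([0-7]{3})")


def decode_syslog_escapes(line):
    """Turn escaped control characters such as "#012" back into the real characters."""

    def replace(match):
        code = int(match.group(1), 8)
        # Only decode control characters; leave any other "#nnn" text untouched.
        if code < 32 or code == 127:
            return chr(code)
        return match.group(0)

    return _ESCAPE.sub(replace, line)
```

For example, `print(decode_syslog_escapes(line))` on the second log line would put each suggestion from setroubleshoot on its own line.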
derekbit commented 2 years ago

@jwenjian I cannot reproduce the issue in my environment with SELinux in enforcing mode. Could you provide the steps or manifests?

jwenjian commented 2 years ago

Sorry, I can't provide the manifest. I hope the info above helps.

derekbit commented 2 years ago

@jwenjian Thanks for the information. I'm still unable to reproduce it, but this looks related to the application pod's securityContext or SELinux configuration rather than to Longhorn's settings.

My current thought is that something is wrong with the SELinux context, or the SELinux context is missing, when the application pod starts, so recursive SELinux relabeling is triggered.
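
A relabel would fit the reported timestamps: an SELinux label is stored in the `security.selinux` extended attribute, so rewriting it changes the inode but not the file data, which moves ctime and leaves mtime alone. The Python sketch below illustrates that mechanism under stated assumptions. It uses `chmod` as a stand-in for relabeling, since both are metadata-only changes, and the function names are hypothetical.

```python
import os
import stat
import time


def selinux_label(path):
    """Return the file's SELinux context, or None if it has none or xattrs are unavailable."""
    try:
        raw = os.getxattr(path, "security.selinux")  # Linux-only call
    except (OSError, AttributeError):
        return None
    return raw.rstrip(b"\x00").decode()


def metadata_change_effect(path):
    """Apply a metadata-only change and report which timestamps moved.

    chmod stands in for a relabel here: neither one writes file data.
    """
    before = os.stat(path)
    time.sleep(0.1)  # step past the filesystem's timestamp granularity
    mode = stat.S_IMODE(before.st_mode)
    os.chmod(path, mode ^ stat.S_IXOTH)  # flip one permission bit
    os.chmod(path, mode)  # restore the original mode
    after = os.stat(path)
    return {
        "mtime_changed": after.st_mtime_ns != before.st_mtime_ns,
        "ctime_changed": after.st_ctime_ns != before.st_ctime_ns,
    }
```

Comparing `selinux_label` for the same file before and after a pod restart would show whether the label itself is changing, for instance to a different category pair, or is being rewritten with the same value.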

jwenjian commented 2 years ago

Hi, further information: the issue still reproduces with SELinux set to permissive mode; only setting SELinux to disabled fixes it.
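
This observation would be consistent with the relabeling idea, because permissive mode still loads the policy and keeps labeling files; it only stops denying access. To record which mode a node is actually in when the ctime changes, a sketch like the one below could be used. It assumes the standard selinuxfs mount point `/sys/fs/selinux`, and `selinux_mode` is an illustrative name.

```python
import os


def selinux_mode(selinuxfs="/sys/fs/selinux"):
    """Return "enforcing", "permissive", or "disabled" for the running kernel."""
    enforce_path = os.path.join(selinuxfs, "enforce")
    try:
        with open(enforce_path) as handle:
            value = handle.read().strip()
    except OSError:
        # selinuxfs is not mounted, so SELinux is treated as disabled.
        return "disabled"
    return "enforcing" if value == "1" else "permissive"
```

The `selinuxfs` parameter exists only so the function can be pointed at a test directory; on a real node the default should be what you want.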

Any hints on how to debug this? More VMs in our environment are hitting this issue now...

jwenjian commented 2 years ago

This workload also has an init container that mounts the same Longhorn volume.

I noticed that the ctime of the file has already changed in the init container, which is just a busybox image.

derekbit commented 2 years ago

@jwenjian I cannot reproduce this issue on my side. I would suggest you provide a simple setup/configuration and the pod/volume manifests to help us identify the issue.

BTW, I notice your host is Red Hat 8.4. Does it also happen on other OSes such as Ubuntu?

jwenjian commented 2 years ago

After further investigation, this looks more likely to be caused by RKE2 / containerd working with SELinux enabled. See https://github.com/containerd/containerd/discussions/7178 for further discussion. Thanks for your support.