dynup / kpatch

kpatch - live kernel patching
GNU General Public License v2.0
1.46k stars 300 forks source link

Panic on loading module for /proc/meminfo example on vanilla 4.1 #497

Closed vincentbernat closed 8 years ago

vincentbernat commented 9 years ago

Hello!

Still on the same vanilla 4.1 kernel, I get this when loading the patch with kpatch load:

loading core module: /root/src/kpatch/kpatch/../kmod/core/kpatch.ko
loading patch module: kpatch-meminfo-string.ko
BUG: unable to handle kernel paging request at ffffffffa0010cc0
IP: [<ffffffff8125ecb0>] do_init_module+0x84/0x1af
PGD 13d3067 PUD 13d4063 PMD 1e1ee067 PTE 1e1a0161
Oops: 0003 [#1]
Modules linked in: kpatch_meminfo_string(O+) kpatch(O)
CPU: 0 PID: 149 Comm: insmod Tainted: G           O  K 4.1.0+ #1
task: ffff88001e17b810 ti: ffff88001e1cc000 task.ti: ffff88001e1cc000
RIP: 0010:[<ffffffff8125ecb0>]  [<ffffffff8125ecb0>] do_init_module+0x84/0x1af
RSP: 0018:ffff88001e1cfda8  EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffffffffa0010cc0 RCX: 0000000080a02001
RDX: 0000000000000024 RSI: 0000000000000000 RDI: ffffffff813fabe0
RBP: 0000000000000000 R08: 0000000000000000 R09: 00000000d0000000
R10: ffffffffa000e000 R11: 0000000000000001 R12: ffff88001eb58638
R13: ffffffffa0010d10 R14: 0000000000000001 R15: 0000000000000000
FS:  00007f0ae00aa700(0000) GS:ffffffff813e1000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: ffffffffa0010cc0 CR3: 000000001e181000 CR4: 00000000000006b0
Stack:
 ffff88001e1cfed8 0000000000000001 ffffffffa0010cc0 ffffffff81058aac
 ffff88001e207680 00000000810a462f ffffc90000096890 0000000000000e00
 ffffffff00000016 ffffffff8126cd40 ffff88001eaa6a08 ffff88001e1cfe48
Call Trace:
 [<ffffffff81058aac>] ? load_module+0x18ad/0x18e9
 [<ffffffff81056290>] ? copy_module_from_fd+0x86/0xdf
 [<ffffffff81058c1e>] ? SyS_finit_module+0x56/0x61
 [<ffffffff81261854>] ? system_call_fastpath+0x12/0x6a
Code: f8 00 00 00 74 23 49 c7 c0 80 ca 26 81 48 8d 53 18 89 c1 4c 89 c6 48 c7 c7 6d ef 36 81 31 c0 e8 16 fb ff ff e8 18 06 00 00 31 f6 <c7> 03 00 00 00 00 48 89 da 48 c7 c7 c0 c9 3f 81 e8 7e b3 dd ff
RIP  [<ffffffff8125ecb0>] do_init_module+0x84/0x1af
 RSP <ffff88001e1cfda8>
CR2: ffffffffa0010cc0
---[ end trace 559a193e6db7735e ]---

I have tried to debug a bit, but I have no clue on how to load the symbols from the module while I can't get .text and .data from sysfs. Is there another way to get the address the .text section was loaded?

If I casually trace with gdb, the kernel doesn't panic when initializing the module per-se, but a bit latter:

(gdb) b do_init_module
Note: breakpoints 1 and 2 also set at pc 0xffffffff8125ec2c.
Breakpoint 3 at 0xffffffff8125ec2c: file kernel/module.c, line 3056.
(gdb) continue
Continuing.

Breakpoint 1, do_init_module (mod=0xffffffffa0001cc0) at kernel/module.c:3056
3056    {
(gdb) continue
Continuing.

Breakpoint 1, do_init_module (mod=0xffffffffa0010cc0) at kernel/module.c:3056
3056    {
(gdb) n
3060            freeinit = kmalloc(sizeof(*freeinit), GFP_KERNEL);
(gdb)
3056    {
(gdb)
3060            freeinit = kmalloc(sizeof(*freeinit), GFP_KERNEL);
(gdb)
3061            if (!freeinit) {
(gdb)
3060            freeinit = kmalloc(sizeof(*freeinit), GFP_KERNEL);
(gdb)
3061            if (!freeinit) {
(gdb)
3065            freeinit->module_init = mod->module_init;
(gdb)
3071            current->flags &= ~PF_USED_ASYNC;
(gdb)
3075            if (mod->init != NULL)
(gdb)
3076                    ret = do_one_initcall(mod->init);
(gdb) print mod->init
$3 = (int (*)(void)) 0xffffffffa0013000
(gdb) print *mod
$4 = {
  state = MODULE_STATE_COMING,
  list = {
    next = 0xffffffffa0001cc8,
    prev = 0xffffffff813fc9f0 <modules>
  },
  name = "kpatch_meminfo_string", '\000' <repeats 34 times>,
  mkobj = {
    kobj = {
      name = 0xffff88001e183a20 "kpatch_meminfo_string",
      entry = {
        next = 0xffff880000151cb0,
        prev = 0xffffffffa0001d18
      },
      parent = 0xffff880000151cc0,
      kset = 0xffff880000151cb0,
      ktype = 0xffffffff813f76c0 <module_ktype>,
      sd = 0xffff88001e183f80,
      kref = {
        refcount = {
          counter = 3
        }
      },
      state_initialized = 1,
      state_in_sysfs = 1,
      state_add_uevent_sent = 1,
      state_remove_uevent_sent = 0,
      uevent_suppress = 0
    },
    mod = 0xffffffffa0010cc0,
    drivers_dir = 0x0,
    mp = 0x0,
    kobj_completion = 0x0
  },
  modinfo_attrs = 0xffff88001e1ea668,
  version = 0x0,
  srcversion = 0x0,
  holders_dir = 0xffff88001e9aef98,
  syms = 0x0,
  crcs = 0x0,
  num_syms = 0,
  kp = 0x0,
  num_kp = 0,
  num_gpl_syms = 0,
  gpl_syms = 0x0,
  gpl_crcs = 0x0,
  gpl_future_syms = 0x0,
  gpl_future_crcs = 0x0,
  num_gpl_future_syms = 0,
  num_exentries = 0,
  extable = 0x0,
  init = 0xffffffffa0013000,
  module_init = 0xffffffffa0013000,
  module_core = 0xffffffffa0010000,
  init_size = 3181,
  core_size = 4790,
  init_text_size = 687,
  core_text_size = 1602,
  init_ro_size = 687,
  core_ro_size = 3153,
  arch = {<No data fields>},
  taints = 4096,
  symtab = 0xffffffffa00132b0,
  core_symtab = 0xffffffffa0010ef0,
  num_symtab = 76,
  core_num_syms = 23,
  strtab = 0xffffffffa00139d0 "",
  core_strtab = 0xffffffffa0011118 "",
  sect_attrs = 0xffff88001e21e838,
  notes_attrs = 0xffff88001e01be78,
  args = 0xffff88001e183248 "",
  num_tracepoints = 0,
  tracepoints_ptrs = 0x0,
  num_trace_bprintk_fmt = 0,
  trace_bprintk_fmt_start = 0x0,
  trace_events = 0x0,
  num_trace_events = 0,
  trace_enums = 0x0,
  num_trace_enums = 0,
  num_ftrace_callsites = 1,
  ftrace_callsites = 0xffffffffa0010c28,
  klp_alive = true,
  source_list = {
    next = 0xffffffffa0010eb0,
    prev = 0xffffffffa0010eb0
  },
  target_list = {
    next = 0xffffffffa0010ec0,
    prev = 0xffffffffa0010ec0
  },
  exit = 0xffffffffa0010215,
  refcnt = {
    counter = 2
  }
}
(gdb) n
3077            if (ret < 0) {
(gdb)
3076                    ret = do_one_initcall(mod->init);
(gdb)
3077            if (ret < 0) {
(gdb)
3080            if (ret > 0) {
(gdb)
3090            blocking_notifier_call_chain(&module_notify_list,
(gdb)
3089            mod->state = MODULE_STATE_LIVE;
(gdb) n
Remote connection closed
jpoimboe commented 9 years ago

This is a weird one. I'm not able to recreate.

I have tried to debug a bit, but I have no clue on how to load the symbols from the module while I can't get .text and .data from sysfs. Is there another way to get the address the .text section was loaded?

You can do something like:

cat /sys/module/kpatch/sections/{.text,.data,.bss}
0xffffffffa02f8000
0xffffffffa02fa000
0xffffffffa02fa300
add-symbol-file kpatch.ko 0xffffffffa02f8000 -s .data 0xffffffffa02fa000 -s .bss 0xffffffffa02fa300
vincentbernat commented 9 years ago

❦ 17 août 2015 15:04 -0700, Josh Poimboeuf notifications@github.com :

This is a weird one. I'm not able to recreate.

I have tried to debug a bit, but I have no clue on how to load the
symbols from the module while I can't get .text and .data from
sysfs. Is there another way to get the address the .text section
was loaded?

You can do something like:

cat /sys/module/kpatch/sections/{.text,.data,.bss} 0xffffffffa02f8000 0xffffffffa02fa000 0xffffffffa02fa300 add-symbol-file kpatch.ko 0xffffffffa02f8000 -s .data 0xffffffffa02fa000 -s .bss 0xffffffffa02fa300

I have no problem with the kpatch module. It's the other one that makes the panic. And as soon as I load it, the kernel panics, so I can't look at the content of /sys/module/kpatch-meminfo-string/sections... The best I can do is access to the appropriate struct module * in gdb.

I am using gcc 4.9.3 from Debian:

Using built-in specs.
COLLECT_GCC=/usr/bin/gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/4.9/lto-wrapper
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Debian 4.9.3-3' --with-bugurl=file:///usr/share/doc/gcc-4.9/README.Bugs --enable-languages=c,c++,java,go,d,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-4.9 --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --with-gxx-include-dir=/usr/include/c++/4.9 --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-gnu-unique-object --disable-vtable-verify --enable-plugin --with-system-zlib --disable-browser-plugin --enable-java-awt=gtk --enable-gtk-cairo --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-4.9-amd64/jre --enable-java-home --with-jvm-root-dir=/usr/lib/jvm/java-1.5.0-gcj-4.9-amd64 --with-jvm-jar-dir=/usr/lib/jvm-exports/java-1.5.0-gcj-4.9-amd64 --with-arch-directory=amd64 --with-ecj-jar=/usr/share/java/eclipse-ecj.jar --enable-objc-gc --enable-multiarch --with-arch-32=i586 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 4.9.3 (Debian 4.9.3-3)

My .config is quite small:

CONFIG_64BIT=y
CONFIG_X86_64=y
CONFIG_X86=y
CONFIG_INSTRUCTION_DECODER=y
CONFIG_PERF_EVENTS_INTEL_UNCORE=y
CONFIG_OUTPUT_FORMAT="elf64-x86-64"
CONFIG_ARCH_DEFCONFIG="arch/x86/configs/x86_64_defconfig"
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_HAVE_LATENCYTOP_SUPPORT=y
CONFIG_MMU=y
CONFIG_NEED_DMA_MAP_STATE=y
CONFIG_NEED_SG_DMA_LENGTH=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_HWEIGHT=y
CONFIG_ARCH_MAY_HAVE_PC_FDC=y
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_ARCH_HAS_CPU_RELAX=y
CONFIG_ARCH_HAS_CACHE_LINE_SIZE=y
CONFIG_HAVE_SETUP_PER_CPU_AREA=y
CONFIG_NEED_PER_CPU_EMBED_FIRST_CHUNK=y
CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK=y
CONFIG_ARCH_HIBERNATION_POSSIBLE=y
CONFIG_ARCH_SUSPEND_POSSIBLE=y
CONFIG_ARCH_WANT_HUGE_PMD_SHARE=y
CONFIG_ARCH_WANT_GENERAL_HUGETLB=y
CONFIG_ZONE_DMA32=y
CONFIG_AUDIT_ARCH=y
CONFIG_ARCH_SUPPORTS_OPTIMIZED_INLINING=y
CONFIG_ARCH_SUPPORTS_DEBUG_PAGEALLOC=y
CONFIG_ARCH_HWEIGHT_CFLAGS="-fcall-saved-rdi -fcall-saved-rsi -fcall-saved-rdx -fcall-saved-rcx -fcall-saved-r8 -fcall-saved-r9 -fcall-saved-r10 -fcall-saved-r11"
CONFIG_ARCH_SUPPORTS_UPROBES=y
CONFIG_FIX_EARLYCON_MEM=y
CONFIG_PGTABLE_LEVELS=4
CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config"
CONFIG_IRQ_WORK=y
CONFIG_BUILDTIME_EXTABLE_SORT=y
CONFIG_BROKEN_ON_SMP=y
CONFIG_INIT_ENV_ARG_LIMIT=32
CONFIG_CROSS_COMPILE=""
CONFIG_LOCALVERSION=""
CONFIG_HAVE_KERNEL_GZIP=y
CONFIG_HAVE_KERNEL_BZIP2=y
CONFIG_HAVE_KERNEL_LZMA=y
CONFIG_HAVE_KERNEL_XZ=y
CONFIG_HAVE_KERNEL_LZO=y
CONFIG_HAVE_KERNEL_LZ4=y
CONFIG_KERNEL_XZ=y
CONFIG_DEFAULT_HOSTNAME="(none)"
CONFIG_SWAP=y
CONFIG_USELIB=y
CONFIG_HAVE_ARCH_AUDITSYSCALL=y
CONFIG_GENERIC_IRQ_PROBE=y
CONFIG_GENERIC_IRQ_SHOW=y
CONFIG_GENERIC_IRQ_LEGACY_ALLOC_HWIRQ=y
CONFIG_IRQ_DOMAIN=y
CONFIG_GENERIC_MSI_IRQ=y
CONFIG_IRQ_FORCED_THREADING=y
CONFIG_SPARSE_IRQ=y
CONFIG_CLOCKSOURCE_WATCHDOG=y
CONFIG_ARCH_CLOCKSOURCE_DATA=y
CONFIG_CLOCKSOURCE_VALIDATE_LAST_CYCLE=y
CONFIG_GENERIC_TIME_VSYSCALL=y
CONFIG_GENERIC_CLOCKEVENTS=y
CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y
CONFIG_GENERIC_CLOCKEVENTS_MIN_ADJUST=y
CONFIG_GENERIC_CMOS_UPDATE=y
CONFIG_HZ_PERIODIC=y
CONFIG_TICK_CPU_ACCOUNTING=y
CONFIG_TINY_RCU=y
CONFIG_SRCU=y
CONFIG_RCU_KTHREAD_PRIO=0
CONFIG_LOG_BUF_SHIFT=17
CONFIG_HAVE_UNSTABLE_SCHED_CLOCK=y
CONFIG_ARCH_SUPPORTS_NUMA_BALANCING=y
CONFIG_ARCH_SUPPORTS_INT128=y
CONFIG_BLK_DEV_INITRD=y
CONFIG_INITRAMFS_SOURCE=""
CONFIG_RD_GZIP=y
CONFIG_RD_BZIP2=y
CONFIG_RD_LZMA=y
CONFIG_RD_XZ=y
CONFIG_RD_LZO=y
CONFIG_RD_LZ4=y
CONFIG_CC_OPTIMIZE_FOR_SIZE=y
CONFIG_SYSCTL=y
CONFIG_ANON_INODES=y
CONFIG_SYSCTL_EXCEPTION_TRACE=y
CONFIG_HAVE_PCSPKR_PLATFORM=y
CONFIG_BPF=y
CONFIG_EXPERT=y
CONFIG_MULTIUSER=y
CONFIG_SGETMASK_SYSCALL=y
CONFIG_SYSFS_SYSCALL=y
CONFIG_KALLSYMS=y
CONFIG_KALLSYMS_ALL=y
CONFIG_PRINTK=y
CONFIG_SHMEM=y
CONFIG_ADVISE_SYSCALLS=y
CONFIG_PCI_QUIRKS=y
CONFIG_EMBEDDED=y
CONFIG_HAVE_PERF_EVENTS=y
CONFIG_PERF_EVENTS=y
CONFIG_SLOB=y
CONFIG_TRACEPOINTS=y
CONFIG_HAVE_OPROFILE=y
CONFIG_OPROFILE_NMI_TIMER=y
CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS=y
CONFIG_ARCH_USE_BUILTIN_BSWAP=y
CONFIG_HAVE_IOREMAP_PROT=y
CONFIG_HAVE_KPROBES=y
CONFIG_HAVE_KRETPROBES=y
CONFIG_HAVE_OPTPROBES=y
CONFIG_HAVE_KPROBES_ON_FTRACE=y
CONFIG_HAVE_ARCH_TRACEHOOK=y
CONFIG_HAVE_DMA_ATTRS=y
CONFIG_HAVE_DMA_CONTIGUOUS=y
CONFIG_GENERIC_SMP_IDLE_THREAD=y
CONFIG_HAVE_REGS_AND_STACK_ACCESS_API=y
CONFIG_HAVE_DMA_API_DEBUG=y
CONFIG_HAVE_HW_BREAKPOINT=y
CONFIG_HAVE_MIXED_BREAKPOINTS_REGS=y
CONFIG_HAVE_USER_RETURN_NOTIFIER=y
CONFIG_HAVE_PERF_EVENTS_NMI=y
CONFIG_HAVE_PERF_REGS=y
CONFIG_HAVE_PERF_USER_STACK_DUMP=y
CONFIG_HAVE_ARCH_JUMP_LABEL=y
CONFIG_ARCH_HAVE_NMI_SAFE_CMPXCHG=y
CONFIG_HAVE_CMPXCHG_LOCAL=y
CONFIG_HAVE_CMPXCHG_DOUBLE=y
CONFIG_HAVE_ARCH_SECCOMP_FILTER=y
CONFIG_HAVE_CC_STACKPROTECTOR=y
CONFIG_CC_STACKPROTECTOR_NONE=y
CONFIG_HAVE_CONTEXT_TRACKING=y
CONFIG_HAVE_VIRT_CPU_ACCOUNTING_GEN=y
CONFIG_HAVE_IRQ_TIME_ACCOUNTING=y
CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE=y
CONFIG_HAVE_ARCH_HUGE_VMAP=y
CONFIG_HAVE_ARCH_SOFT_DIRTY=y
CONFIG_MODULES_USE_ELF_RELA=y
CONFIG_HAVE_IRQ_EXIT_ON_IRQ_STACK=y
CONFIG_ARCH_HAS_ELF_RANDOMIZE=y
CONFIG_ARCH_HAS_GCOV_PROFILE_ALL=y
CONFIG_BASE_SMALL=1
CONFIG_MODULES=y
CONFIG_MODULE_UNLOAD=y
CONFIG_BLOCK=y
CONFIG_BLK_DEV_BSG=y
CONFIG_MSDOS_PARTITION=y
CONFIG_EFI_PARTITION=y
CONFIG_IOSCHED_NOOP=y
CONFIG_IOSCHED_DEADLINE=y
CONFIG_IOSCHED_CFQ=y
CONFIG_DEFAULT_CFQ=y
CONFIG_DEFAULT_IOSCHED="cfq"
CONFIG_INLINE_SPIN_UNLOCK_IRQ=y
CONFIG_INLINE_READ_UNLOCK=y
CONFIG_INLINE_READ_UNLOCK_IRQ=y
CONFIG_INLINE_WRITE_UNLOCK=y
CONFIG_INLINE_WRITE_UNLOCK_IRQ=y
CONFIG_ARCH_SUPPORTS_ATOMIC_RMW=y
CONFIG_ARCH_USE_QUEUE_RWLOCK=y
CONFIG_X86_FEATURE_NAMES=y
CONFIG_X86_MPPARSE=y
CONFIG_NO_BOOTMEM=y
CONFIG_GENERIC_CPU=y
CONFIG_X86_INTERNODE_CACHE_SHIFT=6
CONFIG_X86_L1_CACHE_SHIFT=6
CONFIG_X86_TSC=y
CONFIG_X86_CMPXCHG64=y
CONFIG_X86_CMOV=y
CONFIG_X86_MINIMUM_CPU_FAMILY=64
CONFIG_X86_DEBUGCTLMSR=y
CONFIG_CPU_SUP_INTEL=y
CONFIG_CPU_SUP_AMD=y
CONFIG_CPU_SUP_CENTAUR=y
CONFIG_HPET_TIMER=y
CONFIG_SWIOTLB=y
CONFIG_IOMMU_HELPER=y
CONFIG_NR_CPUS=1
CONFIG_PREEMPT_NONE=y
CONFIG_UP_LATE_INIT=y
CONFIG_X86_LOCAL_APIC=y
CONFIG_X86_IO_APIC=y
CONFIG_X86_16BIT=y
CONFIG_X86_ESPFIX64=y
CONFIG_X86_VSYSCALL_EMULATION=y
CONFIG_ARCH_PHYS_ADDR_T_64BIT=y
CONFIG_ARCH_DMA_ADDR_T_64BIT=y
CONFIG_X86_DIRECT_GBPAGES=y
CONFIG_ARCH_SPARSEMEM_ENABLE=y
CONFIG_ARCH_SPARSEMEM_DEFAULT=y
CONFIG_ARCH_SELECT_MEMORY_MODEL=y
CONFIG_ILLEGAL_POINTER_VALUE=0xdead000000000000
CONFIG_SELECT_MEMORY_MODEL=y
CONFIG_SPARSEMEM_MANUAL=y
CONFIG_SPARSEMEM=y
CONFIG_HAVE_MEMORY_PRESENT=y
CONFIG_SPARSEMEM_EXTREME=y
CONFIG_SPARSEMEM_VMEMMAP_ENABLE=y
CONFIG_SPARSEMEM_ALLOC_MEM_MAP_TOGETHER=y
CONFIG_SPARSEMEM_VMEMMAP=y
CONFIG_HAVE_MEMBLOCK=y
CONFIG_HAVE_MEMBLOCK_NODE_MAP=y
CONFIG_ARCH_DISCARD_MEMBLOCK=y
CONFIG_PAGEFLAGS_EXTENDED=y
CONFIG_SPLIT_PTLOCK_CPUS=4
CONFIG_ARCH_ENABLE_SPLIT_PMD_PTLOCK=y
CONFIG_PHYS_ADDR_T_64BIT=y
CONFIG_ZONE_DMA_FLAG=0
CONFIG_VIRT_TO_BUS=y
CONFIG_DEFAULT_MMAP_MIN_ADDR=4096
CONFIG_NEED_PER_CPU_KM=y
CONFIG_GENERIC_EARLY_IOREMAP=y
CONFIG_X86_RESERVE_LOW=64
CONFIG_HZ_250=y
CONFIG_HZ=250
CONFIG_PHYSICAL_START=0x1000000
CONFIG_PHYSICAL_ALIGN=0x1000000
CONFIG_HAVE_LIVEPATCH=y
CONFIG_LIVEPATCH=y
CONFIG_ARCH_ENABLE_MEMORY_HOTPLUG=y
CONFIG_ACPI=y
CONFIG_ACPI_LEGACY_TABLES_LOOKUP=y
CONFIG_ARCH_MIGHT_HAVE_ACPI_PDC=y
CONFIG_ACPI_SYSTEM_POWER_STATES_SUPPORT=y
CONFIG_ACPI_AC=y
CONFIG_ACPI_BATTERY=y
CONFIG_ACPI_BUTTON=y
CONFIG_ACPI_FAN=y
CONFIG_ACPI_PROCESSOR=y
CONFIG_ACPI_THERMAL=y
CONFIG_ACPI_CUSTOM_DSDT_FILE=""
CONFIG_X86_PM_TIMER=y
CONFIG_ACPI_HOTPLUG_IOAPIC=y
CONFIG_HAVE_ACPI_APEI=y
CONFIG_HAVE_ACPI_APEI_NMI=y
CONFIG_CPU_IDLE=y
CONFIG_CPU_IDLE_GOV_LADDER=y
CONFIG_CPU_IDLE_GOV_MENU=y
CONFIG_PCI=y
CONFIG_PCI_DIRECT=y
CONFIG_PCI_MMCONFIG=y
CONFIG_PCI_DOMAINS=y
CONFIG_PCI_MSI=y
CONFIG_HT_IRQ=y
CONFIG_PCI_LABEL=y
CONFIG_ISA_DMA_API=y
CONFIG_AMD_NB=y
CONFIG_BINFMT_ELF=y
CONFIG_BINFMT_SCRIPT=y
CONFIG_BINFMT_MISC=y
CONFIG_X86_DEV_DMA_OPS=y
CONFIG_PMC_ATOM=y
CONFIG_NET=y
CONFIG_UNIX=y
CONFIG_XFRM=y
CONFIG_INET=y
CONFIG_IP_PNP=y
CONFIG_IP_PNP_DHCP=y
CONFIG_NET_IP_TUNNEL=y
CONFIG_INET_TUNNEL=y
CONFIG_INET_XFRM_MODE_TRANSPORT=y
CONFIG_INET_XFRM_MODE_TUNNEL=y
CONFIG_INET_XFRM_MODE_BEET=y
CONFIG_INET_LRO=y
CONFIG_INET_DIAG=y
CONFIG_INET_TCP_DIAG=y
CONFIG_TCP_CONG_CUBIC=y
CONFIG_DEFAULT_TCP_CONG="cubic"
CONFIG_IPV6=y
CONFIG_INET6_XFRM_MODE_TRANSPORT=y
CONFIG_INET6_XFRM_MODE_TUNNEL=y
CONFIG_INET6_XFRM_MODE_BEET=y
CONFIG_IPV6_SIT=y
CONFIG_IPV6_NDISC_NODETYPE=y
CONFIG_HAVE_NET_DSA=y
CONFIG_NET_RX_BUSY_POLL=y
CONFIG_BQL=y
CONFIG_WIRELESS=y
CONFIG_NET_9P=y
CONFIG_NET_9P_VIRTIO=y
CONFIG_HAVE_BPF_JIT=y
CONFIG_UEVENT_HELPER=y
CONFIG_UEVENT_HELPER_PATH=""
CONFIG_DEVTMPFS=y
CONFIG_ALLOW_DEV_COREDUMP=y
CONFIG_GENERIC_CPU_AUTOPROBE=y
CONFIG_ARCH_MIGHT_HAVE_PC_PARPORT=y
CONFIG_PNP=y
CONFIG_PNP_DEBUG_MESSAGES=y
CONFIG_PNPACPI=y
CONFIG_BLK_DEV=y
CONFIG_VIRTIO_BLK=y
CONFIG_HAVE_IDE=y
CONFIG_SCSI_MOD=y
CONFIG_NETDEVICES=y
CONFIG_NET_CORE=y
CONFIG_VETH=y
CONFIG_VIRTIO_NET=m
CONFIG_ETHERNET=y
CONFIG_NET_VENDOR_3COM=y
CONFIG_NET_VENDOR_ADAPTEC=y
CONFIG_NET_VENDOR_AGERE=y
CONFIG_NET_VENDOR_ALTEON=y
CONFIG_NET_VENDOR_AMD=y
CONFIG_NET_VENDOR_ARC=y
CONFIG_NET_VENDOR_ATHEROS=y
CONFIG_NET_CADENCE=y
CONFIG_NET_VENDOR_BROADCOM=y
CONFIG_NET_VENDOR_BROCADE=y
CONFIG_NET_VENDOR_CHELSIO=y
CONFIG_NET_VENDOR_CISCO=y
CONFIG_NET_VENDOR_DEC=y
CONFIG_NET_VENDOR_DLINK=y
CONFIG_NET_VENDOR_EMULEX=y
CONFIG_NET_VENDOR_EXAR=y
CONFIG_NET_VENDOR_HP=y
CONFIG_NET_VENDOR_INTEL=y
CONFIG_NET_VENDOR_I825XX=y
CONFIG_NET_VENDOR_MARVELL=y
CONFIG_NET_VENDOR_MELLANOX=y
CONFIG_NET_VENDOR_MICREL=y
CONFIG_NET_VENDOR_MYRI=y
CONFIG_NET_VENDOR_NATSEMI=y
CONFIG_NET_VENDOR_8390=y
CONFIG_NET_VENDOR_NVIDIA=y
CONFIG_NET_VENDOR_OKI=y
CONFIG_NET_PACKET_ENGINE=y
CONFIG_NET_VENDOR_QLOGIC=y
CONFIG_NET_VENDOR_QUALCOMM=y
CONFIG_NET_VENDOR_REALTEK=y
CONFIG_NET_VENDOR_RDC=y
CONFIG_NET_VENDOR_ROCKER=y
CONFIG_NET_VENDOR_SAMSUNG=y
CONFIG_NET_VENDOR_SEEQ=y
CONFIG_NET_VENDOR_SILAN=y
CONFIG_NET_VENDOR_SIS=y
CONFIG_NET_VENDOR_SMSC=y
CONFIG_NET_VENDOR_STMICRO=y
CONFIG_NET_VENDOR_SUN=y
CONFIG_NET_VENDOR_TEHUTI=y
CONFIG_NET_VENDOR_TI=y
CONFIG_NET_VENDOR_VIA=y
CONFIG_NET_VENDOR_WIZNET=y
CONFIG_WLAN=y
CONFIG_INPUT=y
CONFIG_INPUT_MOUSEDEV=y
CONFIG_INPUT_MOUSEDEV_PSAUX=y
CONFIG_INPUT_MOUSEDEV_SCREEN_X=1024
CONFIG_INPUT_MOUSEDEV_SCREEN_Y=768
CONFIG_INPUT_KEYBOARD=y
CONFIG_KEYBOARD_ATKBD=y
CONFIG_INPUT_MOUSE=y
CONFIG_MOUSE_PS2=y
CONFIG_MOUSE_PS2_ALPS=y
CONFIG_MOUSE_PS2_LOGIPS2PP=y
CONFIG_MOUSE_PS2_SYNAPTICS=y
CONFIG_MOUSE_PS2_CYPRESS=y
CONFIG_MOUSE_PS2_TRACKPOINT=y
CONFIG_MOUSE_PS2_FOCALTECH=y
CONFIG_SERIO=y
CONFIG_ARCH_MIGHT_HAVE_PC_SERIO=y
CONFIG_SERIO_I8042=y
CONFIG_SERIO_SERPORT=y
CONFIG_SERIO_LIBPS2=y
CONFIG_TTY=y
CONFIG_VT=y
CONFIG_CONSOLE_TRANSLATIONS=y
CONFIG_VT_CONSOLE=y
CONFIG_HW_CONSOLE=y
CONFIG_UNIX98_PTYS=y
CONFIG_LEGACY_PTYS=y
CONFIG_LEGACY_PTY_COUNT=256
CONFIG_DEVMEM=y
CONFIG_SERIAL_EARLYCON=y
CONFIG_SERIAL_8250=y
CONFIG_SERIAL_8250_DEPRECATED_OPTIONS=y
CONFIG_SERIAL_8250_PNP=y
CONFIG_SERIAL_8250_CONSOLE=y
CONFIG_SERIAL_8250_PCI=y
CONFIG_SERIAL_8250_NR_UARTS=4
CONFIG_SERIAL_8250_RUNTIME_UARTS=4
CONFIG_SERIAL_CORE=y
CONFIG_SERIAL_CORE_CONSOLE=y
CONFIG_HVC_DRIVER=y
CONFIG_VIRTIO_CONSOLE=y
CONFIG_DEVPORT=y
CONFIG_ARCH_WANT_OPTIONAL_GPIOLIB=y
CONFIG_POWER_SUPPLY=y
CONFIG_THERMAL=y
CONFIG_THERMAL_DEFAULT_GOV_STEP_WISE=y
CONFIG_THERMAL_GOV_STEP_WISE=y
CONFIG_SSB_POSSIBLE=y
CONFIG_BCMA_POSSIBLE=y
CONFIG_VGA_ARB=y
CONFIG_VGA_ARB_MAX_GPUS=16
CONFIG_VGA_CONSOLE=y
CONFIG_DUMMY_CONSOLE=y
CONFIG_DUMMY_CONSOLE_COLUMNS=80
CONFIG_DUMMY_CONSOLE_ROWS=25
CONFIG_HID=y
CONFIG_HID_GENERIC=y
CONFIG_USB_OHCI_LITTLE_ENDIAN=y
CONFIG_RTC_LIB=y
CONFIG_VIRTIO=y
CONFIG_VIRTIO_PCI=y
CONFIG_VIRTIO_PCI_LEGACY=y
CONFIG_CLKEVT_I8253=y
CONFIG_CLKBLD_I8253=y
CONFIG_DCACHE_WORD_ACCESS=y
CONFIG_FS_POSIX_ACL=y
CONFIG_OVERLAY_FS=y
CONFIG_PROC_FS=y
CONFIG_PROC_SYSCTL=y
CONFIG_PROC_PAGE_MONITOR=y
CONFIG_KERNFS=y
CONFIG_SYSFS=y
CONFIG_TMPFS=y
CONFIG_TMPFS_POSIX_ACL=y
CONFIG_TMPFS_XATTR=y
CONFIG_NETWORK_FILESYSTEMS=y
CONFIG_9P_FS=y
CONFIG_9P_FS_SECURITY=y
CONFIG_NLS=y
CONFIG_NLS_DEFAULT="iso8859-1"
CONFIG_TRACE_IRQFLAGS_SUPPORT=y
CONFIG_MESSAGE_LOGLEVEL_DEFAULT=4
CONFIG_DEBUG_INFO=y
CONFIG_GDB_SCRIPTS=y
CONFIG_FRAME_WARN=1024
CONFIG_DEBUG_FS=y
CONFIG_ARCH_WANT_FRAME_POINTERS=y
CONFIG_DEBUG_KERNEL=y
CONFIG_HAVE_DEBUG_KMEMLEAK=y
CONFIG_HAVE_DEBUG_STACKOVERFLOW=y
CONFIG_HAVE_ARCH_KMEMCHECK=y
CONFIG_HAVE_ARCH_KASAN=y
CONFIG_KASAN_SHADOW_OFFSET=0xdffffc0000000000
CONFIG_PANIC_ON_OOPS_VALUE=0
CONFIG_PANIC_TIMEOUT=0
CONFIG_SCHED_DEBUG=y
CONFIG_STACKTRACE=y
CONFIG_ARCH_HAS_DEBUG_STRICT_USER_COPY_CHECKS=y
CONFIG_USER_STACKTRACE_SUPPORT=y
CONFIG_NOP_TRACER=y
CONFIG_HAVE_FUNCTION_TRACER=y
CONFIG_HAVE_FUNCTION_GRAPH_TRACER=y
CONFIG_HAVE_FUNCTION_GRAPH_FP_TEST=y
CONFIG_HAVE_DYNAMIC_FTRACE=y
CONFIG_HAVE_DYNAMIC_FTRACE_WITH_REGS=y
CONFIG_HAVE_FTRACE_MCOUNT_RECORD=y
CONFIG_HAVE_SYSCALL_TRACEPOINTS=y
CONFIG_HAVE_FENTRY=y
CONFIG_HAVE_C_RECORDMCOUNT=y
CONFIG_TRACE_CLOCK=y
CONFIG_RING_BUFFER=y
CONFIG_EVENT_TRACING=y
CONFIG_CONTEXT_SWITCH_TRACER=y
CONFIG_TRACING=y
CONFIG_GENERIC_TRACER=y
CONFIG_TRACING_SUPPORT=y
CONFIG_FTRACE=y
CONFIG_FUNCTION_TRACER=y
CONFIG_FUNCTION_GRAPH_TRACER=y
CONFIG_BRANCH_PROFILE_NONE=y
CONFIG_DYNAMIC_FTRACE=y
CONFIG_DYNAMIC_FTRACE_WITH_REGS=y
CONFIG_FTRACE_MCOUNT_RECORD=y
CONFIG_HAVE_ARCH_KGDB=y
CONFIG_EARLY_PRINTK=y
CONFIG_DOUBLEFAULT=y
CONFIG_HAVE_MMIOTRACE_SUPPORT=y
CONFIG_IO_DELAY_TYPE_0X80=0
CONFIG_IO_DELAY_TYPE_0XED=1
CONFIG_IO_DELAY_TYPE_UDELAY=2
CONFIG_IO_DELAY_TYPE_NONE=3
CONFIG_IO_DELAY_0X80=y
CONFIG_DEFAULT_IO_DELAY_TYPE=0
CONFIG_OPTIMIZE_INLINING=y
CONFIG_DEFAULT_SECURITY_DAC=y
CONFIG_DEFAULT_SECURITY=""
CONFIG_CRYPTO=y
CONFIG_CRYPTO_ALGAPI=y
CONFIG_CRYPTO_ALGAPI2=y
CONFIG_CRYPTO_AEAD2=y
CONFIG_CRYPTO_BLKCIPHER=y
CONFIG_CRYPTO_BLKCIPHER2=y
CONFIG_CRYPTO_HASH=y
CONFIG_CRYPTO_HASH2=y
CONFIG_CRYPTO_RNG=y
CONFIG_CRYPTO_RNG2=y
CONFIG_CRYPTO_PCOMP2=y
CONFIG_CRYPTO_MANAGER=y
CONFIG_CRYPTO_MANAGER2=y
CONFIG_CRYPTO_MANAGER_DISABLE_TESTS=y
CONFIG_CRYPTO_NULL=y
CONFIG_CRYPTO_WORKQUEUE=y
CONFIG_CRYPTO_HMAC=y
CONFIG_CRYPTO_SHA256=y
CONFIG_CRYPTO_AES=y
CONFIG_CRYPTO_ANSI_CPRNG=y
CONFIG_CRYPTO_HW=y
CONFIG_HAVE_KVM=y
CONFIG_VIRTUALIZATION=y
CONFIG_BINARY_PRINTF=y
CONFIG_BITREVERSE=y
CONFIG_RATIONAL=y
CONFIG_GENERIC_STRNCPY_FROM_USER=y
CONFIG_GENERIC_STRNLEN_USER=y
CONFIG_GENERIC_NET_UTILS=y
CONFIG_GENERIC_FIND_FIRST_BIT=y
CONFIG_GENERIC_PCI_IOMAP=y
CONFIG_GENERIC_IOMAP=y
CONFIG_GENERIC_IO=y
CONFIG_ARCH_USE_CMPXCHG_LOCKREF=y
CONFIG_ARCH_HAS_FAST_MULTIPLIER=y
CONFIG_CRC32=y
CONFIG_CRC32_SLICEBY8=y
CONFIG_ZLIB_INFLATE=y
CONFIG_LZO_DECOMPRESS=y
CONFIG_LZ4_DECOMPRESS=y
CONFIG_XZ_DEC=y
CONFIG_XZ_DEC_X86=y
CONFIG_XZ_DEC_POWERPC=y
CONFIG_XZ_DEC_IA64=y
CONFIG_XZ_DEC_ARM=y
CONFIG_XZ_DEC_ARMTHUMB=y
CONFIG_XZ_DEC_SPARC=y
CONFIG_XZ_DEC_BCJ=y
CONFIG_DECOMPRESS_GZIP=y
CONFIG_DECOMPRESS_BZIP2=y
CONFIG_DECOMPRESS_LZMA=y
CONFIG_DECOMPRESS_XZ=y
CONFIG_DECOMPRESS_LZO=y
CONFIG_DECOMPRESS_LZ4=y
CONFIG_HAS_IOMEM=y
CONFIG_HAS_IOPORT_MAP=y
CONFIG_HAS_DMA=y
CONFIG_DQL=y
CONFIG_NLATTR=y
CONFIG_ARCH_HAS_ATOMIC64_DEC_IF_POSITIVE=y
CONFIG_AVERAGE=y
CONFIG_ARCH_HAS_SG_CHAIN=y

Habit is habit, and not to be flung out of the window by any man, but coaxed down-stairs a step at a time. -- Mark Twain, "Pudd'nhead Wilson's Calendar

jpoimboe commented 8 years ago

Looking at the the oops, it appears the faulting instruction was:

c7 03 00 00 00 00       movl   $0x0,(%rbx)

which corresponds to:

mod->state = MODULE_STATE_LIVE;

The invalid mod pointer (rbx) was 0xffffffffa0010cc0.

The surrounding code was:

        if (mod->init != NULL)
                ret = do_one_initcall(mod->init);
        if (ret < 0) {
                goto fail_free_freeinit;
        }
        if (ret > 0) {
                pr_warn("%s: '%s'->init suspiciously returned %d, it should "
                        "follow 0/-E convention\n"
                        "%s: loading module anyway...\n",
                        __func__, mod->name, ret, __func__);
                dump_stack();   
        }

        /* Now it's a first class citizen! */
        mod->state = MODULE_STATE_LIVE;

It's very odd that the mod pointer (0xffffffffa0010cc0) seemed to be valid for the earlier mod->init dereference, but then it was invalid for the mod->state dereference failed. The pointer seems valid, and matches the value from the gdb dump. It almost seems like the call to do_one_initcall() somehow resulted in the mod pointer getting freed.

@vincentbernat are you still able to recreate?

jpoimboe commented 8 years ago

There's one more thing that doesn't make sense to me, from looking at the assembly listing:

        call    do_one_initcall
.LVL1029:
        .loc 1 3077 0
        testl   %eax, %eax
        .loc 1 3076 0
        movl    %eax, %r12d
.LVL1030:
        .loc 1 3077 0
        js      .L1111
        .loc 1 3080 0
        je      .L1110
        .loc 1 3081 0
        movq    $__func__.40217, %r8
        leaq    24(%rbx), %rdx
.LVL1031:
        movl    %eax, %ecx
        movq    %r8, %rsi
        movq    $.LC139, %rdi
        xorl    %eax, %eax
.LVL1032:
        call    printk
.LVL1033:
        .loc 1 3085 0
        call    dump_stack
.LVL1034:
.L1110:
        .loc 1 3090 0
        xorl    %esi, %esi
        .loc 1 3089 0
        movl    $0, (%rbx)

After the call to do_one_initcall(), it does a movl %eax, %r12d, which should put 0 in the lower 32 bits of r12 (which represents the ret variable). Then it jumps to .L1110. But the oops dump shows r12's value as R12: ffff88001eb58638. Which doesn't make any sense.

jpoimboe commented 8 years ago

(that assumes my assembly listing is the same as yours)

vincentbernat commented 8 years ago

Can I try again with a 4.2?

jpoimboe commented 8 years ago

Sure. BTW was this on a VM?

vincentbernat commented 8 years ago

Yes, this is in a VM.

I now have gcc-5.2 on my system and I run into additional difficulties. When I try to run kpatch-build:

Testing patch file
checking file fs/proc/meminfo.c
Reading special section data
ERROR: can't find special struct size.

I could try to use gcc-4.9 but it seems that kpatch-build doesn't honor the CC variable (and there is some other places where gcc seem to be hardcoded when building kpatch).

I am a bit low on time currently. Feel free to postpone/close the issue if things don't add up. It is likely that I screw up at some point.

jpoimboe commented 8 years ago

Gah, sorry. I've made a lot of fixes in the last week and I probably broke Debian. If you have a chance to try, I think reverting commit 170449847136a48b19fcceb19c1d4d257d386b56 will fix your new problem.

jpoimboe commented 8 years ago

You could also try doing bash -x kpatch-build instead of kpatch-build to show exactly what command is failing.

cbay commented 8 years ago

I have the same issue when CONFIG_DEBUG_SET_MODULE_RONX is not set. When it's set, the bug goes away. I'm using livepatch.

Tested on 4.1, 4.2 and 4.3, on Debian 8 (with a custom kernel). By default, Debian has CONFIG_DEBUG_SET_MODULE_RONX=y.

jpoimboe commented 8 years ago

@cbay, thanks for isolating the problem. I should have a fix soon.

jpoimboe commented 8 years ago

@cbay any chance you can test with the following patch?

diff --git a/kmod/core/core.c b/kmod/core/core.c
index 7c16c79..2fa3630 100644
--- a/kmod/core/core.c
+++ b/kmod/core/core.c
@@ -649,6 +649,10 @@ static int kpatch_write_relocations(struct kpatch_module *kpmod,
                        return -EINVAL;
                }

+#ifndef CONFIG_DEBUG_SET_MODULE_RONX
+               readonly = 0;
+#endif
+
                numpages = (PAGE_SIZE - (loc & ~PAGE_MASK) >= size) ? 1 : 2;

                if (readonly)
jpoimboe commented 8 years ago

Oh, you're using livepatch? Guess we need a fix there as well.

jpoimboe commented 8 years ago

(and this patch won't help you)

cbay commented 8 years ago

Yes, I'm using livepatch. If I remember correctly, I did test with the kpatch module and it didn't cause the bug.

jpoimboe commented 8 years ago

Assuming I understand the problem correctly, it affects both kpatch.ko and livepatch, but it's intermittent, and may be dependent on what patch you use as well as how memory is laid out.

It requires a dynamic relocation write being performed on the same page of memory that holds the struct module. After writing the dynamic relocation, we set the page to be read-only, which causes the next write to the module struct "mod->state = MODULE_STATE_LIVE" to fail.

cbay commented 8 years ago

OK. As far as I'm concerned, I was testing the example patch (modifying /proc/meminfo).

Let me know if you want me to test a patch for livepatch.

jpoimboe commented 8 years ago

@cbay can you try this livepatch fix?

diff --git a/arch/x86/kernel/livepatch.c b/arch/x86/kernel/livepatch.c
index ff3c3101d..d9578d5 100644
--- a/arch/x86/kernel/livepatch.c
+++ b/arch/x86/kernel/livepatch.c
@@ -70,10 +70,12 @@ int klp_write_module_reloc(struct module *mod, unsigned long type,
                /* loc does not point to any symbol inside the module */
                return -EINVAL;

+       readonly = false;
+
+#ifdef CONFIG_DEBUG_SET_MODULE_RONX
        if (loc < core + core_ro_size)
                readonly = true;
-       else
-               readonly = false;
+#endif

        /* determine if the relocation spans a page boundary */
        numpages = ((loc & PAGE_MASK) == ((loc + size) & PAGE_MASK)) ? 1 : 2;
jpoimboe commented 8 years ago

@cbay Also, if you care to send me your email address, I can credit you with "Reported-by" and "Tested-by" in the upstream livepatch commit log.

cbay commented 8 years ago

The patch fixes the issue for me. I tested it on Linux 4.3 (the patch didn't apply cleanly, though). Thanks!

My email is cbay at alwaysdata.com.