beagleboard / linux

The official Read Only BeagleBoard and BeagleBone kernel repository https://git.beagleboard.org/beagleboard/linux
http://beagleboard.org/source

kernel BUG at /home/joey/git/linux/fs/sysfs/group.c:113 #165

Closed jharvell closed 4 years ago

jharvell commented 6 years ago

Kernel Oops triggered on a Beaglebone Black by the following action:

pip /sys/kernel/debug/dynamic_debug # echo am33xx-bandgap > /sys/devices/platform/bone_capemgr/slots
Segmentation fault

The kernel oops below is from 4.14.37-ti-r46 (https://github.com/beagleboard/linux.git) plus minor unrelated debugging changes that I will detail in a separate comment. The bug is caused by calling sysfs_create_groups with a kobject that has a NULL sysfs directory entry (kobj->sd). I looked at the commit that introduced the call to sysfs_create_groups in April 2015, and it looks like it would hit this bug every time.
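
To make the failure condition concrete, here is a minimal user-space sketch of the precondition described above. It is a model, not the kernel source: the struct layouts are reduced to the fields that matter, `model_create_group` is a hypothetical name, and it returns `-EINVAL` where the 4.14 kernel hits `BUG()`, so the condition can be observed without crashing.

```c
#include <errno.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel types; only the fields needed here. */
struct kernfs_node {
    const char *name;
};

struct kobject {
    const char *name;
    struct kernfs_node *sd; /* sysfs directory entry; NULL if the kobject has none */
};

/*
 * Hypothetical model of the check that trips in internal_create_group().
 * Creating a group (update == 0) requires a kobject with a sysfs directory.
 * The real 4.14 code calls BUG() in this case; the model reports -EINVAL.
 */
int model_create_group(const struct kobject *kobj, int update)
{
    if (!kobj || (!update && !kobj->sd))
        return -EINVAL;
    return 0; /* the real function would go on to create the attribute files */
}
```

Calling `model_create_group` with a kobject whose `sd` is NULL and `update` set to 0 yields `-EINVAL`, which corresponds to the path that oopses in the trace below.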

The overlay I am adding is very simple, and I will provide details on that in another comment also.

[  159.387919] bone_capemgr bone_capemgr: part_number 'am33xx-bandgap', version 'N/A'
[  159.396146] bone_capemgr bone_capemgr: slot #4: override
[  159.405955] bone_capemgr bone_capemgr: Using override eeprom data at slot 4
[  159.417353] bone_capemgr bone_capemgr: slot #4: 'Override Board Name,00A0,Override Manuf,am33xx-bandgap'
[  159.433324] of_fill_overlay_info:458: OF: overlay: Failed to find child with name __overlay__ for node __fixups__
[  159.442705] ------------[ cut here ]------------
[  159.447377] kernel BUG at /home/joey/git/linux/fs/sysfs/group.c:113!
[  159.453760] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP ARM
[  159.459621] Modules linked in: nls_ascii nls_cp437 evdev uio_pdrv_genirq uio sch_fq_codel
[  159.467870] CPU: 0 PID: 225 Comm: bash Not tainted 4.14.37+ #5
[  159.473728] Hardware name: Generic AM33XX (Flattened Device Tree)
[  159.479850] task: db56de00 task.stack: db2ac000
[  159.484415] PC is at internal_create_group+0x2e4/0x30c
[  159.489577] LR is at sysfs_create_groups+0x54/0x90
[  159.494387] pc : [<c03940ac>]    lr : [<c0394514>]    psr: 60010113
[  159.500680] sp : db2adce8  ip : db2add30  fp : db2add2c
[  159.505925] r10: 00000004  r9 : c11444fc  r8 : db49bc20
[  159.511171] r7 : 00000001  r6 : db62db50  r5 : 00000000  r4 : 00000000
[  159.517726] r3 : 4c42a061  r2 : db62db50  r1 : 00000000  r0 : db49bc20
[  159.524284] Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
[  159.531450] Control: 10c5387d  Table: 96870019  DAC: 00000051
[  159.537220] Process bash (pid: 225, stack limit = 0xdb2ac218)
[  159.542991] Stack: (0xdb2adce8 to 0xdb2ae000)
[  159.547371] dce0:                   c0b95274 c0b909e8 db776980 4c42a061 db776c40 00000000
[  159.555588] dd00: db776c40 00000000 db776180 db49bc20 00000001 db776180 c11444fc 00000004
[  159.563804] dd20: db2add54 db2add30 c0394514 c0393dd4 db49bc00 df95de28 00000044 00000001
[  159.572020] dd40: c0b9a088 c11444fc db2addec db2add58 c0b9b328 c03944cc 00000000 014000c0
[  159.580237] dd60: 00000000 c0b99ec4 db2add94 db2add78 db49bc04 c15d9a30 db49bc20 600b0013
[  159.588452] dd80: df95de28 c1504dc8 00000000 db49bc18 db2addac c11cc144 00000000 c11cc1b4
[  159.596669] dda0: 00000008 00000000 db537800 00000008 db57e010 0000007f 00000070 4c42a061
[  159.604886] ddc0: 00000000 db57e010 0000007f dc5cc340 c1504dc8 00000000 e04c3064 db57e25b
[  159.613103] dde0: db2addfc db2addf0 c0b9b5bc c0b9aa3c db2ade74 db2ade00 c097789c c0b9b5a8
[  159.621320] de00: 00000000 db2ade10 c03940f4 c0978900 db2ade74 db2ade20 c0978900 c0dbe978
[  159.629536] de20: ffffffea dc23e410 db57e25b dc5cde10 db537800 dc23e410 dc23e410 00000004
[  159.637753] de40: c0946478 4c42a061 db2ade70 db776b40 c1504dc8 dc5cde10 db776880 dc23e410
[  159.645970] de60: 00000000 db57e010 db2adeb4 db2ade78 c0978c2c c0977648 00000000 db2ade88
[  159.654187] de80: c016ead4 4c42a061 c1504dc8 c0978ad8 db49b980 00000000 00000000 db2adf68
[  159.662404] dea0: db776880 db49b990 db2adecc db2adeb8 c09448c4 c0978ae4 c094489c db49b980
[  159.670621] dec0: db2adee4 db2aded0 c0392d64 c09448a8 0000000f db49b980 db2adf1c db2adee8
[  159.678837] dee0: c0392458 c0392d28 00000000 00000000 dc68dd88 c0392360 db62e540 01dd4468
[  159.687053] df00: db2adf68 00000000 01dd4468 0000000f db2adf34 db2adf20 c0306244 c039236c
[  159.695271] df20: 0000000f db62e540 db2adf64 db2adf38 c0306428 c0306228 db62e540 c0328e00
[  159.703487] df40: c1504dc8 db62e540 00000000 00000000 db62e540 01dd4468 db2adfa4 db2adf68
[  159.711704] df60: c0306690 c0306380 00000000 00000000 c0328cc0 4c42a061 00000001 0000000f
[  159.719921] df80: b6f0fa60 b6e4adb0 00000004 c01096e4 db2ac000 00000000 00000000 db2adfa8
[  159.728137] dfa0: c0109500 c0306640 0000000f b6f0fa60 00000001 01dd4468 0000000f c8e05600
[  159.736353] dfc0: 0000000f b6f0fa60 b6e4adb0 00000004 0000000f b6f0fa60 b6e487a8 004dafe8
[  159.744569] dfe0: 00000000 bebb1248 00000150 b6dc5368 600b0010 00000001 00000000 00000000
[  159.752799] [<c03940ac>] (internal_create_group) from [<c0394514>] (sysfs_create_groups+0x54/0x90)
[  159.761809] [<c0394514>] (sysfs_create_groups) from [<c0b9b328>] (__of_overlay_create+0x8f8/0xb6c)
[  159.770812] [<c0b9b328>] (__of_overlay_create) from [<c0b9b5bc>] (of_overlay_create+0x20/0x24)
[  159.779472] [<c0b9b5bc>] (of_overlay_create) from [<c097789c>] (capemgr_load_slot+0x260/0x5d4)
[  159.788128] [<c097789c>] (capemgr_load_slot) from [<c0978c2c>] (slots_store+0x154/0x328)
[  159.796272] [<c0978c2c>] (slots_store) from [<c09448c4>] (dev_attr_store+0x28/0x34)
[  159.803979] [<c09448c4>] (dev_attr_store) from [<c0392d64>] (sysfs_kf_write+0x48/0x54)
[  159.811936] [<c0392d64>] (sysfs_kf_write) from [<c0392458>] (kernfs_fop_write+0xf8/0x1dc)
[  159.820156] [<c0392458>] (kernfs_fop_write) from [<c0306244>] (__vfs_write+0x28/0x48)
[  159.828025] [<c0306244>] (__vfs_write) from [<c0306428>] (vfs_write+0xb4/0x1c0)
[  159.835369] [<c0306428>] (vfs_write) from [<c0306690>] (SyS_write+0x5c/0xbc)
[  159.842460] [<c0306690>] (SyS_write) from [<c0109500>] (ret_fast_syscall+0x0/0x54)
[  159.850071] Code: e5941000 eaffff84 e3550000 1affff55 (e7f001f2) 
[  159.856198] ---[ end trace f5778303a2d4689e ]---

jharvell commented 6 years ago

Kernel details:

https://github.com/beagleboard/linux.git

commit 17ceea78d1fc0c6883297ba3d261b5d6f3c976b3 (tag: 4.14.37-ti-r46, bb/4.14)
Author: Robert Nelson <robertcnelson@gmail.com>
Date:   Thu Apr 26 15:37:56 2018 -0500

    4.14.37-ti-r46 bb.org_defconfig (plus debugging changes)

    4.14 TI Delta: https://github.com/RobertCNelson/ti-linux-kernel/compare/19d05d51d321cf13295e8a75c2ab3a62ec91220f...ed7732d6de03523b123073dedf6aaf89733d8d96

    Signed-off-by: Robert Nelson <robertcnelson@gmail.com>

debugging changes:

jharvell@wolfhound of$ git --no-pager diff 4.14.37-ti-r46
diff --git a/drivers/of/overlay.c b/drivers/of/overlay.c
index 7f8cc5c1f426..2a111ea24f8a 100644
--- a/drivers/of/overlay.c
+++ b/drivers/of/overlay.c
@@ -454,7 +454,10 @@ static int of_fill_overlay_info(struct of_overlay *ov,
 {
        ovinfo->overlay = of_get_child_by_name(info_node, "__overlay__");
        if (ovinfo->overlay == NULL)
+       {
+               pr_debug("Failed to find child with name __overlay__ for node %s\n", info_node->name);
                goto err_fail;
+       }

        ovinfo->target = find_target_node(ov, info_node, ov->target_index);
        if (ovinfo->target == NULL)
@@ -565,7 +568,10 @@ static int of_build_overlay_info(struct of_overlay *ov,
                ovinfo->attrs[1] = NULL;

                /* NOTE: direct reference to the full_name */
-               ovinfo->attr_group.name = kbasename(ovinfo->info->full_name);
+               if(ovinfo->info == NULL)
+                       pr_debug("No info found for child node %d of %d for node %s\n", i, cnt, tree->name);
+               else
+                       ovinfo->attr_group.name = kbasename(ovinfo->info->full_name);
                ovinfo->attr_group.attrs = ovinfo->attrs;

        }
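
The first debugging change logs when a child of the overlay root has no `__overlay__` child, and the oops log shows it firing for `__fixups__`. An overlay compiled as a `/plugin/` carries such bookkeeping nodes next to its fragments. The sketch below models only that distinction, in user space; `struct node`, `child_by_name`, and `count_fragments` are hypothetical names for illustration, not kernel APIs.

```c
#include <stddef.h>
#include <string.h>

/* Minimal stand-in for a device tree node: a name plus an array of children. */
struct node {
    const char *name;
    const struct node *children;
    size_t nchildren;
};

/* Return the child called `name`, or NULL if the parent has no such child. */
const struct node *child_by_name(const struct node *parent, const char *name)
{
    for (size_t i = 0; i < parent->nchildren; i++)
        if (strcmp(parent->children[i].name, name) == 0)
            return &parent->children[i];
    return NULL;
}

/* Count the root children that carry an __overlay__ child, i.e. real fragments. */
size_t count_fragments(const struct node *root)
{
    size_t n = 0;

    for (size_t i = 0; i < root->nchildren; i++)
        if (child_by_name(&root->children[i], "__overlay__"))
            n++;
    return n;
}
```

For a root with the children `fragment@0` (which has `__overlay__`) and `__fixups__` (which does not), `count_fragments` returns 1, even though the root has two children.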

jharvell commented 6 years ago

Overlay sources below.

I built them by putting the source file in the src/arm directory of bb.org-overlays (git@github.com:beagleboard/bb.org-overlays.git) and running make in the top level directory of the repository.

bb.org-overlays version:

commit 604c0926a4f7505dfc3d501301413c821e59febe (origin/master, origin/HEAD, master)
Author: Robert Nelson <robertcnelson@gmail.com>
Date:   Tue Apr 24 08:41:58 2018 -0500

    universal rewrite: spi0-xyz and spidev1x

    Signed-off-by: Robert Nelson <robertcnelson@gmail.com>

The new overlay that I was loading to elicit the crash:

joey@akita arm$ cat am33xx-bandgap-00A0.dts
/*
 * Enumerate the coarse builtin bandgap sensor on the AM33xx.
 * This sensor can measure temperatures between -50 and 150 deg Celsius.
 * But due to the 8-bit ADC, the resolution is 10.5 deg Celsius.
 * Refer to the section "Measuring Case Temperature" at
 * http://processors.wiki.ti.com/index.php/AM335x_Thermal_Considerations#Measuring_Board_Temperature
 * and also https://e2e.ti.com/support/arm/sitara_arm/f/791/t/393235?tisearch=e2e-quicksearch&keymatch=ti,am335x-bandgap
 *
 * It looks like this sensor was removed from the official device trees due
 * to the low resolution.
 */
/dts-v1/;
/plugin/;

/{
        compatible = "ti,am33xx", "ti,beaglebone", "ti,beaglebone-black", "ti,beaglebone-green", "ti,am335x-bone-black", "ti,am335x-bone", "ti,am335x-bone-green-wireless", "ti,am335x-bone-green";

        fragment@0
        {
                target = <&ocp>;
                __overlay__
                {
                        #address-cells = <1>;
                        #size-cells = <1>;

                        bandgap@44e10448
                        {
                                compatible = "ti,am335x-bandgap";
                                reg = <0x44e10448 0x8>;
                        };
                };
        };
};

jharvell commented 6 years ago

When I triggered the bug, I had enabled dynamic debugging for the overlay.c file as follows:

pip /sys/kernel/debug/dynamic_debug # echo "file overlay.c +plf" > control
pip /sys/kernel/debug/dynamic_debug # echo am33xx-bandgap > /sys/devices/platform/bone_capemgr/slots
Segmentation fault

jharvell commented 6 years ago

This is the commit that introduced the call to sysfs_create_groups:

commit 6ac054d71b6fd5af7e51a90b9a910f2205604d4d
Author: Pantelis Antoniou <pantelis.antoniou@konsulko.com>
Date:   Thu Apr 23 19:02:16 2015 +0300

    of: overlay: add per overlay sysfs attributes

    * A per overlay can_remove sysfs attribute that reports whether
    the overlay can be removed or not due to another overlapping overlay.

    * A target sysfs attribute listing the target of each fragment,
    in a group named after the name of the fragment.

    Signed-off-by: Pantelis Antoniou <pantelis.antoniou@konsulko.com>
    Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

jharvell commented 6 years ago

Confirmed: the issue still exists after rebasing to 4.14.39-ti-r47.

RobertCNelson commented 6 years ago

Yeah, I missed your patch when working through my sick backlog from last week.

By the way, does it work as a u-boot overlay?

v4.14.x is the end of our "/sys/devices/platform/bone_capemgr/slots" technique: too many kernel bugs and no maintainers to assist. So everything is done in u-boot going forward.

Regards,

jharvell commented 6 years ago

Yeah, I had a feeling I was the only one trying to make this work :) I'll start investigating the u-boot method instead. Thanks.

pdp7 commented 4 years ago

@jharvell @RobertCNelson any feedback on whether this works with a u-boot overlay?

If not, I can try it with our current Debian image.

pdp7 commented 4 years ago

Please re-open if this is still an issue with u-boot overlay: https://elinux.org/Beagleboard:BeagleBoneBlack_Debian#U-Boot_Overlays

You may also be interested in the Debian images and kernel builds that we are currently testing for the next release: https://elinux.org/Beagleboard:Latest-images-testing