nerves-project / nerves_system_bbb

Base Nerves system configuration for the BeagleBone-based boards
Apache License 2.0

GPIO assignment scrambled in v2.15.0 #266

Closed insasec closed 2 years ago

insasec commented 2 years ago

### Environment

* Elixir 1.13.4 (compiled with Erlang/OTP 25)

* Nerves environment: (`mix nerves.env --info`) **(This is from the working `v2.14.0` based system!)**

==> nerves
==> gpio
|nerves_bootstrap| Environment Package List

Pkg: nerves_system_bbb
Vsn: 2.14.0
Type: system
BuildRunner: {Nerves.Artifact.BuildRunners.Local, [make_args: ["source", "all", "legal-info"]]}

Pkg: nerves_system_br
Vsn: 1.19.0
Type: system_platform
BuildRunner: {nil, []}

Pkg: nerves_toolchain_armv7_nerves_linux_gnueabihf
Vsn: 1.5.0
Type: toolchain
BuildRunner: {Nerves.Artifact.BuildRunners.Local, []}

Pkg: nerves_toolchain_ctng
Vsn: 1.8.5
Type: toolchain_platform
BuildRunner: {nil, []}

|nerves_bootstrap| Loadpaths Start

Nerves environment
MIX_TARGET: bbb
MIX_ENV: dev

|nerves_bootstrap| Environment Variable List
target: bbb
toolchain: /home/udos/.nerves/artifacts/nerves_toolchain_armv7_nerves_linux_gnueabihf-linux_x86_64-1.5.0
system: /home/udos/.nerves/artifacts/nerves_system_bbb-portable-2.14.0
app: .

|nerves_bootstrap| Loadpaths End

* Additional information about your host, target hardware or environment that
  may help
  - Host System is Ubuntu 22. Erlang/Elixir is installed/built via `asdf`
  - Target is a BeagleBone Black

### Current behavior

The assignment of logical GPIO pin numbers to the actual hardware is scrambled in `v2.15.0`. E.g. GPIO 49 on a BBB *should* be on Pin Header 9, Pin 23, i.e. its `label` in sysfs (`/sys/class/gpio/gpio49/label`) should equal `P9_23`. However, in `v2.15.0` it's `P8_34`. And it's not just the label: trying to use the GPIO for output doesn't work either (no LED in my case).

### Expected behavior

Version `v2.14.0` does show the expected behavior. I.e. `/sys/class/gpio/gpio49/label` yields `P9_23`.

### How to replicate

I created a new project using the stock system (i.e. no custom system): https://github.com/insasec/nerves_scrambled_gpio

The project contains a simple CLI test which prints out and compares the actual and expected label of GPIO 49: https://github.com/insasec/nerves_scrambled_gpio/blob/078f032cb18e1344cf26f6df4556eece19b11838/lib/gpio.ex#L6

The commit using the stock `v2.15.0` system (https://github.com/insasec/nerves_scrambled_gpio/tree/scrambled-gpio) shows the scrambled GPIO assignment with the simple test above:

iex(1)> Gpio.test_gpio()
FAIL - Actual label (P8_34) does not match expected label (P9_23) for GPIO 49


The commit using a "hardcoded" `v2.14.0` system in `mix.exs` (https://github.com/insasec/nerves_scrambled_gpio/tree/fixed-gpio) does show the expected behaviour:

iex(1)> Gpio.test_gpio()
PASS - Actual label (P9_23) matches expected label (P9_23) for GPIO 49
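For readers who want to reproduce the check outside the linked project, here is a minimal Python sketch of the same comparison. It is not the Elixir code from the repository: the function names `read_gpio_label` and `check_gpio_label` are made up for illustration, and it assumes the GPIO is already exported and that the kernel exposes a per-GPIO `label` file at the path quoted in the issue.

```python
from pathlib import Path


def read_gpio_label(gpio: int, sysfs_root: str = "/sys/class/gpio") -> str:
    """Return the label sysfs reports for an already-exported GPIO.

    Assumes the `gpio<N>/label` file from the issue exists on the target.
    """
    return (Path(sysfs_root) / f"gpio{gpio}" / "label").read_text().strip()


def check_gpio_label(gpio: int, expected: str,
                     sysfs_root: str = "/sys/class/gpio") -> str:
    """Compare the actual label with the expected one and describe the result."""
    actual = read_gpio_label(gpio, sysfs_root)
    if actual == expected:
        return (f"PASS - Actual label ({actual}) matches "
                f"expected label ({expected}) for GPIO {gpio}")
    return (f"FAIL - Actual label ({actual}) does not match "
            f"expected label ({expected}) for GPIO {gpio}")
```

On the target, `print(check_gpio_label(49, "P9_23"))` would run the comparison for the pin discussed here; the `sysfs_root` parameter only exists so the function can be pointed at a fake directory tree on a host machine.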

fhunleth commented 2 years ago

The BBB has 4 banks of GPIOs. Prior to Linux 5.15, they were mapped like this:

iex(2)> cmd("ls -las /sys/class/gpio/gpiochip*")
     0 lrwxrwxrwx    1 root     root             0 Jun 30 15:57 /sys/class/gpio/gpiochip96 -> ../../devices/platform/ocp/48000000.interconnect/48000000.interconnect:segment@100000/481ae000.target-module/481ae000.gpio/gpio/gpiochip96
     0 lrwxrwxrwx    1 root     root             0 Jun 30 15:57 /sys/class/gpio/gpiochip64 -> ../../devices/platform/ocp/48000000.interconnect/48000000.interconnect:segment@100000/481ac000.target-module/481ac000.gpio/gpio/gpiochip64
     0 lrwxrwxrwx    1 root     root             0 Jun 30 15:57 /sys/class/gpio/gpiochip32 -> ../../devices/platform/ocp/48000000.interconnect/48000000.interconnect:segment@0/4804c000.target-module/4804c000.gpio/gpio/gpiochip32
     0 lrwxrwxrwx    1 root     root             0 Jun 30 15:57 /sys/class/gpio/gpiochip0 -> ../../devices/platform/ocp/44c00000.interconnect/44c00000.interconnect:segment@200000/44e07000.target-module/44e07000.gpio/gpio/gpiochip0

On Linux 5.15, it's like this:

iex(12)> cmd("ls -las /sys/class/gpio/gpiochip*")
     0 lrwxrwxrwx    1 root     root             0 Jul 25 00:00 /sys/class/gpio/gpiochip96 -> ../../devices/platform/ocp/44c00000.interconnect/44c00000.interconnect:segment@200000/44e07000.target-module/44e07000.gpio/gpio/gpiochip96
     0 lrwxrwxrwx    1 root     root             0 Jul 25 00:00 /sys/class/gpio/gpiochip64 -> ../../devices/platform/ocp/48000000.interconnect/48000000.interconnect:segment@100000/481ae000.target-module/481ae000.gpio/gpio/gpiochip64
     0 lrwxrwxrwx    1 root     root             0 Jul 25 00:00 /sys/class/gpio/gpiochip32 -> ../../devices/platform/ocp/48000000.interconnect/48000000.interconnect:segment@100000/481ac000.target-module/481ac000.gpio/gpio/gpiochip32
     0 lrwxrwxrwx    1 root     root             0 Jul 25 00:00 /sys/class/gpio/gpiochip0 -> ../../devices/platform/ocp/48000000.interconnect/48000000.interconnect:segment@0/4804c000.target-module/4804c000.gpio/gpio/gpiochip0
0

If you look hard enough, you'll see that they all rotated: gpiochip0 is now gpiochip96, gpiochip32 is now gpiochip0, etc.
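To make the rotation concrete, here is a small Python sketch that renumbers a GPIO between the two layouts. The base-to-controller tables are copied from the two listings above; the helper names (`split`, `renumber`) and the constant `LINES_PER_BANK` are made up for illustration, and the sketch assumes each bank spans 32 numbers, as the gpiochip bases suggest.

```python
# Base number of each gpiochip -> controller address, copied from the two
# `ls -las /sys/class/gpio/gpiochip*` listings in this comment.
BEFORE_5_15 = {0: "44e07000", 32: "4804c000", 64: "481ac000", 96: "481ae000"}
ON_5_15 = {0: "4804c000", 32: "481ac000", 64: "481ae000", 96: "44e07000"}

LINES_PER_BANK = 32  # assumption: the bases are 32 apart in both listings


def split(gpio, layout):
    """Split a global GPIO number into (controller address, line offset)."""
    base = (gpio // LINES_PER_BANK) * LINES_PER_BANK
    return layout[base], gpio - base


def renumber(gpio, src, dst):
    """Return the number that layout `dst` gives to the physical line
    that layout `src` calls `gpio`."""
    controller, offset = split(gpio, src)
    base_by_controller = {addr: base for base, addr in dst.items()}
    return base_by_controller[controller] + offset


# The physical line that used to be GPIO 49 is GPIO 17 on Linux 5.15.
print(renumber(49, BEFORE_5_15, ON_5_15))  # prints 17
# The line that answers to GPIO 49 on Linux 5.15 used to be GPIO 81.
print(renumber(49, ON_5_15, BEFORE_5_15))  # prints 81
```

The second result fits the report: as far as I know, GPIO 81 is the number older kernels use for P8_34, which is the label that showed up under GPIO 49 in `v2.15.0`.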

I have no clue why yet. It could be someone trying to tell us to stop using /sys/class/gpio. I don't see any device tree changes that would do this.

insasec commented 2 years ago

It seems that Linux 5.15 really came through with the deprecation of the sysfs based GPIO access mentioned here: https://www.kernel.org/doc/Documentation/ABI/obsolete/sysfs-gpio

The "new" way of dealing with GPIOs is through character devices in /dev. At least they are available in the stock v2.15.0 system.

iex(3)> cmd "ls /dev/gpiochip*"
/dev/gpiochip3
/dev/gpiochip2
/dev/gpiochip1
/dev/gpiochip0

One might even consider that this is not an issue in the BBB system but something that should be solved in Circuits.GPIO. However, from memory, I seem to remember that Circuits is using sysfs by default, except on RasPis.

fhunleth commented 2 years ago

I experimented with cdev and the "gpiochips" are rotated in it as well. For devices that label the GPIOs well, this could be figured out, but not all Beaglebones do that. The Beaglebone Green device trees, for example, have lots of unlabelled GPIOs. Even if they were labelled, I felt like I could only undo the "gpiochip" rotation if I inspected the device tree in Circuits.GPIO and that seems like a lot of work. I also don't know if the code that I would write would end up being BBB-specific and mess things up elsewhere. I really think this rotation of gpiochips was an accident.
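The label-based approach mentioned above can be sketched as follows, in Python rather than in Circuits.GPIO's own code. Everything here is hypothetical: the `Line` type, the `find_by_label` helper, and the sample chip names, offsets, and labels are invented for illustration and are not read from a real board. The sketch also shows why unlabelled lines are a problem: they simply cannot be found this way.

```python
from typing import Iterable, NamedTuple, Optional


class Line(NamedTuple):
    chip: str              # e.g. a /dev/gpiochipN name (hypothetical values below)
    offset: int            # line offset within that chip
    label: Optional[str]   # None when the device tree leaves the line unlabelled


def find_by_label(lines: Iterable[Line], label: str) -> Line:
    """Return the single line carrying `label`, whatever chip it sits on."""
    matches = [line for line in lines if line.label == label]
    if len(matches) != 1:
        raise LookupError(f"expected exactly one line labelled {label!r}, "
                          f"found {len(matches)}")
    return matches[0]


# Invented sample data, only to exercise the helper.
sample = [
    Line("gpiochip0", 17, "P9_23"),
    Line("gpiochip1", 17, "P8_34"),
    Line("gpiochip2", 3, None),  # an unlabelled line cannot be looked up
]
print(find_by_label(sample, "P9_23"))
```

A lookup like this does not care which number a gpiochip ends up with, but it only works for lines the device tree actually labels.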

Long story short, I looked at the Beaglebone Debian sources and they seem to still point to Linux 5.10. It seems safest to just revert back to 5.10, so that's what I did. Texas Instruments still hasn't officially switched past Linux 5.10 either.

insasec commented 2 years ago

> I experimented with cdev and the "gpiochips" are rotated in it as well.

What a bummer. I just started building a custom system with libgpiod enabled in BR. I'll paste the results/comparisons here for reference.

AFAIK libgpiod is supposed to be the new "standard" way - unless you access the cdevs directly. And if this is bogus as well then at least it's not just our problem.

> I really think this rotation of gpiochips was an accident.

Definitely looks like it. I'm just wondering why nobody else is complaining. The regular BBB users wouldn't notice because Ångström Linux is still on an older kernel. But even the Yocto community is silent on this ...

> Long story short, I looked at the Beaglebone Debian sources and they seem to still point to Linux 5.10. It seems safest to just revert back to 5.10, so that's what I did. Texas Instruments still hasn't officially switched past Linux 5.10 either.

I think that's really the best until we see if it was an accident and/or others start to notice/complain as well.

Sorry again for not bringing this up earlier: I still had this "main is unstable - there can be bugs in there - use a released version" mindset.

insasec commented 2 years ago

So I got the very same results using libgpiod.

For reference:

The v2.14.0 output: https://github.com/insasec/nerves_scrambled_gpio/blob/libgpio-v2.14.0/test_result_v2.14.0.txt

The v2.15.0 output: https://github.com/insasec/nerves_scrambled_gpio/blob/libgpio-v2.15.0/test_result_v2.15.0.txt

Simple Diff: https://gist.github.com/insasec/99c84790698bb2ef7f939491a0588f94/revisions?diff=split

I think it's clear now that this is not a "subtle" reminder that sysfs support for GPIO is deprecated. It looks more like something went wrong in the kernel by accident.