Closed pojiro closed 9 months ago
It's here: https://github.com/elixir-circuits/circuits_gpio/releases/tag/v1.2.0.
It had been used in production, so I'm not sure why you're getting that error.
It's here: https://github.com/elixir-circuits/circuits_gpio/releases/tag/v1.2.0.
Thanks, ok. i mistake to read the releases
on Github.
It had been used in production, so I'm not sure why you're getting that error.
I think It's caused by this, https://github.com/elixir-circuits/circuits_gpio/pull/142/files#diff-596f8a2c84f5216978df24afaedf32021989ed575e50d201755750080f0d3903
Defer loading, if open
is called twice or over, load_nif
is also called.
So calling load_nif
over one time may happen error always. But open
gpio each pin, i mean open other pin several time is usuall usage.
The unit tests open pins lots of times. Do you have an example that I could try to reproduce the issue? Circuits.I2C, Circuits.SPI and Circuits.GPIO v2 all use the same pattern for loading the NIF.
Also, in case you're wondering, I deleted the prereleases from GitHub releases. They did make it hard to see v1.2.0 and I wouldn't want anyone using prerelease versions now anyway.
The unit tests open pins lots of times. Do you have an example that I could try to reproduce the issue?
Yes, I will make simple reproduce case.
@fhunleth I made reproducible repo, https://github.com/pojiro/test_gpio .
This repo's firmware for rpi4 can reproduce this issue.
05:23:00.051 [info] {:error, {:reload, ~c"NIF library already loaded (reload disallowed since OTP 20)."}}, %{ref: nil, pin: 26}
05:23:00.052 [info] {:ok, #Reference<0.1163197885.843710481.257840>}, %{ref: nil, pin: 5}
05:23:00.052 [error] |=====> Stopping Elixir.TestGpio.GpioServer GenServer, reason: {{:badmatch, {:error, {:reload, ~c"NIF library already loaded (reload disallowed since OTP 20)."}}}, [{TestGpio.GpioServer, :handle_info, 2, [file: ~c"lib/gpio_server.ex", line: 25]}, {:gen_server, :try_handle_info, 3, [file: ~c"gen_server.erl", line: 1077]}, {:gen_server, :handle_msg, 6, [file: ~c"gen_server.erl", line: 1165]}, {:proc_lib, :init_p_do_apply, 3, [file: ~c"proc_lib.erl", line: 241]}]}
05:23:00.052 [info] {:error, {:reload, ~c"NIF library already loaded (reload disallowed since OTP 20)."}}, %{ref: nil, pin: 19}
05:23:00.052 [error] |=====> Stopping Elixir.TestGpio.GpioServer GenServer, reason: {{:badmatch, {:error, {:reload, ~c"NIF library already loaded (reload disallowed since OTP 20)."}}}, [{TestGpio.GpioServer, :handle_info, 2, [file: ~c"lib/gpio_server.ex", line: 25]}, {:gen_server, :try_handle_info, 3, [file: ~c"gen_server.erl", line: 1077]}, {:gen_server, :handle_msg, 6, [file: ~c"gen_server.erl", line: 1165]}, {:proc_lib, :init_p_do_apply, 3, [file: ~c"proc_lib.erl", line: 241]}]}
05:23:00.053 [info] {:error, {:reload, ~c"NIF library already loaded (reload disallowed since OTP 20)."}}, %{ref: nil, pin: 13}
05:23:00.053 [error] |=====> Stopping Elixir.TestGpio.GpioServer GenServer, reason: {{:badmatch, {:error, {:reload, ~c"NIF library already loaded (reload disallowed since OTP 20)."}}}, [{TestGpio.GpioServer, :handle_info, 2, [file: ~c"lib/gpio_server.ex", line: 25]}, {:gen_server, :try_handle_info, 3, [file: ~c"gen_server.erl", line: 1077]}, {:gen_server, :handle_msg, 6, [file: ~c"gen_server.erl", line: 1165]}, {:proc_lib, :init_p_do_apply, 3, [file: ~c"proc_lib.erl", line: 241]}]}
05:23:00.053 [info] {:error, {:reload, ~c"NIF library already loaded (reload disallowed since OTP 20)."}}, %{ref: nil, pin: 6}
05:23:00.053 [error] |=====> Stopping Elixir.TestGpio.GpioServer GenServer, reason: {{:badmatch, {:error, {:reload, ~c"NIF library already loaded (reload disallowed since OTP 20)."}}}, [{TestGpio.GpioServer, :handle_info, 2, [file: ~c"lib/gpio_server.ex", line: 25]}, {:gen_server, :try_handle_info, 3, [file: ~c"gen_server.erl", line: 1077]}, {:gen_server, :handle_msg, 6, [file: ~c"gen_server.erl", line: 1165]}, {:proc_lib, :init_p_do_apply, 3, [file: ~c"proc_lib.erl", line: 241]}]}
Oh! I understand. That's a race condition, and it's really obvious now that I see your test. I'm able to reproduce. I'll have a fix up soon.
While I fixed the NIF load race condition, the unit tests uncovered race conditions in the NIF. Those are the CI failures in the PR. Luckily, this is the stub implementation which has shared data that's subject to races and the real implementation shouldn't. However, I'm going to take time to review all of the code to fix these races and check for others now that I know what's happening and can easily reproduce. I suspect that the PRs will fix the issues for you. #180 fixes v1.2.0 and #179 fixes v2.0.0.
@pojiro Thank you so much for bringing this to my attention!
@fhunleth
Thanks for your explanation. Now i can understand what happens in race condition, following code was tricky. It changes its behavior between before load and after load. This issue happens when calling open
same time before load, It was interesting.
def open(pin_number, pin_direction, initial_value, pull_mode) do
with :ok <- load_nif() do
apply(__MODULE__, :open, [pin_number, pin_direction, initial_value, pull_mode])
end
end
And I confirmed maint-v1.x-fix-nif-race
resolves this issue, thank you so match!
v1.2.0 is not released on Github, but it is on hex.pm, https://hex.pm/packages/circuits_gpio/versions.
I confirmed the v1.2.0 has a issue on
open
, likeIf not released officially, how about retired it ?