leosac / access-control

Leosac Access Control - Open Source Physical Access Control System
https://leosac.com
GNU Affero General Public License v3.0
240 stars 40 forks source link

0.6.3 - SEGV when sysfsgpio output default value set to true #100

Closed knight-of-ni closed 6 years ago

knight-of-ni commented 6 years ago

Raspberry Pi Zero W Raspbian Jessie Leosac 0.6.3 installed via deb from the releases page Using SysFS GPIO module

Relevant data is shown below.

Description: Upon adding led & beeper control to leosac, the beeper in the reader immediately went on and the led tuned green. Apparently, these functions are active low. In order to reverse the logic, I attempted to set the default value for these two outputs to 1. When I set a default value to either "1" or "true", leosacd crashes on startup with 11/SEGV.

Unfortunately the log entries shown below are not particularly helpful. Leosac just dies after setting up the gpios. If I set the default value to 0 or false, leosac starts up again.... along with the beeper in the card reader.

I can fire up valgrind to try to get more information if it is not obvious why the seg fault is occurring. Let me know.

OS:

PRETTY_NAME="Raspbian GNU/Linux 8 (jessie)"
NAME="Raspbian GNU/Linux"
VERSION_ID="8"
VERSION="8 (jessie)"
ID=raspbian
ID_LIKE=debian
HOME_URL="http://www.raspbian.org/"
SUPPORT_URL="http://www.raspbian.org/RaspbianForums"
BUG_REPORT_URL="http://www.raspbian.org/RaspbianBugs"

Relevant portion of kernel.xml:

        <module>
            <name>SYSFS_GPIO</name>
            <file>libsysfsgpio.so</file>
            <level>2</level>
            <module_config>            
                <aliases>
                    <default>gpio__NO__</default>
                </aliases>
                <export_path>/sys/class/gpio/export</export_path>
                <unexport_path>/sys/class/gpio/unexport</unexport_path>
                <edge_path>/sys/class/gpio/__PLACEHOLDER__/edge</edge_path>
                <value_path>/sys/class/gpio/__PLACEHOLDER__/value</value_path>
                <direction_path>/sys/class/gpio/__PLACEHOLDER__/direction</direction_path>                
                <gpios>
                    <gpio>
                        <name>ac_granted_relay</name>
                        <no>19</no>
                        <direction>out</direction>
                        <value>false</value>
                    </gpio>
                    <gpio>
                        <name>wiegand_d0</name>
                        <no>22</no>
                        <direction>in</direction>
                        <interrupt_mode>falling</interrupt_mode>
                    </gpio>
                    <gpio>
                        <name>wiegand_d1</name>
                        <no>23</no>
                        <direction>in</direction>
                        <interrupt_mode>falling</interrupt_mode>
                    </gpio>
                    <gpio>
                        <name>reader_led</name>
                        <no>24</no>
                        <direction>out</direction>
                        <value>true</value>
                    </gpio>
                    <gpio>
                        <name>reader_beep</name>
                        <no>25</no>
                        <direction>out</direction>
                        <value>true</value>
                    </gpio>
                </gpios>
            </module_config>
        </module>

Relevant log entries:

    85  Jan 24 07:59:32 pilock systemd[1]: Starting Leosac daemon...
    86  Jan 24 07:59:32 pilock systemd[1]: Started Leosac daemon.
    87  Jan 24 07:59:32 pilock leosac[21181]: Logger not set-up yet ! Will display log message as is.
    88  Jan 24 07:59:32 pilock leosac[21181]: Creating Leosac Kernel...
    89  Jan 24 07:59:32 pilock leosac: [2018-01-24 07:59:32.430] [syslog] [info] Network configuration disabled. Will not touch system setting.
    90  Jan 24 07:59:32 pilock leosac: [2018-01-24 07:59:32.435] [syslog] [debug] Adding {/usr/lib/leosac} in library path
    91  Jan 24 07:59:32 pilock leosac: [2018-01-24 07:59:32.439] [syslog] [debug] Adding {/usr/lib/leosac/auth} in library path
    92  Jan 24 07:59:32 pilock leosac: [2018-01-24 07:59:32.443] [syslog] [info] Attempting to load module nammed SYSFS_GPIO (shared lib file = libsysfsgpio.so)
    93  Jan 24 07:59:32 pilock leosac: [2018-01-24 07:59:32.444] [syslog] [info] Loading library at: /usr/lib/leosac/libsysfsgpio.so
    94  Jan 24 07:59:32 pilock leosac: [2018-01-24 07:59:32.451] [syslog] [debug] library file loaded (not init yet)
    95  Jan 24 07:59:32 pilock leosac: [2018-01-24 07:59:32.452] [syslog] [info] Attempting to load module nammed WIEGAND_READER (shared lib file = libwiegand.so)
    96  Jan 24 07:59:32 pilock leosac: [2018-01-24 07:59:32.452] [syslog] [info] Loading library at: /usr/lib/leosac/libwiegand.so
    97  Jan 24 07:59:32 pilock leosac: [2018-01-24 07:59:32.465] [syslog] [debug] library file loaded (not init yet)
    98  Jan 24 07:59:32 pilock leosac: [2018-01-24 07:59:32.465] [syslog] [info] Attempting to load module nammed AUTH_FILE (shared lib file = libauth-file.so)
    99  Jan 24 07:59:32 pilock leosac: [2018-01-24 07:59:32.466] [syslog] [info] Loading library at: /usr/lib/leosac/auth/libauth-file.so
   100  Jan 24 07:59:32 pilock leosac: [2018-01-24 07:59:32.471] [syslog] [debug] library file loaded (not init yet)
   101  Jan 24 07:59:32 pilock leosac: [2018-01-24 07:59:32.472] [syslog] [info] Attempting to load module nammed DOORMAN (shared lib file = libdoorman.so)
   102  Jan 24 07:59:32 pilock leosac: [2018-01-24 07:59:32.473] [syslog] [info] Loading library at: /usr/lib/leosac/libdoorman.so
   103  Jan 24 07:59:32 pilock leosac: [2018-01-24 07:59:32.493] [syslog] [debug] library file loaded (not init yet)
   104  Jan 24 07:59:32 pilock leosac: [2018-01-24 07:59:32.497] [syslog] [info] SysFsGpio Path Configuration:#012#011Export path: gpio__NO__
   105  Jan 24 07:59:32 pilock leosac: [2018-01-24 07:59:32.498] [syslog] [info] Creating GPIO
   106  Jan 24 07:59:32 pilock leosac: [2018-01-24 07:59:32.500] [syslog] [info] Creating GPIO
   107  Jan 24 07:59:32 pilock leosac: [2018-01-24 07:59:32.502] [syslog] [info] Creating GPIO
   108  Jan 24 07:59:32 pilock leosac: [2018-01-24 07:59:32.504] [syslog] [info] Creating GPIO
   109  Jan 24 07:59:32 pilock systemd[1]: leosacd.service: main process exited, code=killed, status=11/SEGV
   110  Jan 24 07:59:32 pilock systemd[1]: Unit leosacd.service entered failed state.
xaqq commented 6 years ago

Hey,

Yes it would be helpful if you could run it under valgrind (maybe be very slow, especially on pi zero) or gdb. While its possible this is a bug in the code, its possible its a binary compatibility issue.

I have never tried running Leosac on the Pi Zero though, so it might be an other issue.

I know I have recommended people to run 0.6.3 because it has less dependency and is easier to build and run, however it seems its not really the case anymore.

The way forward in the short term is probably one of those:

knight-of-ni commented 6 years ago

I ran valgrind last night, and immediately remembered I don't have the debug symbols. I can't say I'm an expert at reading these, but I'm not sure how useful this info is. Valgrind stopped at an un-recognized instruction in /usr/lib/arm-linux-gnueabihf/libarmmem.so.

That library is part of raspi-copies-and-fills package and that package has an interesting description:

ARM-accelerated versions of selected functions from string.h
 Optimized implementations of memset, memcpy, memcmp and memmove for
 ARM1176JZF-S and Cortex A7

So it's conceivable, to me anyway, that the copy of valgrind I have does not know about these modified instructions. That means option number 2 from the output below is conceivable, and this is a red herring.

Let me know if you've got any other insights into what is going on.

--2688-- REDIR: 0x4dd3b20 (rindex) redirected to 0x4835ef8 (rindex)
--2688-- REDIR: 0x4dd3630 (strlen) redirected to 0x48366ec (strlen)
--2688-- REDIR: 0x4dd4bf0 (bcmp) redirected to 0x483875c (bcmp)
--2688-- REDIR: 0x4dd2f40 (strcmp) redirected to 0x483739c (strcmp)
--2688-- REDIR: 0x4c22398 (operator new(unsigned int, std::nothrow_t const&)) redirected to 0x4834108 (operator new(unsigned int, std::nothrow_t const&))
--2688-- REDIR: 0x4dced98 (malloc) redirected to 0x48338ec (malloc)
--2688-- REDIR: 0x4dcf8d0 (calloc) redirected to 0x48356b8 (calloc)
==2688== Use of uninitialised value of size 4
==2688==    at 0x400AA7C: _dl_lookup_symbol_x (dl-lookup.c:729)
==2688==    by 0x400F9C3: _dl_fixup (dl-runtime.c:111)
==2688==    by 0x401655B: _dl_runtime_resolve (dl-trampoline.S:57)
==2688==    by 0x48A320B: ??? (in /usr/lib/libleosac_lib.so)
==2688==  Uninitialised value was created by a stack allocation
==2688==    at 0x4001BE4: handle_ld_preload (rtld.c:983)
==2688== 
--2688-- REDIR: 0x4c22310 (operator new(unsigned int)) redirected to 0x4833ea8 (operator new(unsigned int))
disInstr(arm): unhandled instruction: 0xF4A174DD
                 cond=15(0xF) 27:20=74(0x4A) 4:4=1 3:0=13(0xD)
==2688== valgrind: Unrecognised instruction at address 0x4844f2c.
==2688==    at 0x4844F2C: ??? (in /usr/lib/arm-linux-gnueabihf/libarmmem.so)
==2688== Your program just tried to execute an instruction that Valgrind
==2688== did not recognise.  There are two possible reasons for this.
==2688== 1. Your program has a bug and erroneously jumped to a non-code
==2688==    location.  If you are running Memcheck and you just saw a
==2688==    warning about a bad jump, it's probably your program's fault.
==2688== 2. The instruction is legitimate but Valgrind doesn't handle it,
==2688==    i.e. it's Valgrind's fault.  If you think this is the case or
==2688==    you are not sure, please let us know and we'll try to fix it.
==2688== Either way, Valgrind will now raise a SIGILL signal which will
==2688== probably kill your program.
==2688== 
==2688== Process terminating with default action of signal 4 (SIGILL)
==2688==  Illegal opcode at address 0x4844F2C
==2688==    at 0x4844F2C: ??? (in /usr/lib/arm-linux-gnueabihf/libarmmem.so)
--2688-- REDIR: 0x4dcf54c (free) redirected to 0x4834a88 (free)
--2688-- REDIR: 0x4dd5380 (memset) redirected to 0x4838c94 (memset)
==2688== Use of uninitialised value of size 4
==2688==    at 0x4E66934: free_mem (in /lib/arm-linux-gnueabihf/libc-2.19.so)
==2688==    by 0x4E663C3: __libc_freeres (in /lib/arm-linux-gnueabihf/libc-2.19.so)
==2688==    by 0x4023633: _vgnU_freeres (vg_preloaded.c:61)
==2688==    by 0x7FFFFFFF: ???
==2688==  Uninitialised value was created by a stack allocation
==2688==    at 0x4001BE4: handle_ld_preload (rtld.c:983)

The way forward in the short term is probably one of those:

  • We can reproduce a clean build on the Pi of develop branch, as you did on Jessie, but hopefully we can do it on Stretch too. Then document this properly.

Yes, totally agree. I've gone back to figuring out how to build the develop branch. I'll get there. It's just been a little bumpy.

knight-of-ni commented 6 years ago

Been bouncing back and forth between 0.6.3 release and the develop branch.

I had a little success today. With a few modifications, I was able to build 0.6.3 release deb and install it (see #100 - would be glad to put that in the wiki if you want). Unfortunately when I try to run it, it crashes in the same manner as the deb hosted on the github releases page, so I think that rules out binary incompatibility.

knight-of-ni commented 6 years ago

After switching to a Raspberry Pi 2 and installing the 0.7.0 pre-release, leosac continues to crash when I try to assign a default value of 1 to any gpio configured as an output, using the sysfsgpio module.

This appears to be easily reproducible. From a working system, open kernel xml and add <value>true</value> or <value>1</value> to the gpio which represents the access granted output. Then restart the service. It will immediately fail with SEGV. At least it does for me.

xaqq commented 6 years ago

Indeed I can reproduce too.

I've identified the issue and will commit a fix later today. If you're in a hurry you can comment this line: https://github.com/leosac/leosac/blob/develop/src/modules/sysfsgpio/SysFSGPIOPin.cpp#L128

Turns out it was not binary compat issue, just bad code :)