Open TRIA opened 3 years ago
Note: if the IPCM is killed with SIGKILL while IPCPs are still running, those IPCPs are inherited by init and no longer show up in 'ps a'. Unless they are killed as well, they cause trouble, such as blocking the unloading of the kernel modules.
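To make the leftover IPCPs easier to find, here is a minimal sketch that scans /proc for processes whose parent is PID 1. It is not part of the stack; the function names are made up for illustration, and the "ipcp" name fragment is an assumption about how the IPCP daemons appear in the process list. On systems where a subreaper adopts orphans, the parent may not be PID 1, so treat this as a starting point only.

```python
import os


def parse_stat(stat_text):
    """Return (pid, comm, ppid) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so locate it via the first '(' and the last ')' instead of splitting.
    open_idx = stat_text.index("(")
    close_idx = stat_text.rindex(")")
    pid = int(stat_text[:open_idx])
    comm = stat_text[open_idx + 1:close_idx]
    fields = stat_text[close_idx + 1:].split()
    ppid = int(fields[1])  # fields[0] is the state, fields[1] is the ppid
    return pid, comm, ppid


def find_orphans(name_fragment="ipcp", proc_root="/proc"):
    """List (pid, comm) for processes re-parented to PID 1.

    name_fragment is an assumed substring of the IPCP process name.
    """
    orphans = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, ppid = parse_stat(f.read())
        except (OSError, ValueError):
            continue  # process exited while scanning, or unreadable entry
        if ppid == 1 and name_fragment in comm:
            orphans.append((pid, comm))
    return sorted(orphans)
```

Calling `find_orphans()` after a SIGKILL of the IPCM should list any IPCPs that still need to be killed before the kernel modules can be unloaded.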
(Updated information attached.)
This problem is likely related to problems seen on the latest Raspberry Pi Raspbian. The Raspberry Pi uses the same configuration files as an earlier, working x86 system running the same stack (except for the normal.DIF ipcp process name and the ethernet port number), yet it does not properly assign the normal.DIF ipcp to the lan.DIF and subsequently does not proceed to n1difPeerDiscovery. This may be what causes the IPCM to hang on exit.
Problem occurs on latest stack, system Raspberry Pi 4. System information:
srb@razzy4:~ $ uname -a
Linux razzy4 5.10.52-v7l+ #1441 SMP Tue Aug 3 18:11:56 BST 2021 armv7l GNU/Linux
srb@razzy4:~ $ cat /etc/debian_version
10.10
This machine has failed to properly assign the normal.DIF ipcp to the lan.DIF. Output in IPCM log:
1164(1629821308)#ipcm (INFO)[create_ipcp]: IPC process lan.ipcp:1:: created and waiting for initialization[id = 1]
1164(1629821308)#ipcm.ipcp (INFO)[ipc_process_create_response_event_handler]: IPC process kernel components of [id = 1] created
1164(1629821308)#ipcm (INFO)[assign_to_dif]: Requested DIF assignment of IPC process lan.ipcp:1:: to DIF lan.DIF:::
1164(1629821308)#ipcm.ipcp (INFO)[assign_to_dif_response_event_handler]: DIF assignment operation completed for IPC process lan.ipcp:1:: [success=1]
1164(1629821308)#ipcm (INFO)[create_ipcp]: IPC process razzy4:1:: created and waiting for initialization[id = 2]
1164(1629821308)#ipcm.ipcp (INFO)[ipc_process_create_response_event_handler]: IPC process kernel components of [id = 2] created
1164(1629821308)#ipcm.ipcp (INFO)[ipc_process_daemon_initialized_event_handler]: IPC process daemon initialized [id = 2]
(ipcp 2 is never assigned to lan.DIF; it remains in INITIALIZED state)
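The missing assignment can be checked mechanically. The sketch below is a hypothetical helper, not part of the stack: it scans an IPCM log for IPCPs that were created but for which no DIF assignment was ever requested. The regular expressions are derived only from the log lines quoted above, so they may need adjusting for other message formats.

```python
import re

# Patterns inferred from the log excerpt in this report (an assumption,
# not an official log format specification).
CREATED = re.compile(
    r"\[create_ipcp\]: IPC process (\S+) created and waiting "
    r"for initialization\[id = (\d+)\]"
)
ASSIGN_REQUESTED = re.compile(
    r"\[assign_to_dif\]: Requested DIF assignment of IPC process (\S+) to DIF (\S+)"
)


def unassigned_ipcps(log_lines):
    """Return [(name, id)] for IPCPs created but never assigned to a DIF."""
    created = {}       # IPCP name -> numeric id, in creation order
    requested = set()  # IPCP names with a DIF assignment request
    for line in log_lines:
        m = CREATED.search(line)
        if m:
            created[m.group(1)] = int(m.group(2))
            continue
        m = ASSIGN_REQUESTED.search(line)
        if m:
            requested.add(m.group(1))
    return [(name, ipcp_id) for name, ipcp_id in created.items()
            if name not in requested]
```

Run against the excerpt above, this reports only razzy4:1:: (id 2), which matches the observation that ipcp 2 stays in INITIALIZED state.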
On an older x86 machine using nearly identical config files (only the IPCP name and ethernet port number differ), the normal.DIF ipcp DOES get assigned to the lan.DIF, the machine enrolls with another, similarly configured machine, and everything works. So in the failing case on the Raspberry Pi with the latest kernel, something may be hanging in the kernel that keeps the IPCM from exiting.
srb@sdr2:~$ uname -a
Linux sdr2 4.15.0-153-generic #160-Ubuntu SMP Thu Jul 29 06:54:29 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
srb@sdr2:~$ cat /etc/debian_version
buster/sid
srb@sdr2:~$
This machine is running normally: normal.DIF operates over lan.DIF, the machine is enrolled with another machine, and flows allocate normally. The IPCM terminates normally upon receiving SIGINT.
The problem also occurs on a Raspberry Pi W running the latest Raspbian. For that test, the problem with finding shared objects (issue #1353) was worked around with LD_LIBRARY_PATH, and wlan0 was used as if it were an ethernet port with shim-enet-vlan, which might not be expected to work but lets us proceed with the test. This suggests that the problem is not related to the hardware version.
Configuration files for razzy4. Archive.zip
Configuration files for sdr2. sdr2-Archive.zip
(Sorry for the typo in the original title.) When the IPCM is sent a SIGINT, it properly responds with "IPCM loop requested to stop", but it does not terminate. It has to be killed with kill -9 (SIGKILL) to actually terminate.
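For anyone reproducing this from a test script, the manual procedure (SIGINT first, SIGKILL only if the process hangs) can be sketched as below. This is an illustration only: `stop_process` is a made-up name, and it assumes the IPCM was started by the same script via `subprocess.Popen`. Keep in mind the note at the top of this report: falling back to SIGKILL can leave IPCPs behind.

```python
import signal
import subprocess


def stop_process(proc, grace_seconds=10.0):
    """Send SIGINT, wait, and fall back to SIGKILL if the process hangs.

    Returns the name of the signal that was needed to stop the process.
    """
    proc.send_signal(signal.SIGINT)
    try:
        proc.wait(timeout=grace_seconds)
        return "SIGINT"
    except subprocess.TimeoutExpired:
        # The process acknowledged nothing within the grace period;
        # this mirrors the manual 'kill -9' described above.
        proc.kill()  # sends SIGKILL on POSIX
        proc.wait()
        return "SIGKILL"
```

A return value of "SIGKILL" on the Raspberry Pi and "SIGINT" on the x86 machine would reproduce the difference described in this report.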