open-switch / opx-nas-daemon

https://openswitch.net

opx-switch-log crashing nas #24

Closed rakeshdatta closed 6 years ago

rakeshdatta commented 6 years ago
root@OPX:~# opx-show-system-status
System State:  running
No Failed Service

No Modified Package

root@OPX:~# ps -ef | grep opx
root       319     1 11 05:37 ?        00:07:43 /usr/bin/opx_eth_drv -c 0x3 -n 2 -- -p 1 -i 1 -o 2 -t eth
root       525     1  0 05:38 ?        00:00:00 /usr/bin/opx_cps_service
root       530     1  0 05:38 ?        00:00:00 /usr/bin/python /usr/bin/opx_env_tmpctl_svc
root       541     1  0 05:38 ?        00:00:13 /usr/bin/opx_pas_service
root       581     1  0 05:38 ?        00:00:00 /usr/bin/python /usr/bin/opx-alm-service
root       647     1  8 05:38 ?        00:05:33 /usr/bin/opx_nas_daemon
root      3828  1449  0 06:45 ttyS1    00:00:00 grep opx
root@OPX:~#
root@OPX:~# opx-switch-log set ROUTER_INTERFACE debug
{'data': {'base-switch/set_log/input/subsystem-id': bytearray(b'\t\x00\x00\x00'), 'base-switch/set_log/input/level': bytearray(b'\x00\x00\x00\x00')}, 'key': '1.36.2359343.'}
Failed
Traceback (most recent call last):
  File "/usr/bin/opx-switch-log", line 113, in <module>
    pid = get_pid("base_nas")
  File "/usr/bin/opx-switch-log", line 94, in get_pid
    return int(check_output(["pidof","-s",name]))
  File "/usr/lib/python2.7/subprocess.py", line 573, in check_output
    raise CalledProcessError(retcode, cmd, output=output)
subprocess.CalledProcessError: Command '['pidof', '-s', 'base_nas']' returned non-zero exit status 1
root@OPX:~#
root@OPX:~# opx-show-system-status
System State:  degraded
Failed Services
opx-nas.service
No Modified Package
root@OPX:~#
root@OPX:~# ps -ef | grep opx
root       319     1 11 05:37 ?        00:07:50 /usr/bin/opx_eth_drv -c 0x3 -n 2 -- -p 1 -i 1 -o 2 -t eth
root       525     1  0 05:38 ?        00:00:00 /usr/bin/opx_cps_service
root       530     1  0 05:38 ?        00:00:00 /usr/bin/python /usr/bin/opx_env_tmpctl_svc
root       541     1  0 05:38 ?        00:00:13 /usr/bin/opx_pas_service
root       581     1  0 05:38 ?        00:00:00 /usr/bin/python /usr/bin/opx-alm-service
root      4008  1449  0 06:46 ttyS1    00:00:00 grep opx
root@OPX:~#
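
The traceback above comes from get_pid letting CalledProcessError escape when pidof finds no matching process. A minimal sketch of a more defensive lookup follows. It is not the actual opx-switch-log code. The process name opx_nas_daemon is taken from the ps output above; whether the tool should look that name up instead of "base_nas" is an assumption.

```python
import subprocess


def get_pid(name):
    """Return the PID of a running process called `name`, or None if there is none."""
    try:
        out = subprocess.check_output(["pidof", "-s", name])
    except subprocess.CalledProcessError:
        # pidof exits with a non-zero status when no process matches.
        return None
    except OSError:
        # pidof itself is missing from the system.
        return None
    return int(out.decode().strip())


if __name__ == "__main__":
    # "opx_nas_daemon" is the name shown by ps; using it here is an assumption.
    pid = get_pid("opx_nas_daemon")
    if pid is None:
        print("NAS daemon is not running")
    else:
        print("NAS daemon pid: %d" % pid)
```

Returning None lets the caller report that the daemon is down rather than ending in a stack trace.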

==========================================================

LOGS:

Jun 06 06:45:44 OPX opx_nas_daemon[647]: /usr/bin/opx_nas_daemon: symbol lookup error: /usr/lib/x86_64-linux-gnu/libopx_nas_ndi.so.1: undefined symbol: sai_log_set
Jun 06 06:45:44 OPX nbased[585]: AUDIT 0x0309-86 (0000): I/f 0X00000015 OS sub-layer oper status changed from 1 to 2
Jun 06 06:45:44 OPX nbased[585]: AUDIT 0x0309-57 (0001): The sub layer oper status of i/f 0X00000015 changed from 0X01 to 0X02.
Jun 06 06:45:44 OPX nbased[585]: AUDIT 0x0309-56 (0001): The operational status of interface 0X00000015 has changed from 0X01 to 0X02.
Jun 06 06:45:44 OPX systemd[1]: opx-nas.service: main process exited, code=exited, status=127/n/a
Jun 06 06:45:44 OPX systemd[1]: Unit opx-nas.service entered failed state.
Jun 06 06:45:44 OPX systemd[1]: Triggering OnFailure= dependencies of opx-nas.service.
Jun 06 06:45:44 OPX systemd[1]: Failed to enqueue OnFailure= job: Invalid argument
Jun 06 06:45:45 OPX nbased[585]: AUDIT 0x0309-86 (0000): I/f 0X00000014 OS sub-layer oper status changed from 1 to 2
Jun 06 06:45:45 OPX nbased[585]: AUDIT 0x0309-57 (0001): The sub layer oper status of i/f 0X00000014 changed from 0X01 to 0X02.
Jun 06 06:45:45 OPX nbased[585]: AUDIT 0x0309-56 (0001): The operational status of interface 0X00000014 has changed from 0X01 to 0X02.

rakeshdatta commented 6 years ago

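Not all of the required SAI APIs were globally exported, which is why opx_nas_daemon hit the "undefined symbol: sai_log_set" error shown in the logs. That is fixed now, and the fix should be available in the next release.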
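
One way to check whether a shared library exports a given symbol, such as the sai_log_set named in the log, is to load the library and look the symbol up. The sketch below uses Python's ctypes and is only an illustration, not part of the OPX tooling. The helper name exports_symbol is hypothetical, and the library path is passed on the command line because the SAI library's location is not given in this issue.

```python
import ctypes
import sys


def exports_symbol(lib_path, symbol):
    """Return True if the shared object at lib_path exposes `symbol`.

    lib_path may be None to search the symbols already loaded into the
    current process. Loading a library runs its initializers, so only
    point this at libraries you trust.
    """
    lib = ctypes.CDLL(lib_path)  # raises OSError if the library cannot be loaded
    # ctypes resolves symbols lazily and raises AttributeError when the
    # lookup fails, which hasattr turns into False.
    return hasattr(lib, symbol)


if __name__ == "__main__":
    # Usage: python check_symbol.py <path-to-shared-library> <symbol>
    path, name = sys.argv[1], sys.argv[2]
    print("%s exported by %s: %s" % (name, path, exports_symbol(path, name)))
```

Running it against the SAI library with sai_log_set as the symbol would show whether a given build carries the export.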