Open srideepDell opened 1 year ago
A SIGBUS error indicates a problem accessing the mmapped region. Are you able to do PCI reads/writes in the configuration space and in the memory-mapped region?
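One way to check this from userspace, independent of the SDK, is through the Linux sysfs PCI files. The following is only a minimal sketch: the device address passed to `read_pci_ids` is hypothetical, and which BAR and offset are safe to read is device-specific and not stated in this thread.

```python
import mmap
import struct


def parse_pci_ids(config_bytes):
    """Return (vendor_id, device_id) from the start of PCI config space.

    Both fields are 16-bit little-endian values at offsets 0 and 2.
    """
    vendor_id, device_id = struct.unpack_from("<HH", config_bytes, 0)
    return vendor_id, device_id


def read_pci_ids(bdf):
    """Read vendor/device ID through sysfs.

    `bdf` is the PCI address, e.g. "0000:01:00.0" (hypothetical here).
    """
    with open(f"/sys/bus/pci/devices/{bdf}/config", "rb") as f:
        return parse_pci_ids(f.read(4))


def read_bar_word(resource_path, offset=0, length=4096):
    """Map a BAR file (e.g. .../resource0) read-only and read 32 bits.

    Assumption: the offset is safe to read on the device. Python does not
    guarantee the access width used on the bus, so treat this as a smoke
    test only. A fault here would surface as SIGBUS.
    """
    with open(resource_path, "rb") as f:
        with mmap.mmap(f.fileno(), length, access=mmap.ACCESS_READ) as mm:
            return struct.unpack_from("<I", mm, offset)[0]
```

For the chip in this issue, `read_pci_ids` would be expected to return Broadcom's vendor ID `0x14e4` and the device ID `0xb881` that NGBDE reports in the logs.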
@prgeor Yes, we are able to access the PCI configuration space and the memory-mapped region. The gdb traces/logs only indicate that the issue is in the libsai threads, which we will not be able to debug since platform vendors don't have access to it.
The logs indicate that the chip is detected and initialized, but some callbacks from the SDK are not returning and are timing out. Similar issues are reported by other HW vendors too: https://github.com/sonic-net/sonic-buildimage/issues/8216
Aug 31 23:24:05.899421 sonic INFO syncd#supervisord: syncd Found 1 device.#015
Aug 31 23:24:05.899522 sonic INFO syncd#supervisord: syncd Unit 0: BCM56881#015
Aug 31 23:24:05.899556 sonic INFO syncd#supervisord: syncd NGBDE unit 0 (PCI), Dev 0xb881, Rev 0x11, Chip BCM56881_B0#015
Aug 31 23:24:05.915407 sonic INFO syncd#supervisord: syncd cp: cannot stat '/var/warmboot/brcm_bcm_scache': No such file or directory#015
Aug 31 23:24:05.950422 sonic INFO syncd#supervisord: syncd rc: unit 0 device BCM56881_B0#015
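For comparing bring-up logs across images, the device line can be pulled out programmatically. This is a small sketch, not part of any SONiC tooling: the function names are made up for illustration, and the pattern is derived only from the `NGBDE unit ...` line shown above. The trailing `#015` is how rsyslog escapes a carriage return (octal 015), so it is stripped first.

```python
import re

# Pattern modelled on the "NGBDE unit 0 (PCI), Dev ..., Rev ..., Chip ..." line.
NGBDE_RE = re.compile(
    r"NGBDE unit (?P<unit>\d+) \(PCI\), "
    r"Dev (?P<dev>0x[0-9a-fA-F]+), "
    r"Rev (?P<rev>0x[0-9a-fA-F]+), "
    r"Chip (?P<chip>\S+)"
)


def strip_cr_escape(line):
    """Drop trailing whitespace and an escaped carriage return (#015)."""
    line = line.rstrip()
    if line.endswith("#015"):
        line = line[:-4]
    return line


def parse_ngbde(line):
    """Return unit, device ID, revision and chip name, or None if no match."""
    match = NGBDE_RE.search(strip_cr_escape(line))
    if match is None:
        return None
    return {
        "unit": int(match["unit"]),
        "dev": int(match["dev"], 16),
        "rev": int(match["rev"], 16),
        "chip": match["chip"],
    }
```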
@prgeor With the latest master image we are seeing different failures in TD4 bring-up. TD4_Community_Sonic_logs.txt
Logs to note:
```
Apr 25 05:59:10.616460 sonic INFO syncd#supervisord: syncd Found 1 device.#015
Apr 25 05:59:10.616460 sonic INFO syncd#supervisord: syncd Unit 0: BCM56881#015
Apr 25 05:59:10.617467 sonic INFO syncd#supervisord: syncd NGBDE unit 0 (PCI), Dev 0xb881, Rev 0x11, Chip BCM56881_B0#015
Apr 25 05:59:10.649398 sonic INFO syncd#supervisord: syncd rc: unit 0 device BCM56881_B0#015
Apr 25 05:59:10.662513 sonic INFO syncd#supervisord: syncd
```
Apr 25 05:59:12.935999 sonic WARNING kernel: [ 87.357839] Unmatched linux_ngknet.ko, please use the latest.
Apr 25 05:59:13.629954 sonic CRIT syncd#syncd: [none] SAI_API_SWITCH:brcm_sai_create_switch:3204 setting inter-frame gap failed with error Feature unavailable (0xfffffff0).
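The code `0xfffffff0` in the last line is easier to read as a signed value. A quick sketch of the conversion, assuming the log prints a 32-bit two's-complement status; the SDK's own error enum is not shown in this thread, so only the arithmetic is illustrated:

```python
def to_signed32(value):
    """Interpret an unsigned 32-bit value as a two's-complement integer."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value


# The status logged by brcm_sai_create_switch
print(to_signed32(0xFFFFFFF0))  # prints -16
```

So the "Feature unavailable" status corresponds to a return value of -16.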
Description
The current version of SAI/SDK supports BCM56881_B0. During bring-up we hit an issue where syncd exits while configuring the NPU. Logs and other details are attached below.
Steps to reproduce the issue:
Describe the results you received:
syncd crash seen. Logs from syncd.
Core file bt
Describe the results you expected:
Expected the NPU to come up and launch the brcm SDK shell.
Output of `show version`: