Open TorrentialFire opened 1 year ago
SwSS is not active; you may want to check the syncd docker.
Can you share docker ps -a output?
Thanks.
Here is the output of docker ps --all:
admin@sonic:~$ docker ps --all
CONTAINER ID   IMAGE                                COMMAND                  CREATED        STATUS        PORTS   NAMES
5f2575f29394   docker-sonic-telemetry:latest        "/usr/local/bin/supe…"   19 hours ago   Up 19 hours           telemetry
fad615c4f175   docker-sonic-mgmt-framework:latest   "/usr/local/bin/supe…"   19 hours ago   Up 19 hours           mgmt-framework
f2fb3164424c   docker-lldp:latest                   "/usr/bin/docker-lld…"   19 hours ago   Up 19 hours           lldp
f952e79283dc   docker-platform-monitor:latest       "/usr/bin/docker_ini…"   19 hours ago   Up 19 hours           pmon
ea8619a54edc   docker-router-advertiser:latest      "/usr/bin/docker-ini…"   2 months ago   Up 2 months           radv
168008e39980   docker-eventd:latest                 "/usr/local/bin/supe…"   2 months ago   Up 2 months           eventd
33cff425ea3b   docker-database:latest               "/usr/local/bin/dock…"   2 months ago   Up 2 months           database
And here's a grep for occurrences of syncd in the syslog:
admin@sonic:~$ sudo cat /var/log/syslog | grep syncd
Nov 3 14:10:05.769254 sonic ERR monit[453]: 'container_checker' status failed (3) -- Expected containers not running: mux, snmp, dhcp_relay, syncd, swss, teamd, bgp
Nov 3 14:10:06.806861 sonic NOTICE python3: :- publish: EVENT_PUBLISHED: {"sonic-events-host:event-down-ctr":{"ctr_name":"syncd","timestamp":"2022-11-03T14:10:06.806710Z"}}
Nov 3 14:11:05.861950 sonic ERR monit[453]: 'container_checker' status failed (3) -- Expected containers not running: swss, teamd, dhcp_relay, mux, snmp, bgp, syncd
Nov 3 14:11:06.405897 sonic NOTICE python3: :- publish: EVENT_PUBLISHED: {"sonic-events-host:event-down-ctr":{"ctr_name":"syncd","timestamp":"2022-11-03T14:11:06.405217Z"}}
Nov 3 14:12:05.893813 sonic ERR monit[453]: 'container_checker' status failed (3) -- Expected containers not running: mux, syncd, swss, teamd, dhcp_relay, bgp, snmp
Nov 3 14:12:06.444846 sonic NOTICE python3: :- publish: EVENT_PUBLISHED: {"sonic-events-host:event-down-ctr":{"ctr_name":"syncd","timestamp":"2022-11-03T14:12:06.444652Z"}}
Nov 3 14:13:05.925553 sonic ERR monit[453]: 'container_checker' status failed (3) -- Expected containers not running: snmp, swss, teamd, syncd, bgp, mux, dhcp_relay
Nov 3 14:13:06.516344 sonic NOTICE python3: :- publish: EVENT_PUBLISHED: {"sonic-events-host:event-down-ctr":{"ctr_name":"syncd","timestamp":"2022-11-03T14:13:06.515439Z"}}
...
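The repeated monit messages above are easier to compare once the container names are extracted and sorted. A minimal sketch, assuming the line layout shown in the excerpt (the function name `missing_containers` is made up for illustration and is not part of SONiC):

```python
import re

# Matches the tail of the monit 'container_checker' message from the syslog excerpt.
PATTERN = re.compile(r"Expected containers not running: (.+)$")


def missing_containers(syslog_line):
    """Return the sorted container names listed in one container_checker line."""
    match = PATTERN.search(syslog_line.strip())
    if match is None:
        return []  # not a container_checker line
    return sorted(name.strip() for name in match.group(1).split(","))


# First monit line from the excerpt, copied verbatim.
line = (
    "Nov 3 14:10:05.769254 sonic ERR monit[453]: 'container_checker' status failed (3) "
    "-- Expected containers not running: mux, snmp, dhcp_relay, syncd, swss, teamd, bgp"
)
print(missing_containers(line))
# prints ['bgp', 'dhcp_relay', 'mux', 'snmp', 'swss', 'syncd', 'teamd']
```

Sorting the names shows that each monit line in the excerpt reports the same seven containers, only in a different order.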
Yes, it looks like several of the dockers are not running properly. You may want to get a tested/stable image from the Mellanox switch support team. All the essential dockers are crashing.
SAI talks to syncd, so technically anything in the SAI could be the problem.
Can you check the BIOS version with dmidecode? I had problems running SONiC on an SN2700, but a BIOS update to the 2018 version solved them (at least SONiC no longer complains that the platform is not supported).
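The BIOS check suggested above could be scripted roughly as follows. This is only a sketch: it assumes dmidecode is installed, that the script runs as root, and that the release date is printed in the usual MM/DD/YYYY form. The helper names `read_dmi_string` and `release_year` are hypothetical.

```python
import subprocess


def read_dmi_string(keyword):
    """Run `dmidecode -s <keyword>` (needs root) and return the trimmed output."""
    result = subprocess.run(
        ["dmidecode", "-s", keyword],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if dmidecode exits non-zero
    )
    return result.stdout.strip()


def release_year(release_date):
    """Extract the year from a date string assumed to be in MM/DD/YYYY form."""
    return int(release_date.strip().split("/")[-1])
```

On the switch, `read_dmi_string("bios-version")` would return the version string and `release_year(read_dmi_string("bios-release-date"))` the year, which can then be compared against the 2018 update mentioned above.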
Salutations!
We are attempting to run SONiC on a Mellanox SN2700 switch. Several of the docker services fail to start. With my limited troubleshooting ability, I believe I have discerned that the HwSKU is not being properly detected. Other posts and discussions I have found indicate that old firmware might be to blame, but without access to an MLNX-OS .bin file, I can't switch over to that OS and perform a firmware update. Please correct me if I am wrong, but my understanding is that MLNX-OS is the only way to update the firmware on these devices. Is there something else wrong, perhaps? Thanks in advance for any assistance! Please let me know if there is any more information I can provide for clarity.
show techsupport dump located here (expires Dec 3, 2022).
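One way to test the HwSKU suspicion is to look at what the running configuration actually records. A minimal sketch, assuming the standard SONiC layout in which /etc/sonic/config_db.json stores the value under DEVICE_METADATA, localhost, hwsku. The function names are hypothetical.

```python
import json


def load_config(path="/etc/sonic/config_db.json"):
    """Parse the SONiC configuration file (default path is an assumption)."""
    with open(path) as handle:
        return json.load(handle)


def configured_hwsku(config):
    """Return DEVICE_METADATA.localhost.hwsku from a parsed config, or None if absent."""
    return config.get("DEVICE_METADATA", {}).get("localhost", {}).get("hwsku")
```

Calling `configured_hwsku(load_config())` on the switch returns the recorded HwSKU. A result of `None` or an empty string would be consistent with the HwSKU not being detected, though it would not explain why.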