aristanetworks / sonic

Open source drivers and initialization library for Arista platforms running SONiC
GNU General Public License v2.0

[Chassis] Linecards init fails if the max_cores > 16 #77

Closed arlakshm closed 1 year ago

arlakshm commented 1 year ago

On Arista 7800 linecards, sai_switch_create fails with the error below if the value of max_cores in DEVICE_METADATA is greater than 16. If max_cores <= 16, sai_switch_create succeeds.

We should support max_cores = 64. Calculation: 16 linecards * 2 ASICs per linecard * 2 cores per ASIC = 64.
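
For reference, the sizing arithmetic can be written out as a small sketch. The function name and parameter names are hypothetical and only restate the numbers given in this issue (16 linecards, 2 ASICs per linecard, 2 cores per ASIC); they do not come from any SONiC or SAI API.

```python
def required_max_cores(linecards: int, asics_per_linecard: int, cores_per_asic: int) -> int:
    """Total number of cores in a fully populated chassis (hypothetical helper)."""
    return linecards * asics_per_linecard * cores_per_asic


# Numbers taken from the calculation in this issue.
total = required_max_cores(linecards=16, asics_per_linecard=2, cores_per_asic=2)
print(total)  # 64
```

With these inputs the result is 64, which is the max_cores value that currently fails.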

Note: This issue seems to be specific to Arista. Other platforms do not have this issue.

Jan 24 19:17:48.796007 str3-7800-lc7-1 ERR syncd#syncd: [none] SAI_API_SWITCH:platform_process_command:1036 Platform command "INIT_DNX" failed, rc = -1.
Jan 24 19:17:48.796085 str3-7800-lc7-1 INFO syncd#supervisord: syncd #015#015
Jan 24 19:17:48.796132 str3-7800-lc7-1 INFO syncd#supervisord: syncd 0:dnx_scheduler_connector_gport_add_verify:  Error 'Invalid parameter' indicated, Provided flow id is not in valid range#015
Jan 24 19:17:48.796132 str3-7800-lc7-1 INFO syncd#supervisord: syncd #015#015
Jan 24 19:17:48.796132 str3-7800-lc7-1 INFO syncd#supervisord: syncd 0:dnx_scheduler_connector_gport_add:  Error indicated (Invalid parameter) on VERIFY ; #015#015
Jan 24 19:17:48.796132 str3-7800-lc7-1 INFO syncd#supervisord: syncd #015#015
Jan 24 19:17:48.796132 str3-7800-lc7-1 INFO syncd#supervisord: syncd 0:bcm_dnx_cosq_voq_connector_gport_add:  Error 'Invalid parameter' indicated ; #015#015
Jan 24 19:17:48.796132 str3-7800-lc7-1 INFO syncd#supervisord: syncd #015#015
Jan 24 19:17:48.796132 str3-7800-lc7-1 INFO syncd#supervisord: syncd 0:appl_dnx_e2e_scheme_voq_connector_create:  Error 'Invalid parameter' indicated ; #015#015
Jan 24 19:17:48.796132 str3-7800-lc7-1 INFO syncd#supervisord: syncd #015#015
Jan 24 19:17:48.796183 str3-7800-lc7-1 INFO syncd#supervisord: syncd 0:appl_dnx_e2e_scheme_port_create:  Error 'Invalid parameter' indicated ; #015#015
Jan 24 19:17:48.796183 str3-7800-lc7-1 INFO syncd#supervisord: syncd #015#015
Jan 24 19:17:48.796183 str3-7800-lc7-1 INFO syncd#supervisord: syncd 0:appl_dnx_e2e_scheme_port_create_cb:  Error: Invalid parameter ; Failed to create e2e scheme for port: 222.#015#015
Jan 24 19:17:48.796183 str3-7800-lc7-1 INFO syncd#supervisord: syncd 0:appl_dnx_system_port_db_iterate:  Error 'Invalid parameter' indicated ; #015#015
Jan 24 19:17:48.796227 str3-7800-lc7-1 INFO syncd#supervisord: syncd #015#015
Jan 24 19:17:48.796263 str3-7800-lc7-1 INFO syncd#supervisord: syncd 0:appl_dnx_e2e_scheme_ucast_create:  Error 'Invalid parameter' indicated ; #015#015
Jan 24 19:17:48.796293 str3-7800-lc7-1 INFO syncd#supervisord: syncd #015#015
Jan 24 19:17:48.796323 str3-7800-lc7-1 INFO syncd#supervisord: syncd 0:appl_dnx_e2e_scheme_init:  Error 'Invalid parameter' indicated ; #015#015
Jan 24 19:17:48.796353 str3-7800-lc7-1 INFO syncd#supervisord: syncd #015#015
Jan 24 19:17:48.796382 str3-7800-lc7-1 INFO syncd#supervisord: syncd 0:utilex_seq_run_step_list_forward:  Error 'Invalid parameter' indicated ; #015#015
Jan 24 19:17:48.796411 str3-7800-lc7-1 INFO syncd#supervisord: syncd #015#015
Jan 24 19:17:48.796461 str3-7800-lc7-1 INFO syncd#supervisord: syncd 0:utilex_seq_run_step_list_forward:  Error 'Invalid parameter' indicated ; #015#015
Jan 24 19:17:48.796493 str3-7800-lc7-1 INFO syncd#supervisord: syncd #015#015
Jan 24 19:17:48.796523 str3-7800-lc7-1 INFO syncd#supervisord: syncd 0:utilex_seq_run:  Error 'Invalid parameter' indicated ; #015#015
Jan 24 19:17:48.796552 str3-7800-lc7-1 INFO syncd#supervisord: syncd #015#015
Jan 24 19:17:48.796581 str3-7800-lc7-1 INFO syncd#supervisord: syncd 0:appl_dnxc_steps_convert_and_run:  Error 'Invalid parameter' indicated ; #015#015
Jan 24 19:17:48.796611 str3-7800-lc7-1 INFO syncd#supervisord: syncd #015#015
Jan 24 19:17:48.796642 str3-7800-lc7-1 INFO syncd#supervisord: syncd 0:appl_dnxc_init_step_list_run:  Error 'Invalid parameter' indicated ; #015#015
Jan 24 19:17:48.796673 str3-7800-lc7-1 INFO syncd#supervisord: syncd #015#015
Jan 24 19:17:48.796702 str3-7800-lc7-1 INFO syncd#supervisord: syncd 0:sh_dnxc_init_dnx_cmd:  Error 'Invalid parameter' indicated ; #015#015
Jan 24 19:17:48.796732 str3-7800-lc7-1 INFO syncd#supervisord: syncd initialization command "INIT_DNX" failed, rv = -1 (Internal error).#015#015
Jan 24 19:17:48.836501 str3-7800-lc7-1 INFO syncd#supervisord: syncd rc: common SDK init complete#015
Jan 24 19:17:48.853397 str3-7800-lc7-1 INFO syncd#syncd: [none] SAI_API_SWITCH:brcm_sai_dnx_create_switch:6910 SAI Timing: SDK init time 21 seconds
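
The log above appears to report the failure from the innermost SDK function outward, so pulling out the function names in order gives a compact view of the call chain. The following is a minimal sketch under that assumption; the regular expression and the helper name are made up for illustration and are only fitted to the line format seen in this log.

```python
import re

# Matches e.g. "syncd 0:dnx_scheduler_connector_gport_add_verify:  Error ..."
# (pattern is an assumption based on the lines pasted in this issue).
ERROR_RE = re.compile(r"syncd \d+:(\w+):\s+Error")


def error_chain(lines):
    """Return the SDK function names that reported an error, in log order."""
    chain = []
    for line in lines:
        match = ERROR_RE.search(line)
        if match:
            chain.append(match.group(1))
    return chain


# Three lines copied from the log in this issue.
sample = [
    "Jan 24 19:17:48.796132 str3-7800-lc7-1 INFO syncd#supervisord: syncd 0:dnx_scheduler_connector_gport_add_verify:  Error 'Invalid parameter' indicated, Provided flow id is not in valid range#015",
    "Jan 24 19:17:48.796132 str3-7800-lc7-1 INFO syncd#supervisord: syncd #015#015",
    "Jan 24 19:17:48.796132 str3-7800-lc7-1 INFO syncd#supervisord: syncd 0:dnx_scheduler_connector_gport_add:  Error indicated (Invalid parameter) on VERIFY ; #015#015",
]

for name in error_chain(sample):
    print(name)
```

Read this way, the first entry (dnx_scheduler_connector_gport_add_verify, "Provided flow id is not in valid range") is the original failure and the later entries are callers propagating it up to INIT_DNX.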

Non-working config

"DEVICE_METADATA": {
        "localhost": {
            "switch_type": "voq",
            "hwsku": "Arista-7800R3-48CQ2-C48",
            "synchronous_mode": "enable",
            "default_bgp_status": "down",
            "type": "SpineRouter",
            "region": "None",
            "hostname": "str2-7804-lc7-1",
            "max_cores": "64",
            "switch_id": "8",
            "platform": "x86_64-arista_7800r3_48cq2_lc",
            "mac": "fc:bd:67:67:de:5b",
            "default_pfcwd_status": "enable",
            "bgp_asn": "65100",
            "buffer_model": "traditional",
            "cloudtype": "None",
            "asic_name": "ASIC0",
            "docker_routing_config_mode": "separated",
            "deployment_id": "1"
        }
    },

Working config

"DEVICE_METADATA": {
        "localhost": {
            "switch_type": "voq",
            "hwsku": "Arista-7800R3-48CQ2-C48",
            "synchronous_mode": "enable",
            "default_bgp_status": "down",
            "type": "SpineRouter",
            "region": "None",
            "hostname": "str2-7804-lc7-1",
            "max_cores": "16",
            "switch_id": "8",
            "platform": "x86_64-arista_7800r3_48cq2_lc",
            "mac": "fc:bd:67:67:de:5b",
            "default_pfcwd_status": "enable",
            "bgp_asn": "65100",
            "buffer_model": "traditional",
            "cloudtype": "None",
            "asic_name": "ASIC0",
            "docker_routing_config_mode": "separated",
            "deployment_id": "1"
        }
    },
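
The two DEVICE_METADATA blocks above differ only in max_cores. A quick way to confirm that on other setups is to diff the two "localhost" entries; the sketch below uses a hypothetical helper and an abbreviated subset of the fields shown above, not the full entries.

```python
def diff_keys(a: dict, b: dict) -> dict:
    """Return {key: (a_value, b_value)} for every key whose value differs."""
    keys = a.keys() | b.keys()
    return {k: (a.get(k), b.get(k)) for k in sorted(keys) if a.get(k) != b.get(k)}


# Abbreviated copies of the "localhost" entries (subset of fields only).
non_working = {
    "switch_type": "voq",
    "hwsku": "Arista-7800R3-48CQ2-C48",
    "max_cores": "64",
    "switch_id": "8",
}
working = dict(non_working, max_cores="16")

print(diff_keys(non_working, working))  # {'max_cores': ('64', '16')}
```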

skbarista commented 1 year ago

From the error logs, this seems to be a vendor-independent resource issue for VOQs. I am checking whether there is a SOC property set in the Arista SKUs that would make it specific to Arista SKUs.

kenneth-arista commented 1 year ago

Closing as the required changes have merged.