Environment:

Provider Version: 0.4.6
Akash Version: 0.26.1

Issue Summary:

The provider supports the correct GPU model and bids accordingly, yet it puts an unsupported GPU model into the order request it forms. This happens because the provider defaults to the GPU model that comes last when the SDL's model list is dict (alphabetically) sorted, and that model may be unsupported or may not even exist.

This leads to the bid price script calculating bids based on this incorrect GPU model, resulting in either inaccurate bids or a failure to bid if the provider has not set pricing for this model.
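
The effect on pricing can be sketched as follows. This is a hypothetical illustration, not the actual bid price script: the `GPU_PRICES` table, its value, and the `gpu_price` function are assumptions made up for the example.

```python
# Hypothetical per-model price table; the model and value are illustrative only.
GPU_PRICES = {"a100": 100.0}


def gpu_price(model: str) -> float:
    """Return the configured price for a GPU model, or fail if none is set."""
    try:
        return GPU_PRICES[model]
    except KeyError:
        raise ValueError(f"no pricing configured for GPU model {model!r}") from None


# The provider supports a100, so a price exists for it.
print(gpu_price("a100"))

# If the order carries a model the provider never priced, no bid can be computed.
try:
    gpu_price("akgjkajgksag")
except ValueError as err:
    print(f"bid skipped: {err}")
```

With the wrong model in the order, a lookup like this either fails outright or, if the provider happens to have a price for that model, returns a price for hardware that will not be used.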

Steps to Reproduce:

1. Have a provider with some GPU (e.g., a100).
2. Create an SDL file listing multiple GPU models, including a non-existent or random model (e.g., - model: akgjkajgksag) alongside the supported model (a100).
3. Broadcast the SDL to initiate bidding from the provider.
4. Review the order request and observe that it incorrectly specifies the GPU model that comes last in the SDL model list after dict sorting, e.g., "model": "akgjkajgksag", not the supported a100.
5. Notice that the bid price script fails to calculate a price because no pricing is set for the non-existent model akgjkajgksag.

Expected Behavior:

The provider should identify and select the GPU model it actually supports when forming the order request. This correct model should then be used by the bid price script for price calculation, ignoring any models that are not supported.
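
A minimal sketch of the expected selection, assuming the provider advertises supported models through attribute keys of the form `capabilities/gpu/vendor/nvidia/model/<model>` as in the example further down. The function names are hypothetical and this is not the provider's actual code.

```python
from typing import Dict, List, Optional, Set

# Attribute key prefix taken from the provider attributes shown in this issue.
ATTR_PREFIX = "capabilities/gpu/vendor/nvidia/model/"


def supported_models(provider_attributes: Dict[str, str]) -> Set[str]:
    """Collect the GPU models the provider advertises as supported."""
    return {
        key[len(ATTR_PREFIX):]
        for key, value in provider_attributes.items()
        if key.startswith(ATTR_PREFIX) and value == "true"
    }


def select_model(requested: List[str], provider_attributes: Dict[str, str]) -> Optional[str]:
    """Return the first requested model the provider supports, or None."""
    supported = supported_models(provider_attributes)
    for model in requested:
        if model in supported:
            return model
    return None


attributes = {"capabilities/gpu/vendor/nvidia/model/a100": "true"}
requested = ["v100", "h100", "a100", "a40"]
print(select_model(requested, attributes))  # a100
```

Returning `None` when nothing matches keeps the "do not bid" case explicit instead of falling back to an arbitrary model from the list.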

Actual Behavior:

The provider incorrectly selects the last (dict sorted) GPU model listed in the SDL for the order request. This misstep leads to the bid price script either not calculating a price or calculating an incorrect price, as it encounters an unsupported or non-existent GPU model.

Example

Provider attributes: supported GPU - `a100`

$ provider-services query provider get akash1c6rsz4f59nkus3s5qauxxh969j2mtkkn2clk2e -o text
attributes:
...
- key: capabilities/gpu/vendor/nvidia/model/a100
  value: "true"

SDL Contents:

Note that v100 is the last model in this list when it is dict (alphabetically) sorted. a100 is also in the list, so a provider with a100 bids on the deployment.

        gpu:
          units: 1
          attributes:
            vendor:
              nvidia:
                - model: v100
                - model: h100
                - model: a100
                - model: a40
                - model: a16
                - model: t4
                - model: rtx5000
                - model: rtx6000
                - model: a4000
                - model: a5000
                - model: a6000
                - model: 3090
                - model: 3090ti
                - model: 4090

The deployment order Provider forms (before passing it to the bid price script):

As shown below, the order request incorrectly specifies the v100 model (the last entry once the SDL model list is dict sorted) instead of the a100 model that the provider supports.

{
  "resources": [
    {
      "memory": 107374182400,
      "cpu": 8000,
      "gpu": {
        "units": 1,
        "attributes": {
          "vendor": {
            "nvidia": {
              "model": "v100"
            }
          }
        }
      },
      "storage": [
        {
          "class": "ephemeral",
          "size": 214748364800
        }
      ],
      "count": 1,
      "endpoint_quantity": 1,
      "ip_lease_quantity": 0
    }
  ],
  "price": {
    "denom": "uakt",
    "amount": "100000.000000000000000000"
  },
  "price_precision": 6
}
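
To check which model actually reached the bid price script, the order can be parsed and compared against the provider's supported model. The sketch below assumes the order has the JSON shape shown above; `ORDER_JSON` is a trimmed copy of it and `requested_gpu_models` is a hypothetical helper.

```python
import json

# Trimmed copy of the order shown above; only the GPU part is kept.
ORDER_JSON = """
{
  "resources": [
    {
      "gpu": {
        "units": 1,
        "attributes": {"vendor": {"nvidia": {"model": "v100"}}}
      }
    }
  ]
}
"""


def requested_gpu_models(order: dict) -> list:
    """Return (vendor, model) pairs found in the order's GPU attributes."""
    pairs = []
    for resource in order.get("resources", []):
        vendors = resource.get("gpu", {}).get("attributes", {}).get("vendor", {})
        for vendor, attrs in vendors.items():
            pairs.append((vendor, attrs.get("model")))
    return pairs


order = json.loads(ORDER_JSON)
for vendor, model in requested_gpu_models(order):
    status = "supported" if model == "a100" else "NOT supported by this provider"
    print(f"{vendor}/{model}: {status}")
```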

Additional information

The model the provider picks is the one that comes last after the list is dict (alphabetically) sorted. The following SDL and provider log demonstrate this:

        gpu:
          units: 1
          attributes:
            vendor:
              nvidia:
                - model: rtx4000
                - model: a1
                - model: a11
                - model: b1
                - model: b11
                - model: z
                - model: z1
                - model: z11
                - model: zz1
                - model: zzz1
                - model: zzz11
                - model: y
                - model: yy
                - model: yyy
                - model: yyy0
                - model: yyyy
                - model: yyyy0
                - model: zzz0
                - model: zzz
                - model: 1
                - model: 11
                - model: 9
                - model: 99999

root@akash-provider-0:/tmp# grep -C3 model akash1nx9pr8jee9jx44tkgt62fmgt2hmgvru92td3hg.log
        "attributes": {
          "vendor": {
            "nvidia": {
              "model": "zzz11"
            }
          }
        }

dict (alphabetical) sorting:

$ cat m | sort -d
1
11
9
99999
a1
a11
b1
b11
rtx4000
y
yy
yyy
yyy0
yyyy
yyyy0
z
z1
z11
zz1
zzz
zzz0
zzz1
zzz11
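
The observed selection can be reproduced with a short sketch. It assumes that Python's default string ordering (by code point) matches the `sort -d` output above, which holds for these lowercase alphanumeric names but may not hold for other inputs or locales. `last_sorted_model` is a hypothetical name, not a function from the provider.

```python
def last_sorted_model(models):
    """Return the model that comes last after sorting, mimicking the observed behavior."""
    return sorted(models)[-1]


# Model list from the first SDL example.
first_sdl = [
    "v100", "h100", "a100", "a40", "a16", "t4", "rtx5000", "rtx6000",
    "a4000", "a5000", "a6000", "3090", "3090ti", "4090",
]

# Model list from the second SDL example.
second_sdl = [
    "rtx4000", "a1", "a11", "b1", "b11", "z", "z1", "z11", "zz1", "zzz1",
    "zzz11", "y", "yy", "yyy", "yyy0", "yyyy", "yyyy0", "zzz0", "zzz",
    "1", "11", "9", "99999",
]

print(last_sorted_model(first_sdl))   # v100
print(last_sorted_model(second_sdl))  # zzz11
```

Both results match the models that appeared in the order requests above, regardless of where those models sit in the SDL list.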

akash-network / support

provider incorrectly defaults to the last (dict sorted) GPU model in the SDL model list when forming order request before handing it to the bid price script #139
