openshift / sriov-network-operator

SR-IOV Network Operator
Apache License 2.0

Unsupported value: "requests.hugepages-1Gi" #514

Closed rupang790 closed 2 years ago

rupang790 commented 3 years ago

Hi, I would like to use an SR-IOV network with HugePages but not with DPDK. My environment is:

I configured HugePages on the node, then created the SR-IOV NetworkAttachmentDefinition and pod as below:

SriovNetworkNodePolicy

apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetworkNodePolicy
metadata:
  name: ens3f1-netdev
  namespace: openshift-sriov-network-operator
spec:
  nodeSelector:
    feature.node.kubernetes.io/network-sriov.capable: "true"
    kubernetes.io/hostname: worker01.eluon.okd.com
  resourceName: ens3f1net
  numVfs: 4
  nicSelector:
    deviceID: "10fb"
    rootDevices: ["0000:13:00.1"]
    vendor: "8086"
    pfNames: ["ens3f1#0-3"]
  deviceType: netdevice

SriovNetwork

apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetwork
metadata:
  name: ens3f1-netdev
  namespace: openshift-sriov-network-operator
spec:
  resourceName: ens3f1net
  networkNamespace: default
  ipam: '{
    "type": "static",
    "addresses": [
    {
      "address": "192.168.10.30/24"
    }],
    "routes": [
    {
      "dst": "192.168.10.0/24",
      "gw": "192.168.10.100"
    }]
  }'
  capabilities: '{ "ips": true }'

Pod

apiVersion: v1
kind: Pod
metadata:
  name: pod-sriov-hp-test
  namespace: default
  annotations:
    k8s.v1.cni.cncf.io/networks: default/ens3f1-netdev
spec:
  nodeSelector:
    kubernetes.io/hostname: worker01.eluon.okd.com
  hostname: test-sriov-hp
  containers:
  - name: test-container
    command: ["/bin/bash", "-c", "sleep 200000000"]
    image: centos/tools:latest
    resources:
      limits:
        openshift.io/ens3f1net: "1"
        hugepages-1Gi: "1Gi"
        memory: 2Gi
      requests:
        openshift.io/ens3f1net: "1"
        hugepages-1Gi: "1Gi"
        memory: 2Gi
    securityContext:
      privileged: true
    volumeMounts:
      - name: hugepages
        mountPath: /dev/hugepages
  volumes:
  - name: hugepages
    hostPath:
      path: /dev/hugepages

But the following error occurred:

The Pod "pod-sriov-hp-test" is invalid:
* spec.volumes[2].downwardAPI.resourceFieldRef.resource: Unsupported value: "requests.hugepages-1Gi": supported values: "limits.cpu", "limits.ephemeral-storage", "limits.memory", "requests.cpu", "requests.ephemeral-storage", "requests.memory"
* spec.volumes[2].downwardAPI.resourceFieldRef.resource: Unsupported value: "limits.hugepages-1Gi": supported values: "limits.cpu", "limits.ephemeral-storage", "limits.memory", "requests.cpu", "requests.ephemeral-storage", "requests.memory"
* spec.containers[0].volumeMounts[2].name: Not found: "podnetinfo"

What I tested for this:

I have already checked that SR-IOV with DPDK works well with the HugePages configuration, but SR-IOV without DPDK does not. Do you have any information about this? (I am not sure whether this error is related to the SR-IOV Operator or to the OKD cluster.)

openshift-bot commented 3 years ago

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close. Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

openshift-bot commented 2 years ago

Stale issues rot after 30d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle rotten. Rotten issues close after an additional 30d of inactivity. Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle rotten

/remove-lifecycle stale

zshi-redhat commented 2 years ago

I think the root cause is that the k8s cluster in your deployment doesn't support the hugepage downward API, while the network-resources-injector automatically injects hugepage downward API fields into the SR-IOV pod.
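For illustration, the injected volume probably looks something like the sketch below. It is reconstructed only from the validation error in this issue (the volume name `podnetinfo` and the `resourceFieldRef.resource` values). The `path` file names are hypothetical placeholders, and the real injector may add further items.

```yaml
# Sketch of the volume the network-resources-injector appears to add.
# Reconstructed from the error message; not copied from the injector source.
volumes:
- name: podnetinfo
  downwardAPI:
    items:
    - path: hugepages_request            # hypothetical file name
      resourceFieldRef:
        containerName: test-container
        resource: requests.hugepages-1Gi # rejected by clusters without hugepage downward API support
    - path: hugepages_limit              # hypothetical file name
      resourceFieldRef:
        containerName: test-container
        resource: limits.hugepages-1Gi   # rejected for the same reason
```

Because the API server rejects both `resourceFieldRef` entries, the whole `podnetinfo` volume fails validation, which would also explain the `volumeMounts[2].name: Not found: "podnetinfo"` line in the error.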

The solution could be either to upgrade the k8s cluster to version 1.22 or higher, or to disable the network-resources-injector.
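As a minimal sketch of the second option, the injector can be turned off through the operator's `SriovOperatorConfig` object. This assumes the default object is named `default` and lives in the same namespace used by the policy above. Please check the field against the operator version you are running.

```yaml
# Sketch: disable the network-resources-injector via the operator config.
# Assumes the default SriovOperatorConfig object; other spec fields are omitted here.
apiVersion: sriovnetwork.openshift.io/v1
kind: SriovOperatorConfig
metadata:
  name: default
  namespace: openshift-sriov-network-operator
spec:
  enableInjector: false
```

It is safer to edit or patch only this field on the existing object than to replace the whole object, so that other settings are kept. With the injector disabled, the pod no longer gets the `podnetinfo` downward API volume, so the pod spec has to declare any resources and mounts it needs by itself.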

openshift-bot commented 2 years ago

Rotten issues close after 30d of inactivity.

Reopen the issue by commenting /reopen. Mark the issue as fresh by commenting /remove-lifecycle rotten. Exclude this issue from closing again by commenting /lifecycle frozen.

/close

openshift-ci[bot] commented 2 years ago

@openshift-bot: Closing this issue.
