hashicorp / nomad

Nomad is an easy-to-use, flexible, and performant workload orchestrator that can deploy a mix of microservice, batch, containerized, and non-containerized applications. Nomad is easy to operate and scale and has native Consul and Vault integrations.
https://www.nomadproject.io/

csi: aws-ebs-csi plugin v1.28.0 fails to place allocations #20094

Open lgfa29 opened 6 months ago

lgfa29 commented 6 months ago

Starting in v1.28.0, the AWS EBS CSI plugin introduced a new segment key to the response of NodeGetInfo.

When this version of the plugin is used, the Nomad client receives the following accessible topology segments.

"AccessibleTopology": {
    "Segments": {
        "topology.ebs.csi.aws.com/zone": "ca-central-1a",
        "topology.kubernetes.io/zone": "ca-central-1a"
    }
}

But dynamic volume creation does not add the topology.kubernetes.io/zone key, even if requested.

"Topologies": [
    null,
    {
        "Segments": {
            "topology.ebs.csi.aws.com/zone": "ca-central-1a"
        }
    }
],

This causes the volume to fail scheduling because the topology segments are compared with strict equality. https://github.com/hashicorp/nomad/blob/3193ac204f6564711004e00948e43146ce1399c4/nomad/structs/node.go#L63-L68
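To illustrate why a strict comparison rejects the node, here is a minimal sketch using the segments shown above. The `Topology` type and `strictEqual` function are hypothetical stand-ins written for this example; they are not copied from Nomad's source.

```go
package main

import "fmt"

// Topology is a hypothetical stand-in for a CSI topology (not Nomad's type).
type Topology struct {
	Segments map[string]string
}

// strictEqual reports whether both topologies carry exactly the same
// set of segment keys and values.
func strictEqual(a, b Topology) bool {
	if len(a.Segments) != len(b.Segments) {
		return false
	}
	for key, value := range a.Segments {
		if other, ok := b.Segments[key]; !ok || other != value {
			return false
		}
	}
	return true
}

func main() {
	// Segments reported by the node with plugin v1.28.0.
	node := Topology{Segments: map[string]string{
		"topology.ebs.csi.aws.com/zone": "ca-central-1a",
		"topology.kubernetes.io/zone":   "ca-central-1a",
	}}
	// Segments recorded on the dynamically created volume.
	volume := Topology{Segments: map[string]string{
		"topology.ebs.csi.aws.com/zone": "ca-central-1a",
	}}

	// The node has one extra key, so the strict comparison fails
	// even though both agree on the zone.
	fmt.Println(strictEqual(node, volume)) // false
}
```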

Any job that tries to use it will fail placement with a constraint violation.

==> 2024-03-07T13:17:59-05:00: Monitoring evaluation "ea69ca5b"
    2024-03-07T13:17:59-05:00: Evaluation triggered by job "mysql"
    2024-03-07T13:18:00-05:00: Evaluation within deployment: "2e1cd256"
    2024-03-07T13:18:00-05:00: Evaluation status changed: "pending" -> "complete"
==> 2024-03-07T13:18:00-05:00: Evaluation "ea69ca5b" finished with status "complete" but failed to place all allocations:
    2024-03-07T13:18:00-05:00: Task Group "mysql" (failed to place 1 allocation):
      * Constraint "did not meet topology requirement": 1 nodes excluded by filter
    2024-03-07T13:18:00-05:00: Evaluation "c59054f1" waiting for additional capacity to place remainder

I've skimmed the spec a few times, but it's not clear to me how to handle this scenario.

As pointed out in https://github.com/kubernetes-sigs/aws-ebs-csi-driver/issues/729#issuecomment-1984130759, Kubernetes ignores additional node segments, so the additional segments in the node may be safe to ignore.
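One way to ignore additional node segments is to treat the volume's segments as requirements that must be a subset of the node's segments. The sketch below shows that idea only; `volumeFitsNode` is a hypothetical helper, not an existing Nomad or Kubernetes function, and whether this matches the intent of the CSI spec is exactly the open question here.

```go
package main

import "fmt"

// volumeFitsNode reports whether every segment recorded on the volume is
// present on the node with the same value. Segments that exist only on
// the node are ignored. (Hypothetical helper for illustration.)
func volumeFitsNode(nodeSegments, volumeSegments map[string]string) bool {
	for key, want := range volumeSegments {
		if got, ok := nodeSegments[key]; !ok || got != want {
			return false
		}
	}
	return true
}

func main() {
	node := map[string]string{
		"topology.ebs.csi.aws.com/zone": "ca-central-1a",
		"topology.kubernetes.io/zone":   "ca-central-1a",
	}
	sameZone := map[string]string{
		"topology.ebs.csi.aws.com/zone": "ca-central-1a",
	}
	otherZone := map[string]string{
		"topology.ebs.csi.aws.com/zone": "ca-central-1b",
	}

	fmt.Println(volumeFitsNode(node, sameZone))  // true
	fmt.Println(volumeFitsNode(node, otherZone)) // false
}
```

Note that under this rule a volume with no segments at all fits every node, so a real implementation would need to decide whether that is acceptable.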

Reproduction steps

  1. Create an SSH key on AWS (if you don't have one already).
  2. Clone https://gist.github.com/lgfa29/b707d56ace871602cb4955df2a1afad0
    git clone https://gist.github.com/b707d56ace871602cb4955df2a1afad0.git
  3. Enter the directory and provision the infrastructure with Terraform.
    cd b707d56ace871602cb4955df2a1afad0
    terraform init
    terraform apply 
  4. Enter your SSH key name and approve the plan.
  5. Run the mysql.nomad.hcl job.
    NOMAD_ADDR=$(terraform output -raw nomad_addr) nomad run mysql.nomad.hcl

Expected Result

Job starts successfully.

Actual Result

Job fails placement with constraint error:

Constraint "did not meet topology requirement": 1 nodes excluded by filter

magec commented 3 months ago

Had the same issue, downgrading to 1.27.0 fixed it for me.

Vigenere36 commented 2 months ago

Wondering about the timeline on this - is the fix as simple as taking the intersection of the two maps? It's not crucial, but it is blocking our team from upgrading our CSI driver.