cosmonic-labs / netreap

A Cilium controller implementation for Nomad
https://netreap.io
Apache License 2.0

Netreap does not reapply labels after host restart #20

Closed iamredbull closed 1 year ago

iamredbull commented 1 year ago

Before host restart:

(three screenshots, not preserved)

After host restart:

Netreap does not reapply the labels after the host is restarted. (three screenshots, not preserved)

Netreap debug logs:

2023-07-14T12:21:38.298Z    DEBUG   netreap/main.go:124 Starting node reaper
2023-07-14T12:21:38.298Z    DEBUG   reapers/nodes.go:107    Beginning reconciliation
2023-07-14T12:21:38.298Z    DEBUG   reapers/nodes.go:108    Getting nomad node list
2023-07-14T12:21:38.303Z    DEBUG   reapers/nodes.go:119    Finished constructing list of all nodesnodesmap[ax51-host131:{} cn6-host48:{} cpx31-host58:{}]
2023-07-14T12:21:38.303Z    DEBUG   reapers/nodes.go:121    Fetching cilium nodes from consul
2023-07-14T12:21:38.308Z    DEBUG   netreap/main.go:135 Starting endpoint reaper
2023-07-14T12:21:38.308Z    DEBUG   reapers/endpoints.go:155    Starting reconciliation
2023-07-14T12:21:38.310Z    DEBUG   reapers/endpoints.go:169    Finished fetching service list, constructing set of IP addresses from servicesservice_list[{nomad-clients} {nomad-servers} {consul} {netreap}]
2023-07-14T12:21:38.312Z    INFO    reapers/nodes.go:56 Waiting for leader election
2023-07-14T12:21:38.318Z    DEBUG   reapers/endpoints.go:203    Finished generating current IP list. Fetching endpoints from ciliumip_listmap[]
2023-07-14T12:21:38.320Z    DEBUG   reapers/endpoints.go:211    Checking all endpoints
2023-07-14T12:21:38.320Z    DEBUG   reapers/endpoints.go:219    Endpoint is not an init service, skipping   {"labels": ["reserved:host"]}
2023-07-14T12:21:38.320Z    DEBUG   reapers/endpoints.go:219    Endpoint is not an init service, skipping   {"labels": ["reserved:health"]}
2023-07-14T12:21:38.320Z    DEBUG   reapers/endpoints.go:265    Finished reconciliationnum_errors0
2023-07-14T12:21:38.324Z    DEBUG   netreap/main.go:146 starting policy poller
2023-07-14T12:21:38.324Z    INFO    policy_poller   policy/policy.go:41 starting Consul watch for key: netreap.io/policy
2023-07-14T12:21:38.326Z    INFO    policy_poller   policy/policy.go:98 loaded new policy
2023-07-14T12:21:38.330Z    DEBUG   reapers/endpoints.go:93 Got 21 job events. Handling...
2023-07-14T12:21:38.330Z    DEBUG   reapers/endpoints.go:104    Ignoring Job event with type of AllocationUpdated
2023-07-14T12:21:38.330Z    DEBUG   reapers/endpoints.go:104    Ignoring Job event with type of AllocationUpdated
2023-07-14T12:21:38.330Z    DEBUG   reapers/endpoints.go:104    Ignoring Job event with type of AllocationUpdated
2023-07-14T12:21:38.330Z    DEBUG   reapers/endpoints.go:104    Ignoring Job event with type of AllocationUpdated
2023-07-14T12:21:38.330Z    DEBUG   reapers/endpoints.go:104    Ignoring Job event with type of AllocationUpdated
2023-07-14T12:21:38.330Z    DEBUG   reapers/endpoints.go:104    Ignoring Job event with type of AllocationUpdated
2023-07-14T12:21:38.330Z    DEBUG   reapers/endpoints.go:104    Ignoring Job event with type of AllocationUpdated
2023-07-14T12:21:38.330Z    DEBUG   reapers/endpoints.go:104    Ignoring Job event with type of AllocationUpdated
2023-07-14T12:21:38.330Z    DEBUG   reapers/endpoints.go:104    Ignoring Job event with type of AllocationUpdated
2023-07-14T12:21:38.330Z    DEBUG   reapers/endpoints.go:104    Ignoring Job event with type of AllocationUpdated
2023-07-14T12:21:38.330Z    DEBUG   reapers/endpoints.go:104    Ignoring Job event with type of AllocationUpdated
2023-07-14T12:21:38.330Z    DEBUG   reapers/endpoints.go:104    Ignoring Job event with type of AllocationUpdated
2023-07-14T12:21:38.330Z    DEBUG   reapers/endpoints.go:104    Ignoring Job event with type of AllocationUpdated
2023-07-14T12:21:38.330Z    DEBUG   reapers/endpoints.go:104    Ignoring Job event with type of AllocationUpdated
2023-07-14T12:21:38.330Z    DEBUG   reapers/endpoints.go:104    Ignoring Job event with type of AllocationUpdated
2023-07-14T12:21:38.330Z    DEBUG   reapers/endpoints.go:104    Ignoring Job event with type of AllocationUpdated
2023-07-14T12:21:38.330Z    DEBUG   reapers/endpoints.go:104    Ignoring Job event with type of AllocationUpdated
2023-07-14T12:21:38.330Z    DEBUG   reapers/endpoints.go:104    Ignoring Job event with type of AllocationUpdated
2023-07-14T12:21:38.330Z    DEBUG   reapers/endpoints.go:104    Ignoring Job event with type of AllocationUpdated
2023-07-14T12:21:38.330Z    DEBUG   reapers/endpoints.go:104    Ignoring Job event with type of AllocationUpdated
2023-07-14T12:21:38.330Z    DEBUG   reapers/endpoints.go:104    Ignoring Job event with type of AllocationUpdated
2023-07-14T12:21:38.413Z    DEBUG   reapers/endpoints.go:93 Got 2 job events. Handling...
2023-07-14T12:21:38.413Z    DEBUG   reapers/endpoints.go:104    Ignoring Job event with type of AllocationUpdated
2023-07-14T12:21:38.413Z    DEBUG   reapers/endpoints.go:104    Ignoring Job event with type of AllocationUpdated
2023-07-14T12:21:38.698Z    DEBUG   reapers/endpoints.go:93 Got 2 job events. Handling...
2023-07-14T12:21:38.698Z    DEBUG   reapers/endpoints.go:104    Ignoring Job event with type of AllocationUpdated
2023-07-14T12:21:38.698Z    DEBUG   reapers/endpoints.go:104    Ignoring Job event with type of AllocationUpdated
2023-07-14T12:21:38.914Z    DEBUG   reapers/endpoints.go:93 Got 3 job events. Handling...
2023-07-14T12:21:38.914Z    DEBUG   reapers/endpoints.go:104    Ignoring Job event with type of AllocationUpdated
2023-07-14T12:21:38.914Z    DEBUG   reapers/endpoints.go:104    Ignoring Job event with type of AllocationUpdated
2023-07-14T12:21:38.914Z    DEBUG   reapers/endpoints.go:104    Ignoring Job event with type of AllocationUpdated
2023-07-14T12:21:39.174Z    DEBUG   reapers/endpoints.go:93 Got 3 job events. Handling...
2023-07-14T12:21:39.174Z    DEBUG   reapers/endpoints.go:104    Ignoring Job event with type of AllocationUpdated
2023-07-14T12:21:39.174Z    DEBUG   reapers/endpoints.go:104    Ignoring Job event with type of AllocationUpdated
2023-07-14T12:21:39.174Z    DEBUG   reapers/endpoints.go:104    Ignoring Job event with type of AllocationUpdated
2023-07-14T12:21:39.467Z    DEBUG   reapers/endpoints.go:93 Got 3 job events. Handling...
2023-07-14T12:21:39.467Z    DEBUG   reapers/endpoints.go:104    Ignoring Job event with type of AllocationUpdated
2023-07-14T12:21:39.467Z    DEBUG   reapers/endpoints.go:104    Ignoring Job event with type of AllocationUpdated
2023-07-14T12:21:39.467Z    DEBUG   reapers/endpoints.go:104    Ignoring Job event with type of AllocationUpdated
2023-07-14T12:21:39.713Z    DEBUG   reapers/endpoints.go:93 Got 2 job events. Handling...
2023-07-14T12:21:39.713Z    DEBUG   reapers/endpoints.go:104    Ignoring Job event with type of AllocationUpdated
2023-07-14T12:21:39.713Z    DEBUG   reapers/endpoints.go:104    Ignoring Job event with type of AllocationUpdated
2023-07-14T12:21:39.977Z    DEBUG   reapers/endpoints.go:93 Got 4 job events. Handling...
2023-07-14T12:21:39.977Z    DEBUG   reapers/endpoints.go:104    Ignoring Job event with type of AllocationUpdated
2023-07-14T12:21:39.977Z    DEBUG   reapers/endpoints.go:104    Ignoring Job event with type of AllocationUpdated
2023-07-14T12:21:39.977Z    DEBUG   reapers/endpoints.go:104    Ignoring Job event with type of AllocationUpdated
2023-07-14T12:21:39.977Z    DEBUG   reapers/endpoints.go:104    Ignoring Job event with type of AllocationUpdated
2023-07-14T12:21:40.600Z    DEBUG   reapers/endpoints.go:93 Got 2 job events. Handling...
2023-07-14T12:21:40.600Z    DEBUG   reapers/endpoints.go:104    Ignoring Job event with type of AllocationUpdated
2023-07-14T12:21:40.600Z    DEBUG   reapers/endpoints.go:104    Ignoring Job event with type of AllocationUpdated
2023-07-14T12:21:44.210Z    DEBUG   reapers/endpoints.go:93 Got 1 job events. Handling...
2023-07-14T12:21:44.210Z    DEBUG   reapers/endpoints.go:104    Ignoring Job event with type of NodeRegistration
2023-07-14T12:21:50.058Z    DEBUG   reapers/endpoints.go:93 Got 4 job events. Handling...
2023-07-14T12:21:50.058Z    DEBUG   reapers/endpoints.go:104    Ignoring Job event with type of AllocationUpdated
2023-07-14T12:21:50.058Z    DEBUG   reapers/endpoints.go:104    Ignoring Job event with type of AllocationUpdated
2023-07-14T12:21:50.058Z    DEBUG   reapers/endpoints.go:104    Ignoring Job event with type of AllocationUpdated
2023-07-14T12:21:50.058Z    DEBUG   reapers/endpoints.go:104    Ignoring Job event with type of AllocationUpdated
2023-07-14T12:21:50.737Z    DEBUG   reapers/endpoints.go:93 Got 2 job events. Handling...
2023-07-14T12:21:50.737Z    DEBUG   reapers/endpoints.go:104    Ignoring Job event with type of AllocationUpdated
2023-07-14T12:21:50.737Z    DEBUG   reapers/endpoints.go:104    Ignoring Job event with type of AllocationUpdated
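
Every Nomad event in the log above is reported as ignored. One quick way to confirm that for a longer log is to tally the ignored event types. The following is a minimal Python sketch that assumes the log format shown above; `tally_ignored_events` is a hypothetical helper, not part of Netreap.

```python
import re
from collections import Counter

# Matches the message emitted at reapers/endpoints.go:104 in the log above, e.g.
#   "... Ignoring Job event with type of AllocationUpdated"
IGNORED_EVENT = re.compile(r"Ignoring Job event with type of (\w+)")


def tally_ignored_events(log_text: str) -> Counter:
    """Count how often each Nomad event type was ignored in a Netreap debug log."""
    return Counter(IGNORED_EVENT.findall(log_text))


# A few lines copied from the log above.
sample = """\
2023-07-14T12:21:38.330Z    DEBUG   reapers/endpoints.go:93 Got 21 job events. Handling...
2023-07-14T12:21:38.330Z    DEBUG   reapers/endpoints.go:104    Ignoring Job event with type of AllocationUpdated
2023-07-14T12:21:38.330Z    DEBUG   reapers/endpoints.go:104    Ignoring Job event with type of AllocationUpdated
2023-07-14T12:21:44.210Z    DEBUG   reapers/endpoints.go:104    Ignoring Job event with type of NodeRegistration
"""

counts = tally_ignored_events(sample)
for event_type, count in counts.most_common():
    print(f"{event_type}: {count}")
```

If the tally contains only ignored types and the log has no "Fetching services from consul for job" lines, no labeling attempt was made during that window.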

In some cases, some jobs are re-labeled, but not all: (screenshot not preserved)

For jobs to get their labels back (and sometimes their IP), you have to stop and start the job again:

Before restarting the job: (screenshot not preserved). After restarting the job: (two screenshots, not preserved). Netreap logs: (screenshot not preserved).

Cilium and Netreap were deployed following this guide: https://cosmonic.com/blog/engineering/netreap-a-practical-guide-to-running-cilium-in-nomad. This behavior of Netreap does not look correct to me. What causes it, and how can I fix it? @deverton @protochron

Versions: Cilium v1.13.4, Netreap v0.1.0

iamredbull commented 1 year ago

I also noticed that when a Nomad job with several groups is started, only one of the groups receives the labels. Nomad job:

job "example-job" {

  datacenters = ["dc1"]
  namespace = "dedicated"

  constraint {
     attribute = "${attr.unique.consul.name}"
     operator  = "="
     value     = "cn6-host48"
  }

  meta = {
    "example.com/app_name" = "service-echo"
  }

  group "http-echo-group" {
    network {

      mode = "cni/cilium"

      dns {
        servers = ["172.17.0.1"]
      }

    }

    restart {
      attempts = 3
      interval = "15m"
      delay    = "20s"
      mode     = "fail"
    }

    service {
      name         = "http-echo"
      port         = "80"
      tags         = ["http-echo"]
      address_mode = "alloc"
    }

    task "http-echo" {
      driver = "docker"

      config {
        image = "hashicorp/http-echo"
        args = [
          "--text=hello world",
          "--listen=:80"
        ]
        auth_soft_fail = true
      }

      resources {
        cpu    = 500
        memory = 256
      }
    }
  }

  group "network-multitool-group" {
    network {
      dns {
        servers = ["172.17.0.1"]
      }
      mode = "cni/cilium"
    }

    restart {
      attempts = 3
      interval = "15m"
      delay    = "20s"
      mode     = "fail"
    }

    service {
      name         = "network-multitool"
      port         = "80"
      tags         = ["network-multitool"]
      address_mode = "alloc"
    }

    task "network-multitool" {
      driver = "docker"
      config {
        image          = "wbitt/network-multitool"

        auth_soft_fail = true
      }

      resources {
        cpu    = 500
        memory = 256
      }
    }
  }
}

Cilium endpoint list:

ENDPOINT   POLICY (ingress)   POLICY (egress)   IDENTITY   LABELS (source:key[=value])               IPv6   IPv4            STATUS   
           ENFORCEMENT        ENFORCEMENT                                                                                   
141        Enabled            Enabled           4          reserved:health                                  172.16.171.94   ready   
1418       Disabled           Disabled          1          reserved:host                                                    ready   
1535       Enabled            Enabled           5          reserved:init                                    172.16.6.61     ready   
2641       Enabled            Enabled           28939      netreap:nomad.job_id=example-job                 172.16.44.243   ready   
                                                           netreap:nomad.namespace=dedicated                                        
                                                           nomad:example.com/app_name=service-echo                                  
                                                           reserved:init                                                            
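
In the listing above, endpoint 1535 carries only `reserved:init`, while endpoint 2641 also has the job labels. To spot such endpoints in a longer listing, the table can be parsed. The following is a rough Python sketch, assuming the plain-text layout shown above with an empty IPv6 column; `parse_endpoint_labels` is a hypothetical helper and not part of Cilium or Netreap.

```python
import re
from typing import Dict, List, Optional

# A label has the form source:key[=value], e.g. "reserved:init".
# Assumption: the IPv6 column is empty, as in the listing above, so no
# address token can be mistaken for a label.
LABEL = re.compile(r"^[a-z]+:\S+$")


def parse_endpoint_labels(table: str) -> Dict[int, List[str]]:
    """Map endpoint ID -> labels from plain-text `cilium endpoint list` output."""
    endpoints: Dict[int, List[str]] = {}
    current: Optional[int] = None
    for line in table.splitlines():
        tokens = line.split()
        if not tokens:
            continue
        if tokens[0].isdigit():
            # A row starting with a number opens a new endpoint.
            current = int(tokens[0])
            endpoints[current] = []
        if current is None:
            continue  # still inside the two header lines
        # Continuation rows hold additional labels for the current endpoint.
        endpoints[current].extend(t for t in tokens if LABEL.match(t))
    return endpoints


sample = """\
ENDPOINT   POLICY (ingress)   POLICY (egress)   IDENTITY   LABELS (source:key[=value])               IPv6   IPv4            STATUS
           ENFORCEMENT        ENFORCEMENT
141        Enabled            Enabled           4          reserved:health                                  172.16.171.94   ready
1418       Disabled           Disabled          1          reserved:host                                                    ready
1535       Enabled            Enabled           5          reserved:init                                    172.16.6.61     ready
2641       Enabled            Enabled           28939      netreap:nomad.job_id=example-job                 172.16.44.243   ready
                                                           netreap:nomad.namespace=dedicated
                                                           nomad:example.com/app_name=service-echo
                                                           reserved:init
"""

endpoints = parse_endpoint_labels(sample)
init_only = [eid for eid, labels in endpoints.items() if labels == ["reserved:init"]]
print("Endpoints with only reserved:init:", init_only)
```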

Netreap-job logs:

2023-08-01T09:58:39.847Z    DEBUG   netreap/main.go:124 Starting node reaper
2023-08-01T09:58:39.847Z    DEBUG   reapers/nodes.go:107    Beginning reconciliation
2023-08-01T09:58:39.847Z    DEBUG   reapers/nodes.go:108    Getting nomad node list
2023-08-01T09:58:39.865Z    DEBUG   reapers/nodes.go:119    Finished constructing list of all nodes {"nodes": {"cn6-host48":{},"cpx31-host58":{}}}
2023-08-01T09:58:39.866Z    DEBUG   reapers/nodes.go:121    Fetching cilium nodes from consul
2023-08-01T09:58:39.902Z    DEBUG   netreap/main.go:135 Starting endpoint reaper
2023-08-01T09:58:39.902Z    DEBUG   reapers/endpoints.go:155    Starting reconciliation
2023-08-01T09:58:39.911Z    DEBUG   reapers/endpoints.go:169    Finished fetching service list, constructing set of IP addresses from servicesservice_list[{consul} {netreap} {nomad-clients} {nomad-servers}]
2023-08-01T09:58:39.918Z    INFO    reapers/nodes.go:56 Waiting for leader election
2023-08-01T09:58:39.945Z    DEBUG   reapers/endpoints.go:203    Finished generating current IP list. Fetching endpoints from cilium {"ip_list": {}}
2023-08-01T09:58:39.949Z    DEBUG   reapers/endpoints.go:211    Checking all endpoints
2023-08-01T09:58:39.949Z    DEBUG   reapers/endpoints.go:219    Endpoint is not an init service, skipping   {"labels": ["reserved:health"]}
2023-08-01T09:58:39.949Z    DEBUG   reapers/endpoints.go:219    Endpoint is not an init service, skipping   {"labels": ["reserved:host"]}
2023-08-01T09:58:39.949Z    DEBUG   reapers/endpoints.go:265    Finished reconciliation {"num_errors": 0}
2023-08-01T09:58:39.982Z    DEBUG   netreap/main.go:146 starting policy poller
2023-08-01T09:58:39.983Z    INFO    policy_poller   policy/policy.go:41 starting Consul watch for key: netreap.io/policy
2023-08-01T09:58:39.988Z    DEBUG   reapers/endpoints.go:93 Got 2 job events. Handling...
2023-08-01T09:58:39.988Z    DEBUG   reapers/endpoints.go:104    Ignoring Job event with type of AllocationUpdated
2023-08-01T09:58:39.988Z    DEBUG   reapers/endpoints.go:104    Ignoring Job event with type of AllocationUpdated
2023-08-01T09:58:39.994Z    INFO    policy_poller   policy/policy.go:98 loaded new policy
2023-08-01T09:58:40.261Z    DEBUG   reapers/endpoints.go:93 Got 3 job events. Handling...
2023-08-01T09:58:40.261Z    DEBUG   reapers/endpoints.go:104    Ignoring Job event with type of AllocationUpdated
2023-08-01T09:58:40.261Z    DEBUG   reapers/endpoints.go:104    Ignoring Job event with type of AllocationUpdated
2023-08-01T09:58:40.261Z    DEBUG   reapers/endpoints.go:104    Ignoring Job event with type of AllocationUpdated
2023-08-01T09:59:30.969Z    DEBUG   elector/mod.go:108  Unable to acquire lock. Retrying up to 6 times
2023-08-01T09:59:33.305Z    DEBUG   reapers/endpoints.go:93 Got 1 job events. Handling...
2023-08-01T09:59:33.307Z    DEBUG   reapers/endpoints.go:416    Job was empty   {"event_type": "JobDeregistered"}
2023-08-01T09:59:33.384Z    DEBUG   reapers/endpoints.go:93 Got 1 job events. Handling...
2023-08-01T09:59:33.384Z    DEBUG   reapers/endpoints.go:104    Ignoring Job event with type of EvaluationUpdated
2023-08-01T09:59:40.982Z    DEBUG   elector/mod.go:115  Lock retry 1 did not succeed
2023-08-01T09:59:49.182Z    DEBUG   reapers/endpoints.go:93 Got 1 job events. Handling...
2023-08-01T09:59:49.182Z    DEBUG   reapers/endpoints.go:93 Got 2 job events. Handling...
2023-08-01T09:59:49.183Z    DEBUG   reapers/endpoints.go:416    Job was empty   {"event_type": "JobRegistered"}
2023-08-01T09:59:49.209Z    DEBUG   reapers/endpoints.go:327    Fetching services from consul for job   {"job_id": "example-job", "retry_num": 1}
2023-08-01T09:59:49.210Z    DEBUG   reapers/endpoints.go:327    Fetching services from consul for job   {"job_id": "example-job", "retry_num": 1}
2023-08-01T09:59:49.218Z    DEBUG   reapers/endpoints.go:334    Did not find a ready service in consul  {"job_id": "example-job", "retry_num": 1}
2023-08-01T09:59:49.218Z    DEBUG   reapers/endpoints.go:334    Did not find a ready service in consul  {"job_id": "example-job", "retry_num": 1}
2023-08-01T09:59:49.483Z    DEBUG   reapers/endpoints.go:93 Got 5 job events. Handling...
2023-08-01T09:59:49.483Z    DEBUG   reapers/endpoints.go:104    Ignoring Job event with type of PlanResult
2023-08-01T09:59:49.483Z    DEBUG   reapers/endpoints.go:104    Ignoring Job event with type of PlanResult
2023-08-01T09:59:49.483Z    DEBUG   reapers/endpoints.go:104    Ignoring Job event with type of PlanResult
2023-08-01T09:59:49.483Z    DEBUG   reapers/endpoints.go:104    Ignoring Job event with type of PlanResult
2023-08-01T09:59:49.483Z    DEBUG   reapers/endpoints.go:104    Ignoring Job event with type of PlanResult
2023-08-01T09:59:49.536Z    DEBUG   reapers/endpoints.go:93 Got 1 job events. Handling...
2023-08-01T09:59:49.536Z    DEBUG   reapers/endpoints.go:104    Ignoring Job event with type of EvaluationUpdated
2023-08-01T09:59:50.295Z    DEBUG   reapers/endpoints.go:93 Got 2 job events. Handling...
2023-08-01T09:59:50.295Z    DEBUG   reapers/endpoints.go:104    Ignoring Job event with type of AllocationUpdated
2023-08-01T09:59:50.295Z    DEBUG   reapers/endpoints.go:104    Ignoring Job event with type of AllocationUpdated
2023-08-01T09:59:50.600Z    DEBUG   reapers/endpoints.go:93 Got 2 job events. Handling...
2023-08-01T09:59:50.600Z    DEBUG   reapers/endpoints.go:104    Ignoring Job event with type of AllocationUpdated
2023-08-01T09:59:50.600Z    DEBUG   reapers/endpoints.go:104    Ignoring Job event with type of AllocationUpdated
2023-08-01T09:59:50.993Z    DEBUG   elector/mod.go:115  Lock retry 2 did not succeed
2023-08-01T09:59:51.218Z    DEBUG   reapers/endpoints.go:327    Fetching services from consul for job   {"job_id": "example-job", "retry_num": 2}
2023-08-01T09:59:51.218Z    DEBUG   reapers/endpoints.go:327    Fetching services from consul for job   {"job_id": "example-job", "retry_num": 2}
2023-08-01T09:59:51.228Z    DEBUG   reapers/endpoints.go:344    Found services for new jobjob_idexample-job
2023-08-01T09:59:51.228Z    DEBUG   reapers/endpoints.go:356    Finding related cilium endpoint for job {"job_id": "example-job"}
2023-08-01T09:59:51.228Z    DEBUG   reapers/endpoints.go:344    Found services for new jobjob_idexample-job
2023-08-01T09:59:51.228Z    DEBUG   reapers/endpoints.go:356    Finding related cilium endpoint for job {"job_id": "example-job"}
2023-08-01T09:59:51.840Z    DEBUG   reapers/endpoints.go:93 Got 3 job events. Handling...
2023-08-01T09:59:51.840Z    DEBUG   reapers/endpoints.go:104    Ignoring Job event with type of AllocationUpdated
2023-08-01T09:59:51.840Z    DEBUG   reapers/endpoints.go:104    Ignoring Job event with type of AllocationUpdated
2023-08-01T09:59:51.840Z    DEBUG   reapers/endpoints.go:104    Ignoring Job event with type of AllocationUpdated
2023-08-01T10:00:01.033Z    DEBUG   elector/mod.go:115  Lock retry 3 did not succeed
2023-08-01T10:00:01.751Z    DEBUG   reapers/endpoints.go:93 Got 3 job events. Handling...
2023-08-01T10:00:01.751Z    DEBUG   reapers/endpoints.go:104    Ignoring Job event with type of AllocationUpdated
2023-08-01T10:00:01.751Z    DEBUG   reapers/endpoints.go:104    Ignoring Job event with type of AllocationUpdated
2023-08-01T10:00:01.751Z    DEBUG   reapers/endpoints.go:104    Ignoring Job event with type of AllocationUpdated
2023-08-01T10:00:01.998Z    DEBUG   reapers/endpoints.go:93 Got 3 job events. Handling...
2023-08-01T10:00:01.998Z    DEBUG   reapers/endpoints.go:104    Ignoring Job event with type of AllocationUpdated
2023-08-01T10:00:01.998Z    DEBUG   reapers/endpoints.go:104    Ignoring Job event with type of AllocationUpdated
2023-08-01T10:00:01.998Z    DEBUG   reapers/endpoints.go:104    Ignoring Job event with type of AllocationUpdated
2023-08-01T10:00:02.986Z    DEBUG   reapers/endpoints.go:93 Got 1 job events. Handling...
2023-08-01T10:00:02.986Z    DEBUG   reapers/endpoints.go:104    Ignoring Job event with type of AllocationUpdateDesiredStatus
2023-08-01T10:00:03.242Z    DEBUG   reapers/endpoints.go:93 Got 3 job events. Handling...
2023-08-01T10:00:03.242Z    DEBUG   reapers/endpoints.go:104    Ignoring Job event with type of PlanResult
2023-08-01T10:00:03.242Z    DEBUG   reapers/endpoints.go:104    Ignoring Job event with type of PlanResult
2023-08-01T10:00:03.242Z    DEBUG   reapers/endpoints.go:104    Ignoring Job event with type of PlanResult
2023-08-01T10:00:03.334Z    DEBUG   reapers/endpoints.go:93 Got 1 job events. Handling...
2023-08-01T10:00:03.334Z    DEBUG   reapers/endpoints.go:104    Ignoring Job event with type of EvaluationUpdated
2023-08-01T10:00:11.044Z    DEBUG   elector/mod.go:115  Lock retry 4 did not succeed
2023-08-01T10:00:21.077Z    DEBUG   elector/mod.go:115  Lock retry 5 did not succeed
2023-08-01T10:00:31.097Z    DEBUG   elector/mod.go:115  Lock retry 6 did not succeed
2023-08-01T10:00:31.097Z    DEBUG   elector/mod.go:117  Never acquired lock after retry

The second group remains in the init state.

However, if I restart Netreap in the cluster, both groups immediately get the labels. Cilium endpoint list:

ENDPOINT   POLICY (ingress)   POLICY (egress)   IDENTITY   LABELS (source:key[=value])               IPv6   IPv4            STATUS   
           ENFORCEMENT        ENFORCEMENT                                                                                   
141        Enabled            Enabled           4          reserved:health                                  172.16.171.94   ready   
1418       Disabled           Disabled          1          reserved:host                                                    ready   
1535       Enabled            Enabled           28939      netreap:nomad.job_id=example-job                 172.16.6.61     ready   
                                                           netreap:nomad.namespace=dedicated                                        
                                                           nomad:example.com/app_name=service-echo                                  
                                                           reserved:init                                                            
2641       Enabled            Enabled           28939      netreap:nomad.job_id=example-job                 172.16.44.243   ready   
                                                           netreap:nomad.namespace=dedicated                                        
                                                           nomad:example.com/app_name=service-echo                                  
                                                           reserved:init                                                            

Netreap-job logs:

2023-08-01T10:05:21.560Z    DEBUG   netreap/main.go:124 Starting node reaper
2023-08-01T10:05:21.561Z    DEBUG   reapers/nodes.go:107    Beginning reconciliation
2023-08-01T10:05:21.561Z    DEBUG   reapers/nodes.go:108    Getting nomad node list
2023-08-01T10:05:21.578Z    DEBUG   reapers/nodes.go:119    Finished constructing list of all nodes {"nodes": {"cn6-host48":{},"cpx31-host58":{}}}
2023-08-01T10:05:21.578Z    DEBUG   reapers/nodes.go:121    Fetching cilium nodes from consul
2023-08-01T10:05:21.617Z    DEBUG   netreap/main.go:135 Starting endpoint reaper
2023-08-01T10:05:21.618Z    DEBUG   reapers/endpoints.go:155    Starting reconciliation
2023-08-01T10:05:21.626Z    DEBUG   reapers/endpoints.go:169    Finished fetching service list, constructing set of IP addresses from servicesservice_list[{network-multitool} {nomad-clients} {nomad-servers} {consul} {http-echo} {netreap}]
2023-08-01T10:05:21.628Z    INFO    reapers/nodes.go:56 Waiting for leader election
2023-08-01T10:05:21.674Z    DEBUG   reapers/endpoints.go:203    Finished generating current IP list. Fetching endpoints from cilium {"ip_list": {"172.16.212.128":{"ID":"df8a0bec-b718-d91e-9f8d-0e5ef3b7e077","Namespace":""},"172.16.242.70":{"ID":"52777fb2-ac22-749a-f709-57a5ecddb881","Namespace":""}}}
2023-08-01T10:05:21.680Z    DEBUG   reapers/endpoints.go:211    Checking all endpoints
2023-08-01T10:05:21.680Z    DEBUG   reapers/endpoints.go:219    Endpoint is not an init service, skipping   {"labels": ["netreap:nomad.job_id=example-job","netreap:nomad.namespace=dedicated","nomad:example.com/app_name=service-echo"]}
2023-08-01T10:05:21.680Z    DEBUG   reapers/endpoints.go:219    Endpoint is not an init service, skipping   {"labels": ["reserved:host"]}
2023-08-01T10:05:21.680Z    DEBUG   reapers/endpoints.go:219    Endpoint is not an init service, skipping   {"labels": ["reserved:health"]}
2023-08-01T10:05:21.680Z    DEBUG   reapers/endpoints.go:222    Checking if endpoint still exists   {"endpoint_id": 1500}
2023-08-01T10:05:21.680Z    DEBUG   reapers/endpoints.go:227    Got ip  {"ip": {"ipv4":"172.16.212.128"}}
2023-08-01T10:05:21.680Z    DEBUG   reapers/endpoints.go:250    Found an endpoint missing labels. Updating with current job labels  {"endpoint_id": 1500}
2023-08-01T10:05:21.705Z    DEBUG   reapers/endpoints.go:265    Finished reconciliation {"num_errors": 0}
2023-08-01T10:05:21.740Z    DEBUG   netreap/main.go:146 starting policy poller
2023-08-01T10:05:21.740Z    INFO    policy_poller   policy/policy.go:41 starting Consul watch for key: netreap.io/policy
2023-08-01T10:05:21.746Z    DEBUG   reapers/endpoints.go:93 Got 2 job events. Handling...
2023-08-01T10:05:21.746Z    DEBUG   reapers/endpoints.go:104    Ignoring Job event with type of AllocationUpdated
2023-08-01T10:05:21.747Z    DEBUG   reapers/endpoints.go:104    Ignoring Job event with type of AllocationUpdated
2023-08-01T10:05:21.752Z    INFO    policy_poller   policy/policy.go:98 loaded new policy
2023-08-01T10:05:21.753Z    DEBUG   reapers/endpoints.go:93 Got 2 job events. Handling...
2023-08-01T10:05:21.753Z    DEBUG   reapers/endpoints.go:104    Ignoring Job event with type of AllocationUpdated
2023-08-01T10:05:21.753Z    DEBUG   reapers/endpoints.go:104    Ignoring Job event with type of AllocationUpdated
2023-08-01T10:06:09.807Z    DEBUG   elector/mod.go:108  Unable to acquire lock. Retrying up to 6 times
2023-08-01T10:06:19.817Z    DEBUG   elector/mod.go:115  Lock retry 1 did not succeed
2023-08-01T10:06:29.831Z    DEBUG   elector/mod.go:115  Lock retry 2 did not succeed
2023-08-01T10:06:39.847Z    DEBUG   elector/mod.go:115  Lock retry 3 did not succeed
2023-08-01T10:06:49.879Z    DEBUG   elector/mod.go:115  Lock retry 4 did not succeed
2023-08-01T10:06:59.898Z    DEBUG   elector/mod.go:115  Lock retry 5 did not succeed
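
The startup pass in the log above skips endpoints that are "not an init service", looks up the IP of the remaining endpoint (ID 1500), and then reports "Found an endpoint missing labels. Updating with current job labels". That pass can be modeled roughly as follows. This is a simplified Python model for illustration only, not Netreap's Go implementation: `reconcile`, the dictionary shapes, the "only `reserved:init`" test, endpoint ID 2000, and the mapping of both IPs to `example-job` are all assumptions.

```python
from typing import Dict, List

INIT_LABEL = "reserved:init"


def reconcile(
    endpoints: List[dict],
    labels_by_ip: Dict[str, List[str]],
) -> Dict[int, List[str]]:
    """Return {endpoint_id: labels_to_apply} for unlabeled endpoints with a known IP."""
    updates: Dict[int, List[str]] = {}
    for endpoint in endpoints:
        # Assumption: an endpoint counts as unlabeled when its only label is reserved:init.
        if endpoint["labels"] != [INIT_LABEL]:
            continue
        labels = labels_by_ip.get(endpoint["ipv4"])
        if labels is None:
            continue  # IP is not in the list built from the registered services
        updates[endpoint["id"]] = labels
    return updates


job_labels = [
    "netreap:nomad.job_id=example-job",
    "netreap:nomad.namespace=dedicated",
    "nomad:example.com/app_name=service-echo",
]

# IPs taken from the ip_list entry in the log above.
labels_by_ip = {
    "172.16.212.128": job_labels,
    "172.16.242.70": job_labels,
}

endpoints = [
    {"id": 1500, "ipv4": "172.16.212.128", "labels": [INIT_LABEL]},
    {"id": 2000, "ipv4": "172.16.242.70", "labels": list(job_labels)},  # hypothetical ID
]

updates = reconcile(endpoints, labels_by_ip)
print(updates)
```

In this model every unlabeled endpoint whose IP is known gets labeled in one pass, regardless of how many groups the job has. That matches the observation that restarting Netreap labels both groups.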

Is this a bug, or am I doing something wrong? In my case, Nomad jobs most often consist of several groups. Please take a look, @deverton @protochron.

Versions: Netreap 0.1.2 (also 0.1.0), Cilium 1.13.4, Nomad v1.5.6, Consul v1.14.7