Open soupdiver opened 1 year ago
@soupdiver
Thank you for the information. Can you provide the following:
[1] nomad job status <job name>
[2] Resource code for your job for testing on my end
Thanks!
1.
nomad job status home-heimdall
ID = home-heimdall
Name = home-heimdall
Submit Date = 2023-04-07T21:27:47+02:00
Type = service
Priority = 50
Datacenters = home
Namespace = default
Status = running
Periodic = false
Parameterized = false
Summary
Task Group  Queued  Starting  Running  Failed  Complete  Lost  Unknown
heimdall    0       0         1        1       17        2     0
Allocations
ID        Node ID   Task Group  Version  Desired  Status   Created    Modified
5abcf919  41503e45  heimdall    3        run      running  4d10h ago  1d16h ago
2.
{
"Stop": false,
"Region": "global",
"Namespace": "default",
"ID": "home-heimdall",
"ParentID": "",
"Name": "home-heimdall",
"Type": "service",
"Priority": 50,
"AllAtOnce": false,
"Datacenters": [
"home"
],
"Constraints": null,
"Affinities": null,
"Spreads": null,
"TaskGroups": [
{
"Name": "heimdall",
"Count": 1,
"Update": {
"Stagger": 30000000000,
"MaxParallel": 1,
"HealthCheck": "checks",
"MinHealthyTime": 10000000000,
"HealthyDeadline": 300000000000,
"ProgressDeadline": 600000000000,
"AutoRevert": false,
"AutoPromote": false,
"Canary": 0
},
"Migrate": {
"MaxParallel": 1,
"HealthCheck": "checks",
"MinHealthyTime": 10000000000,
"HealthyDeadline": 300000000000
},
"Constraints": [
{
"LTarget": "${attr.consul.version}",
"RTarget": ">= 1.7.0",
"Operand": "semver"
}
],
"Scaling": null,
"RestartPolicy": {
"Attempts": 2,
"Interval": 1800000000000,
"Delay": 15000000000,
"Mode": "fail"
},
"Tasks": [
{
"Name": "heimdall",
"Driver": "docker",
"User": "",
"Config": {
"force_pull": true,
"volumes": [
"heimdall:/config"
],
"volume_driver": "local",
"image": "linuxserver/heimdall",
"ports": [
"http"
]
},
"Env": {
"PUID": "1000",
"PGID": "1000",
"TZ": "Europe/Berlin"
},
"Services": [
{
"Name": "home-heimdall-heimdall-heimdall",
"TaskName": "heimdall",
"PortLabel": "http",
"AddressMode": "auto",
"Address": "",
"EnableTagOverride": false,
"Tags": [
"traefik.enable=true",
"traefik.http.routers.heimdall.entryPoints=web",
"traefik.http.routers.heimdall.rule=Host(`home-heimdall-heimdall-heimdall.service.consul`)",
"dc=home"
],
"CanaryTags": null,
"Checks": null,
"Connect": null,
"Meta": null,
"CanaryMeta": null,
"TaggedAddresses": null,
"Namespace": "default",
"OnUpdate": "require_healthy",
"Provider": "consul"
}
],
"Vault": null,
"Templates": null,
"Constraints": null,
"Affinities": null,
"Resources": {
"CPU": 200,
"Cores": 0,
"MemoryMB": 512,
"MemoryMaxMB": 0,
"DiskMB": 0,
"IOPS": 0,
"Networks": null,
"Devices": null
},
"RestartPolicy": {
"Attempts": 2,
"Interval": 1800000000000,
"Delay": 15000000000,
"Mode": "fail"
},
"DispatchPayload": null,
"Lifecycle": null,
"Meta": null,
"KillTimeout": 5000000000,
"LogConfig": {
"MaxFiles": 10,
"MaxFileSizeMB": 10
},
"Artifacts": null,
"Leader": false,
"ShutdownDelay": 0,
"VolumeMounts": null,
"ScalingPolicies": null,
"KillSignal": "",
"Kind": "",
"CSIPluginConfig": null,
"Identity": null
}
],
"EphemeralDisk": {
"Sticky": false,
"SizeMB": 300,
"Migrate": false
},
"Meta": null,
"ReschedulePolicy": {
"Attempts": 0,
"Interval": 0,
"Delay": 30000000000,
"DelayFunction": "exponential",
"MaxDelay": 3600000000000,
"Unlimited": true
},
"Affinities": null,
"Spreads": null,
"Networks": [
{
"Mode": "",
"Device": "",
"CIDR": "",
"IP": "",
"Hostname": "",
"MBits": 0,
"DNS": null,
"ReservedPorts": [
{
"Label": "http",
"Value": 11000,
"To": 80,
"HostNetwork": "default"
},
{
"Label": "https",
"Value": 1001,
"To": 443,
"HostNetwork": "default"
}
],
"DynamicPorts": null
}
],
"Consul": {
"Namespace": ""
},
"Services": null,
"Volumes": null,
"ShutdownDelay": null,
"StopAfterClientDisconnect": null,
"MaxClientDisconnect": null
}
],
"Update": {
"Stagger": 30000000000,
"MaxParallel": 1,
"HealthCheck": "",
"MinHealthyTime": 0,
"HealthyDeadline": 0,
"ProgressDeadline": 0,
"AutoRevert": false,
"AutoPromote": false,
"Canary": 0
},
"Multiregion": null,
"Periodic": null,
"ParameterizedJob": null,
"Dispatched": false,
"DispatchIdempotencyToken": "",
"Payload": null,
"Meta": null,
"ConsulToken": "",
"ConsulNamespace": "",
"VaultToken": "",
"VaultNamespace": "",
"NomadTokenID": "",
"Status": "running",
"StatusDescription": "",
"Stable": true,
"Version": 3,
"SubmitTime": 1680895667926135300,
"CreateIndex": 105,
"ModifyIndex": 164527,
"JobModifyIndex": 153912
}
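For anyone reading the spec above: the duration fields (`Stagger`, `MinHealthyTime`, `Delay`, `KillTimeout`, and so on) are integer nanoseconds, and `SubmitTime` is nanoseconds since the Unix epoch. The sketch below is only a reading aid for those values; the helper names `ns_to_timedelta` and `ns_epoch_to_utc` are made up for illustration and are not part of Nomad or Portainer.

```python
from datetime import datetime, timedelta, timezone

NANOS_PER_SECOND = 1_000_000_000


def ns_to_timedelta(ns):
    """Convert a nanosecond duration (as found in the job JSON) to a timedelta."""
    # timedelta has microsecond resolution, so sub-microsecond digits are dropped.
    return timedelta(microseconds=ns // 1000)


def ns_epoch_to_utc(ns):
    """Convert nanoseconds since the Unix epoch to a UTC datetime (whole seconds)."""
    # Integer division avoids float rounding on such a large value.
    return datetime.fromtimestamp(ns // NANOS_PER_SECOND, tz=timezone.utc)


# Values copied from the job spec above.
print(ns_to_timedelta(30000000000))      # Update.Stagger
print(ns_to_timedelta(1800000000000))    # RestartPolicy.Interval
print(ns_epoch_to_utc(1680895667926135300).isoformat())  # SubmitTime
```

With these helpers, `Stagger` reads as 30 seconds and the restart `Interval` as 30 minutes, and `SubmitTime` comes out as 19:27:47 UTC on 2023-04-07, which matches the `Submit Date` of 21:27:47+02:00 in the `nomad job status` output.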
@soupdiver
Thank you for the additional information. I believe you may be running into this issue here: https://github.com/portainer/portainer/issues/8369#issuecomment-1404470807
How are you creating your Jobs? The Jobs in the linked issue were created via Terraform. I have an existing internal issue logged for this, and it should be resolved in an upcoming major release.
Thanks!
My jobs are created by simply pasting the job spec into the Nomad web UI.
@soupdiver
Thank you for the additional information.
Your process is:
[1] Create Job in Nomad
[2] View Nomad Jobs in Portainer
Can you check your Nomad Jobs and see if you have any in a dead state? We have an internal request logged to resolve this. If you have a dead Job, Portainer will not display Jobs. The workaround here would be to remove the dead Job from Nomad.
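As a rough illustration of the check described above, the sketch below filters a job list for entries whose `Status` is `dead`. It assumes the list has the shape returned by Nomad's job listing (objects with `ID` and `Status` fields); the function name `dead_jobs` and the sample data are hypothetical.

```python
def dead_jobs(jobs):
    """Return the IDs of jobs whose Status is "dead"."""
    return [job["ID"] for job in jobs if job.get("Status") == "dead"]


# Hypothetical sample data; a real list would come from the Nomad API or CLI.
sample = [
    {"ID": "home-heimdall", "Status": "running"},
    {"ID": "old-batch", "Status": "dead"},
]

print(dead_jobs(sample))
```

Any job ID this returns would be a candidate for removal from Nomad under the workaround above.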
In the interim, I can deploy a Nomad instance and test your Job on my end. I will update you as I learn more.
Thanks!
Thanks!
Can you check your Nomad Jobs and see if you have any in a dead state?
Nope, the jobs were healthy and running. They also had allocations and everything. The jobs were doing their jobs :)
Bug description: I set up an Edge Agent to connect to my Nomad cluster. The setup looks fine: I can see the Nomad cluster under environments and can open its dashboard. But when I click "Nomad Jobs", I get errors from the API.
{"message":"Unable to list allocations","details":"failed to get the latest deployment for job home-heimdall in namespace default"}
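To narrow down whether the failure originates in Nomad or in Portainer, one option is to ask Nomad directly for the job's most recent deployment. The sketch below is a minimal example using Nomad's `/v1/job/:job_id/deployment` HTTP endpoint. The address `http://127.0.0.1:4646` is an assumption (Nomad's default local address), the function names are made up for illustration, and ACL tokens are not handled. Whether this is the same call Portainer makes internally is not confirmed in this thread.

```python
import json
import urllib.parse
import urllib.request

NOMAD_ADDR = "http://127.0.0.1:4646"  # assumption: default local Nomad address


def latest_deployment_url(job_id, namespace="default", base=NOMAD_ADDR):
    """Build the URL for a job's most recent deployment."""
    path = "/v1/job/{}/deployment".format(urllib.parse.quote(job_id, safe=""))
    query = urllib.parse.urlencode({"namespace": namespace})
    return "{}{}?{}".format(base.rstrip("/"), path, query)


def fetch_latest_deployment(job_id, namespace="default", base=NOMAD_ADDR):
    """Fetch and decode the response; the body may be empty (None) if the job has no deployment."""
    url = latest_deployment_url(job_id, namespace, base)
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


print(latest_deployment_url("home-heimdall"))
```

Calling `fetch_latest_deployment("home-heimdall")` against the cluster and inspecting the result would show what Nomad itself reports for the job named in the error.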
It complains about one job after another: once I stop and purge the job named in the error, the next job throws the same error.
Expected behavior: No errors.
Portainer Logs: There are some errors in the logs, but I think they are unrelated.
Steps to reproduce the issue:
Technical details:
Portainer Business Edition 2.17.1
linux
v1.5.3
docker run -p 9443:9443 portainer/portainer