Closed EugenKon closed 1 year ago
Hi @EugenKon! Your reproduction doesn't say where you're seeing the error, and the reproduction steps all look successful. Is this in response to a specific Nomad command after you've done that?
@tgross I got that error here:
Also it would be nice to see:
processing template destination "secrets/file.env"
Success
processing template destination "local/script.sh"
Success
Running command = "/bin/bash" args = ["local/script.sh"]
...
Or something similar. This will allow easy debugging.
I noticed the same error when I run this on my local host:
nomad alloc exec -task postgres-task fake_ID ls -la /backup/
Error querying allocation: Unexpected response code: 500 (alloc lookup failed: index error: Invalid UUID: encoding/hex: invalid byte: U+006B 'k')
So the error message belongs to the nomad alloc exec -task postgres-task $DB_ALLOC_ID bash -c 'is_ready 120; backup' part.
Weird. Error message depends on which ID I provide:
$ nomad alloc exec -task postgres-task qw ls -la /backup/
Error querying allocation: Unexpected response code: 500 (alloc lookup failed: index error: Invalid UUID: encoding/hex: invalid byte: U+0071 'q')
$ nomad alloc exec -task postgres-task ew ls -la /backup/
Error querying allocation: Unexpected response code: 500 (alloc lookup failed: index error: Invalid UUID: encoding/hex: invalid byte: U+0077 'w')
And a couple of meaningful responses:
$ nomad alloc exec -task postgres-task ab ls -la /backup/
No allocation(s) with prefix or id "ab" found
$ nomad alloc exec -task postgres-task ac ls -la /backup/
No allocation(s) with prefix or id "ac" found
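The changing byte in these messages is consistent with how Go's encoding/hex package reports the first non-hex character it meets. The sketch below is only an illustration under the assumption that the server hex-decodes the ID prefix (suggested by the error text, not confirmed in this thread); decodeErr is a hypothetical helper, not part of Nomad.

```go
package main

import (
	"encoding/hex"
	"fmt"
)

// decodeErr is a hypothetical helper: it hex-decodes the prefix and
// returns the error text, or an empty string when decoding succeeds.
func decodeErr(prefix string) string {
	if _, err := hex.DecodeString(prefix); err != nil {
		return err.Error()
	}
	return ""
}

func main() {
	// The same prefixes that were tried on the command line above.
	for _, p := range []string{"qw", "ew", "ab", "ac"} {
		fmt.Printf("%q -> %q\n", p, decodeErr(p))
	}
}
```

With these inputs the decoder should complain about 'q' for qw and about 'w' for ew, because e is a valid hex digit and decoding only stops at the first invalid byte. ab and ac decode cleanly, which would explain why they reach the actual lookup.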
Further changes to the ID led me to a different error:
$ nomad alloc exec -task postgres-task ae ls -la /backup/
Could not find task named: postgres-task, found:
postgres-backup
Weird =) I specified the task but still got an error. It only starts to work when I remove the -job flag. But I think this is a different issue.
UPD: I added the Nomad server logs (journalctl -xeu nomad) to the first message.
Ok this log line tells us why you're seeing the error in the command line.
http: request failed: method=GET path=/v1/allocations?prefix=ax
The nomad alloc exec command docs show this usage:
nomad alloc exec [options] <allocation> <command> [<args>...]
When you do a command like: nomad alloc exec -task postgres-task qw ls -la /backup/
you're asking for an allocation ID that starts with qw. Allocation IDs are always UUIDs, so you're giving the command something that will never succeed in looking up correctly.
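To make the prefix semantics concrete, here is a minimal sketch of prefix matching over a list of IDs. It is not Nomad's implementation; matchPrefix and the sample IDs are made up for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// matchPrefix is a hypothetical helper that returns every ID in ids
// that starts with the given prefix.
func matchPrefix(ids []string, prefix string) []string {
	var out []string
	for _, id := range ids {
		if strings.HasPrefix(id, prefix) {
			out = append(out, id)
		}
	}
	return out
}

func main() {
	// Made-up allocation IDs, for illustration only.
	ids := []string{
		"ab3f9c1e-0000-4000-8000-000000000001",
		"ae77d2b4-0000-4000-8000-000000000002",
	}
	fmt.Println(matchPrefix(ids, "ae"))      // a hex prefix can match
	fmt.Println(len(matchPrefix(ids, "qw"))) // a non-hex prefix never can
}
```

Because a UUID contains only hex digits and dashes, a prefix such as qw cannot match any allocation, whatever the cluster state.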
But your initial report was about the UI, which should of course be sending a real allocation ID here to this API. Unfortunately I can't seem to reproduce that behavior here. Is it possible for you to grab the actual request being made in the browser developer tools?
> Is it possible for you to grab the actual request being made in the browser developer tools?
How does a browser relate to the nomad alloc exec command? I am not sure what you mean.
Oh I'm sorry, I was definitely misunderstanding what you were seeing in the browser. The browser is showing you the logs from your allocation, which in turn is calling nomad alloc exec.
So the problem is in the query you're making to populate the DB_ALLOC_ID environment variable:
template {
  env         = true
  destination = "secrets/file.env"
  data        = <<-EOH
    # here you also might want to set NOMAD_TOKEN env
    # if you're using ACL capabilities
    # As service 'postgres-node' is registered in Consul, we want to grab its 'alloc' tag
    {{- range $tag, $services := service "postgres-node" | byTag -}}
    {{if $tag | contains "alloc"}}
    {{$allocId := index ($tag | split "=") 1}}
    DB_ALLOC_ID="{{ $allocId }}"
    {{end}}
    {{end}}
  EOH
}
Are all the tags actually allocation UUIDs?
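One way to answer that question is to check each tag before using it. The sketch below mirrors the template's "split on = and take the second part" logic and then tests whether the value has the shape of a full UUID. allocIDFromTag is a hypothetical helper, and requiring the key to be exactly alloc is an assumption that is stricter than the template's contains "alloc" test.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// uuidRe matches the canonical lowercase 8-4-4-4-12 UUID layout.
var uuidRe = regexp.MustCompile(
	`^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$`)

// allocIDFromTag is a hypothetical helper. For a tag shaped like
// "alloc=<id>" it returns the value and whether it looks like a UUID.
// Any other tag yields an empty string and false.
func allocIDFromTag(tag string) (string, bool) {
	key, value, found := strings.Cut(tag, "=")
	if !found || key != "alloc" {
		return "", false
	}
	return value, uuidRe.MatchString(value)
}

func main() {
	// Made-up tags, for illustration only.
	tags := []string{
		"alloc=8ba85cef-22bc-4b3a-a4a9-0123456789ab",
		"alloc=postgres",
		"primary",
	}
	for _, t := range tags {
		id, ok := allocIDFromTag(t)
		fmt.Printf("tag=%q id=%q looksLikeUUID=%v\n", t, id, ok)
	}
}
```

A value that fails this check would be passed to nomad alloc exec as-is by the template above, which is where a non-hex character could surface as the Invalid UUID error.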
Sorry, I probably explained this poorly. I know that there is a problem with alloc; I did not set it up intentionally. As I found later, the problem is with nomad alloc exec. I do not expect an 'Invalid UUID: encoding/hex: invalid byte:' error here at all.
While investigating this problem I also found that the service that nomad alloc exec queries fails with a '500 error' or with Could not find task named: postgres-task, found: postgres-backup
The only expected message here is: No allocation(s) with prefix or id "ab" found
where ab is the ID of the allocation. For my other queries with broken or nonexistent IDs such as fake_ID and ae, the service should not die and must return a correct error message: No allocation(s) with prefix or id "fake_ID" found
So for now the correct issue message should be: Service should not die with a '500 internal service error' message when a broken search string is passed. And it would be nice if Nomad returned more debugging info to simplify debugging when such error messages occur.
> So for now the correct issue message should be: Service should not die with a '500 internal service error' message when a broken search string is passed. And it would be nice if Nomad returned more debugging info to simplify debugging when such error messages occur.
It's a validation error, and we should be returning 400 on that.
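As a rough illustration of that idea, the sketch below validates the prefix query parameter up front and answers with 400 instead of letting a lookup failure surface as 500. It is not Nomad's code: validPrefix, statusForPrefix, and handler are hypothetical names, and the accepted character set is an assumption based on the UUID format.

```go
package main

import (
	"fmt"
	"net/http"
)

// validPrefix is a hypothetical check: a UUID prefix may contain only
// hex digits and dashes and cannot be longer than a full UUID (36).
// An empty prefix is treated as "no filter" and accepted.
func validPrefix(prefix string) bool {
	if len(prefix) > 36 {
		return false
	}
	for _, r := range prefix {
		isHex := (r >= '0' && r <= '9') ||
			(r >= 'a' && r <= 'f') || (r >= 'A' && r <= 'F')
		if !isHex && r != '-' {
			return false
		}
	}
	return true
}

// statusForPrefix maps a bad prefix to 400 rather than 500.
func statusForPrefix(prefix string) int {
	if !validPrefix(prefix) {
		return http.StatusBadRequest
	}
	return http.StatusOK
}

// handler shows where the check could sit in an HTTP endpoint.
func handler(w http.ResponseWriter, r *http.Request) {
	prefix := r.URL.Query().Get("prefix")
	if code := statusForPrefix(prefix); code != http.StatusOK {
		http.Error(w, fmt.Sprintf("invalid allocation ID prefix %q", prefix), code)
		return
	}
	fmt.Fprintln(w, "[]") // the real lookup is out of scope for this sketch
}

func main() {
	for _, p := range []string{"qw", "fake_ID", "ab"} {
		fmt.Println(p, statusForPrefix(p))
	}
}
```

Rejecting the input before the lookup keeps the failure on the client side of the status codes, which matches the point that this is a validation error rather than a server fault.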
Nomad version
Operating system and Environment details
Issue
Reproduction steps
Expected Result
No error message
Actual Result
index error: Invalid UUID: encoding/hex: invalid byte: U+0073 's')
I cannot figure out where it comes from. Please note, I have specified the service, but I do not have the alloc=xxxx tag.
Job file (if appropriate)
Nomad Server logs (if appropriate)
No logs available via web interface:
Nomad Client logs (if appropriate)