Kamilcuk / nomad-tools

Set of tools and utilities to ease interacting with HashiCorp Nomad scheduling solution.
https://github.com/Kamilcuk/nomad-tools
GNU General Public License v3.0

unable to find uuid when running inside of docker #1

Closed foozmeat closed 8 months ago

foozmeat commented 8 months ago

I'm trying to run nomad-watch from CI. We use a private gitlab-runner, and the job runs inside a Docker container. The same job worked with plain nomad. The error output is below. When I run the same job file locally with nomad-watch, it seems to work. The job in question is a periodic job scheduled to run every 4 hours.

$ /usr/local/bin/nomad-watch run job.nomad
nomad-watch>677> INFO + nomad job run -detach -verbose job.nomad
nomad-watch>685> INFO Job registration successful
nomad-watch>685> INFO Approximate next launch time: 2023-11-06T20:00:00-08:00 (3h25m4s from now)
Traceback (most recent call last):
  File "/usr/local/bin/nomad-watch", line 8, in <module>
    sys.exit(nomad_watch.cli())
             ^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/nomad_tools/nomad_watch.py", line 1526, in mode_run
    evaluation = nomad_start_job(cmd)
                 ^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/nomad_tools/nomad_watch.py", line 698, in nomad_start_job
    evalid = __nomad_job_run(opts)
             ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/nomad_tools/nomad_watch.py", line 691, in __nomad_job_run
    assert len(founduuids) == 1, f"Could not find uuid in nomad job run output"
           ^^^^^^^^^^^^^^^^^^^^
AssertionError: Could not find uuid in nomad job run output

Uploading artifacts for failed job

foozmeat commented 8 months ago

Correction: when I run this particular job file locally, I get the same error. Perhaps because it's periodic?

Kamilcuk commented 8 months ago

Perhaps because it's periodic?

Hi. Definitely. I have not tested anything with periodic or parameterized jobs; I rarely use them, if ever. They do not start immediately, so there is nothing to "watch". I wonder what behavior I would want from this tool in such a case. I guess it should wait until the job is started - so wait for 3h25m4s and then watch the job? Is this the behavior you expected?
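
To illustrate why the assertion fires, here is a minimal sketch of the UUID lookup, not the actual nomad-tools code. The names `UUID_RE` and `find_eval_id` are hypothetical, and the "Evaluation ID" line with its UUID is an assumed sample of non-periodic output. The periodic sample is copied from the log above: it contains no UUID, so a strict "exactly one UUID" check fails.

```python
import re
from typing import Optional

# Matches a canonical 8-4-4-4-12 hexadecimal UUID.
UUID_RE = re.compile(
    r"\b[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}"
    r"-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}\b"
)


def find_eval_id(output: str) -> Optional[str]:
    """Return the single UUID found in the output, or None if there is none.

    Hypothetical helper: returning None lets the caller treat
    "registered, but nothing started yet" as a normal outcome
    instead of failing an assertion.
    """
    found = set(UUID_RE.findall(output))
    if not found:
        return None
    if len(found) > 1:
        raise ValueError(f"expected one UUID, found {len(found)}")
    return found.pop()


# Output of the periodic job, taken from the log in this issue.
periodic_output = (
    "Job registration successful\n"
    "Approximate next launch time: 2023-11-06T20:00:00-08:00 (3h25m4s from now)\n"
)

# Assumed sample for a non-periodic job; the UUID is made up.
regular_output = (
    "Job registration successful\n"
    "Evaluation ID: 8a1b2c3d-1111-2222-3333-444455556666\n"
)

print(find_eval_id(periodic_output))
print(find_eval_id(regular_output))
```

With a helper like this, the caller could branch on `None` and either exit cleanly after registration or wait for the next launch, depending on which behavior is chosen.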

foozmeat commented 8 months ago

It's a good point: what does it mean to watch a job that may not start right away? Perhaps this can be closed, since periodic jobs seem to fall outside the scope of what the tool is trying to do?