canonical / testflinger

https://testflinger.readthedocs.io/en/latest/
GNU General Public License v3.0

Agent page on testflinger.c.c does not tell me what job is being run when I look at an agent's page #307

Open bladernr opened 2 months ago

bladernr commented 2 months ago

Screenshot from 2024-07-10 16-45-26

I kicked off a job against a machine, and while the agent page does indeed tell me it's currently in a provision state, the polling output on my local console shows nothing at all:

$ testflinger poll 5e7990d4-cb14-494f-861f-57578ab25c32
This job is waiting on a node to become available.
Jobs ahead in queue: 0

So I had hoped this would tell me exactly what job is currently being executed, so I could quickly tell if my job is the one provisioning, or if it is some other job that just happened to get in before me.

I had to search to find it in the jobs page instead and it did turn out that a job I had launched and thought I had cancelled was still running (I didn't cancel it, after all, it seems).

Interestingly, after this I DID cancel it using testflinger cancel <JOB ID>, and THEN the job that was blocking me showed up in the provision history:

Screenshot from 2024-07-10 16-53-23

So I guess the resolution here would be some indicator on the agent details that says something like:

Currently running job: JOB_ID

Bonus points if the job ID could be made clickable to show the job itself (as you get when running testflinger show <JOB ID>).

syncronize-issues-to-jira[bot] commented 2 months ago

Thank you for reporting us your feedback!

The internal ticket has been created: https://warthogs.atlassian.net/browse/CERTTF-357.

This message was autogenerated

plars commented 2 months ago

I'm open to ideas, but this was something I considered in the past and intentionally avoided, because we don't currently have any way to restrict who can cancel a job without adding a whole lot more pieces. While it looks like everyone mostly plays nice and avoids touching jobs that aren't theirs, putting the job ID in the output there makes it just a cut/paste away from canceling someone else's job that's ahead of you in the queue.

On the other hand - if you are polling output then you will already start seeing output when your job starts. If there are 0 jobs ahead of it, then it should be soon, depending on how long the existing job takes and the possibility of other queues that could have something in them.

Also, the other job showed up in the provision history later because it doesn't have the information to put in the provision history until it's done provisioning. It does push that to the server as soon as provisioning is done though, rather than waiting until the end of the job.

bladernr commented 2 months ago

That's reasonable, but I'd also mention that anyone can just go to the Jobs page, find the jobs currently in queue for a given node, and also copy/paste/cancel already.

Maybe this can just be kept in mind for a stage when TF can at least tell you WHO submitted a job. That would be helpful:

Job XYZ is running, submitted by $SUBMITTER_ID at $TIMESTAMP

Actually, $SUBMITTER_ID could even, for now, be pretty easy to provide by just adding a required field to the job yaml:

submitter: <launchpad ID>

or

submitter: Cert Github Runner

or

submitter: Some Team's Jenkins
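To make the suggestion concrete, here is a minimal sketch of how a required submitter field could be checked and then used to build the status line proposed above. It is only an illustration under stated assumptions: Testflinger has no submitter field today, and validate_job, running_job_line, and REQUIRED_FIELDS are hypothetical names, not part of the Testflinger code base. The job is represented as an already-parsed dict so the sketch needs nothing beyond the standard library.

```python
from datetime import datetime, timezone

# Hypothetical: "submitter" is the field proposed in this thread, not an
# existing Testflinger job field. "job_queue" is included as an example of
# a field a job definition is already expected to carry.
REQUIRED_FIELDS = ("job_queue", "submitter")


def validate_job(job: dict) -> list[str]:
    """Return a list of problems; an empty list means the job is acceptable."""
    problems = []
    for field in REQUIRED_FIELDS:
        value = job.get(field)
        # Treat a missing, non-string, or blank value as "not provided".
        if not isinstance(value, str) or not value.strip():
            problems.append(f"missing required field: {field}")
    return problems


def running_job_line(job_id: str, job: dict, submitted_at: datetime) -> str:
    """Render the 'Job XYZ is running, submitted by ...' line for display."""
    stamp = submitted_at.astimezone(timezone.utc).strftime("%Y-%m-%d %H:%M:%S UTC")
    return f"Job {job_id} is running, submitted by {job['submitter']} at {stamp}"


# Example job, as it might look after the yaml has been parsed.
job = {"job_queue": "example-queue", "submitter": "Cert Github Runner"}
```

A job that passes validate_job(job) with an empty result could then be displayed with running_job_line("XYZ", job, some_timestamp). Because the field is free text, it identifies the submitter only by convention and does nothing to restrict who can cancel a job.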

plars commented 2 months ago

I think I may have misunderstood. For some reason I was thinking you meant from the CLI, but I had already been thinking about putting it on the web UI, which I think is much nicer. Here's a current mockup of what I was thinking. Let me know what you think: image