AzBuilder / terrakube

Open source IaC Automation and Collaboration Software.
https://docs.terrakube.io
Apache License 2.0

job pending state #830

Open gdacal opened 4 months ago

gdacal commented 4 months ago

Bug description 🐞

A job was left in 'apply waiting for approval' yesterday. The job then failed, but it remained in a pending state indefinitely. I tried to cancel it, but the button did not trigger any action, so I searched the web for information. I then installed the psql client to connect to the Terrakube database and reviewed the public.job table to check the status of job 246.

sql
SELECT * FROM public.job WHERE id = 246;

The job was indeed in a pending state, so I changed its status to 'cancelled' with the following query:

sql
UPDATE public.job
SET status = 'cancelled'
WHERE id = 246 AND status = 'pending';

Steps to reproduce


Expected behavior

No response

Example repository

No response

Anything else?

No response

alfespa17 commented 4 months ago

There is a validation that marks every job that has been running for more than 6 hours as "failed", in this part of the code:

https://github.com/AzBuilder/terrakube/blob/398bcb5233e79d888f2f803e2ab372164705a949/api/src/main/java/org/terrakube/api/plugin/scheduler/ScheduleJob.java#L67
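
As a rough illustration of the kind of check described above, here is a minimal sketch. It is not the actual `ScheduleJob` code: the class `StaleJobCheck`, the method `isStale`, and the constant name are hypothetical, and only the 6-hour limit comes from the comment.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical helper for illustration only; not taken from Terrakube.
final class StaleJobCheck {
    // The 6-hour limit mentioned in the comment above.
    static final Duration MAX_RUNNING_TIME = Duration.ofHours(6);

    // Returns true when more than MAX_RUNNING_TIME has passed since createdAt.
    static boolean isStale(Instant createdAt, Instant now) {
        return Duration.between(createdAt, now).compareTo(MAX_RUNNING_TIME) > 0;
    }

    public static void main(String[] args) {
        Instant created = Instant.parse("2024-01-01T00:00:00Z");
        // 5 hours after creation: still within the limit.
        System.out.println(isStale(created, created.plus(Duration.ofHours(5))));  // false
        // 24 hours after creation: past the limit, so the job would be marked failed.
        System.out.println(isStale(created, created.plus(Duration.ofHours(24)))); // true
    }
}
```

Both arguments are `Instant` values, so the comparison does not depend on any time zone setting. A check like this only behaves as expected if the stored creation time and the current time refer to the same timeline.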

There is also a validation that checks for previous jobs in certain specific statuses:

https://github.com/AzBuilder/terrakube/blob/398bcb5233e79d888f2f803e2ab372164705a949/api/src/main/java/org/terrakube/api/plugin/scheduler/ScheduleJob.java#L86

I'm not really sure why your job was still in a pending state after 24 hours.

stanleyz commented 1 month ago

@alfespa17 it's likely that the creation time is set from the UI using the time zone of the user's session, whereas the expiry time is calculated on the server, which might be configured to use UTC.

I am also experiencing this issue in a GMT+12 time zone, and I wonder whether the token follows a similar calculation method.
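
The time zone hypothesis can be illustrated with a small sketch. It assumes, without confirmation from the Terrakube source, that the creation time is stored as a zone-less wall-clock value in the user's GMT+12 zone and later read by the server as UTC. The class `TimezoneSkewDemo` and the method `apparentElapsed` are hypothetical names.

```java
import java.time.Duration;
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneOffset;

// Hypothetical demo of the suspected mismatch; not Terrakube code.
final class TimezoneSkewDemo {
    static final Duration MAX_RUNNING_TIME = Duration.ofHours(6);

    // Elapsed time as seen by a server that interprets a zone-less timestamp as UTC.
    static Duration apparentElapsed(LocalDateTime storedCreateTime, Instant now) {
        Instant assumedUtc = storedCreateTime.toInstant(ZoneOffset.UTC);
        return Duration.between(assumedUtc, now);
    }

    public static void main(String[] args) {
        ZoneOffset userOffset = ZoneOffset.ofHours(12); // GMT+12 user session (assumption)
        Instant actualCreation = Instant.parse("2024-01-01T00:00:00Z");

        // Wall-clock time in the user's zone, stored without zone information.
        LocalDateTime stored = LocalDateTime.ofInstant(actualCreation, userOffset);

        // Ten real hours later the job is past the 6-hour limit...
        Instant now = actualCreation.plus(Duration.ofHours(10));
        Duration apparent = apparentElapsed(stored, now);

        // ...but the server sees a creation time 12 hours ahead of the real one.
        System.out.println(apparent.toHours());                        // -2
        System.out.println(apparent.compareTo(MAX_RUNNING_TIME) > 0);  // false
    }
}
```

Under these assumptions, a job created from a GMT+12 session would only cross the 6-hour limit after 18 real hours. If this is the cause, storing and comparing both timestamps as UTC instants would remove the skew.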