eclipse-openj9 / openj9

Eclipse OpenJ9: A Java Virtual Machine for OpenJDK that's optimized for small footprint, fast start-up, and high throughput. Builds on Eclipse OMR (https://github.com/eclipse/omr) and combines with the Extensions for OpenJDK for OpenJ9 repo.

Jenkins child jobs not stopped by interrupt signal #20396

Open pshipton opened 1 month ago

pshipton commented 1 month ago

Carrying on the discussion from https://github.com/eclipse-openj9/openj9/issues/20395#issuecomment-2430033776

@JasonFengJ9 commented: So "Click here to forcibly terminate running steps" doesn't abort the child jobs. Do we have to log in to the machine manually to kill the processes?

github-actions[bot] commented 1 month ago

Issue Number: 20396
Status: Open
Recommended Components: comp:build, comp:vm, comp:test
Recommended Assignees: adambrousseau, babsingh, jasonfengj9

pshipton commented 1 month ago

It depends on what the child job is doing. It's a known issue with setting ITERATIONS: an interrupt signal only stops the current iteration, and the job then continues with the next iteration.

The first time you interrupt, the parent signals all the child jobs and waits for them to finish. Using "Click here to forcibly terminate running steps" may send another interrupt to the child jobs, and then it forcibly kills the parent, so the parent is no longer waiting for them to finish.

JasonFengJ9 commented 1 month ago

Okay, what's the reliable way to ensure all child jobs are aborted? Is there a link that does this, so we don't have to log in to the machine manually?

pshipton commented 1 month ago

The only way I know to kill the child jobs is to keep stopping the parent job until all the child iterations are finished, or to open the child jobs individually and do the same. If the children aren't all running the same iteration, continuous stopping may interrupt the cleanup steps at the end of a job.

It's been known behavior for some time; I'm not sure if @llxia has any plans to change it.

llxia commented 1 month ago

I think we can try to check currentBuild.result inside the loop. If it is ABORTED, then break.

@annaibm can you try adding the check inside the loop at https://github.com/adoptium/aqa-tests/blob/2c47e24c512b6419c8b2a01b75ae2d10c4dd9215/buildenv/jenkins/JenkinsfileBase#L766-L787?
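
The check suggested above can be sketched as a small Python model of the iteration loop. This is not the JenkinsfileBase code and does not call any Jenkins API: the Build class is a hypothetical stand-in for currentBuild, and it assumes that an interrupt leaves the build result set to ABORTED by the time the loop reaches its next iteration, which would need to be confirmed in a real pipeline.

```python
from dataclasses import dataclass, field


@dataclass
class Build:
    """Hypothetical stand-in for Jenkins' currentBuild; only `result` is modeled."""
    result: str = "SUCCESS"
    log: list = field(default_factory=list)


def make_runner(interrupt_at):
    """Return an iteration body that is 'interrupted' on the given iteration."""
    def run_one(build, i):
        if i == interrupt_at:
            # Assumption: the interrupt stops this iteration and marks the build ABORTED.
            build.result = "ABORTED"
            build.log.append(f"iteration {i} interrupted")
        else:
            build.log.append(f"iteration {i} ran")
    return run_one


def run_iterations(build, iterations, run_one, check_abort):
    """Run the ITERATIONS loop, optionally breaking once the build is ABORTED."""
    for i in range(1, iterations + 1):
        if check_abort and build.result == "ABORTED":
            build.log.append(f"stopped before iteration {i}")
            break
        run_one(build, i)
    return build.log


# Without the check, the loop carries on after the interrupted iteration.
without_check = run_iterations(Build(), 4, make_runner(interrupt_at=2), check_abort=False)

# With the check, the loop breaks before starting the next iteration.
with_check = run_iterations(Build(), 4, make_runner(interrupt_at=2), check_abort=True)
```

In this model, without_check still records iterations 3 and 4 after the interrupt, which matches the behavior described earlier in the thread, while with_check stops before iteration 3. Any cleanup steps that follow the loop would still run after the break.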

llxia commented 1 month ago