Fix bad interactions between timeouts and build retries

Merge Checklist

All boxes should be checked before merging the PR (just tick any boxes which don't apply to this PR)

[x] The toolchain has been rebuilt successfully (or no changes were made to it)
[x] The toolchain/worker package manifests are up-to-date
[x] Any updated packages successfully build (or no packages were changed)
[x] Packages depending on static components modified in this PR (Golang, *-static subpackages, etc.) have had their Release tag incremented.
[x] Package tests (%check section) have been verified with RUN_CHECK=y for existing SPEC files, or added to new SPEC files
[x] All package sources are available
[x] cgmanifest files are up-to-date and sorted (./cgmanifest.json, ./toolkit/scripts/toolchain/cgmanifest.json, .github/workflows/cgmanifest.json)
[x] LICENSE-MAP files are up-to-date (./LICENSES-AND-NOTICES/SPECS/data/licenses.json, ./LICENSES-AND-NOTICES/SPECS/LICENSES-MAP.md, ./LICENSES-AND-NOTICES/SPECS/LICENSE-EXCEPTIONS.PHOTON)
[x] All source files have up-to-date hashes in the *.signatures.json files
[x] sudo make go-tidy-all and sudo make go-test-coverage pass
[x] Documentation has been updated to match any changes to the build system
[x] Ready to merge

Summary

When we queue a package to build (or test), we set a timeout (by default 8h). If the build has not finished by then we forcibly stop the build and mark it as failed.

We also support PACKAGE_BUILD_RETRIES and CHECK_BUILD_RETRIES, which will cause failed builds to re-run.

However, the timeout was reset each time a retry was triggered. In the buddy builds, for example, a stuck package test could take 4 x 8h = 32h, which would exceed the pipeline time limits. We want to exit gracefully with an error state so that we can generate and publish logs correctly. If the pipeline enforces the timeout instead, the failure can be difficult to debug.

Instead of resetting the timeout with each retry, have all attempts share a single timeout. If the timeout is exceeded, stop retrying. This uses RunWithLinearBackoff(), which takes a ctx configured with a timeout, so we can break out early.
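To illustrate the shared-timeout behavior, here is a minimal sketch. It is not the toolkit's RunWithLinearBackoff(); the names retryWithSharedTimeout and simulateStuckBuild, their parameters, and the millisecond durations are all hypothetical and chosen only for this example. The point is that a single context deadline covers every attempt, so a stuck build stops retrying once the deadline passes.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// retryWithSharedTimeout is a hypothetical helper (an assumption, not the toolkit API).
// It calls attempt up to maxAttempts times, waits backoff*n after the n-th failure,
// and stops early once ctx is done. It returns the number of attempts started.
func retryWithSharedTimeout(ctx context.Context, maxAttempts int, backoff time.Duration,
	attempt func(ctx context.Context) error) (int, error) {
	for n := 1; n <= maxAttempts; n++ {
		err := attempt(ctx)
		if err == nil {
			return n, nil
		}
		if n == maxAttempts {
			return n, err
		}
		// The deadline is shared by all attempts: once it passes, do not retry.
		if ctx.Err() != nil {
			return n, fmt.Errorf("giving up after %d attempt(s): %w", n, ctx.Err())
		}
		select {
		case <-ctx.Done():
			return n, fmt.Errorf("giving up after %d attempt(s): %w", n, ctx.Err())
		case <-time.After(backoff * time.Duration(n)):
		}
	}
	return 0, errors.New("maxAttempts must be at least 1")
}

// simulateStuckBuild runs an always-failing build that takes attemptMs per try,
// under one timeout of timeoutMs shared by all tries.
func simulateStuckBuild(timeoutMs, attemptMs, maxAttempts int) (attempts int, timedOut bool) {
	ctx, cancel := context.WithTimeout(context.Background(), time.Duration(timeoutMs)*time.Millisecond)
	defer cancel()

	stuckBuild := func(ctx context.Context) error {
		select {
		case <-ctx.Done():
			return ctx.Err() // forcibly stopped by the shared timeout
		case <-time.After(time.Duration(attemptMs) * time.Millisecond):
			return errors.New("build failed")
		}
	}

	attempts, err := retryWithSharedTimeout(ctx, maxAttempts, time.Millisecond, stuckBuild)
	return attempts, errors.Is(err, context.DeadlineExceeded)
}

func main() {
	// Four 120ms attempts would need 480ms in total; the shared 200ms timeout cuts that short.
	attempts, timedOut := simulateStuckBuild(200, 120, 4)
	fmt.Printf("attempts started: %d, timed out: %v\n", attempts, timedOut)
}
```

With a per-attempt timeout the worst case grows with the retry count. With a shared deadline it is bounded by the single timeout value regardless of how many retries are configured.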

As part of this fix, I also noticed that the timeout handling was not cleaning up the build chroot correctly. We should not be using anything related to panic() for error handling. Instead, use logger.Log.Fatal*(), which gives the logging library a chance to run its registered cleanup functions (i.e. the final chroot cleanup) before exiting "gracefully".
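The idea of a fatal-log path that runs cleanup before exiting can be sketched as follows. This is not the toolkit's logger; the exitHooks type, its Register and Fatalf methods, and the injectable exit function are hypothetical and exist only to show the pattern of running registered cleanup functions before the process terminates.

```go
package main

import (
	"fmt"
	"os"
)

// exitHooks is a hypothetical stand-in for a logging library that keeps
// a list of cleanup functions to run before a fatal exit.
type exitHooks struct {
	hooks []func()
	exit  func(code int) // os.Exit in a real program; replaceable for demos and tests
}

// Register adds a cleanup function to run on a fatal exit.
func (e *exitHooks) Register(hook func()) {
	e.hooks = append(e.hooks, hook)
}

// Fatalf prints the message, runs every registered hook in registration order,
// and then calls the exit function with code 1.
func (e *exitHooks) Fatalf(format string, args ...any) {
	fmt.Fprintln(os.Stderr, "FATAL:", fmt.Sprintf(format, args...))
	for _, hook := range e.hooks {
		hook()
	}
	e.exit(1)
}

func main() {
	// The demo records the exit code instead of terminating the process.
	log := &exitHooks{exit: func(code int) { fmt.Println("would exit with code", code) }}
	log.Register(func() { fmt.Println("cleaning up build chroot") })
	log.Register(func() { fmt.Println("stopping gpg-agent") })
	log.Fatalf("package build timed out after %s", "8h")
}
```

Routing the fatal error through one function that owns the cleanup list means the cleanup does not depend on which goroutine or call stack hit the error.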

Change Log

Package build timeout is now shared by all retry attempts. Each invocation of BuildAgent.BuildPackage() now takes a time.Duration instead of using the value from BuildAgentConfig.
Properly clean up build chroot on timeout
- Handle the timeout logic inside chroot.Run so we correctly exit the chroot before leaving the function. Otherwise the chroot cleanup code will run from within the chroot itself and the paths will be wrong.
- Add a new StopAllChildProcesses(), which is like PermanentlyStopAllChildProcesses() but does not set the disable flag (so we can still run the gpg-agent cleanup on exit).

Does this affect the toolchain?

Associated issues

https://microsoft.visualstudio.com/OS/_workitems/edit/53979716

Test Methodology

(Added custom %check to words with sleep 9h)

https://dev.azure.com/mariner-org/mariner/_build/results?buildId=642266&view=results
