game-ci / versioning-backend

Stateful backend to keep track of unity versions and docker build queues
MIT License

Improve the job retry logic #40

Closed GabLeRoux closed 2 years ago

GabLeRoux commented 2 years ago

Recent Unity versions are no longer being published to Docker Hub because of our retry logic. Right now, Windows-based images are failing due to https://github.com/game-ci/docker/issues/154, and we are reaching the retry limit for a given Unity version, which prevents other working targets from being published.

https://game.ci/docs/docker/versions

Here are the currently failed images:

image

It is still unclear what is going on with the retry logic: for some recent versions at least the Ubuntu-based images built successfully, but for others they didn't.

Example of success: editor-2021.2.9f1-0.17.0

This version doesn't seem to have hit the current issue.

image

Example of failure: editor-2021.2.11f1-0.17.0

This specific version seems to have hit the retry logic problem.

image

webbertakken commented 2 years ago

Will have a look at this tonight. But I don't think there's anything wrong with the retry logic. It's just a commit that landed after the hub was already published for that version.

Ultimately we could make the workflow check out the tagged version of the repository instead of main, but that would bring a few more challenges when it comes to applying fixes to the build flow.

uchar commented 2 years ago

@GabLeRoux @webbertakken The Unity 2020.3.28 and Unity 2020.3.29 images are not available. Would you please fix the problem?

webbertakken commented 2 years ago

I'll have time for this next weekend.

GabLeRoux commented 2 years ago

An update: the versions requested in the comments above were published successfully.

https://game.ci/docs/docker/versions

We identified and fixed the root cause of the failing builds.

The current implementation of the retry logic is probably fine.

Before we close this issue, maybe we should answer this question:
If the Windows-based images of a given version fail to build, will the Ubuntu-based images still be published?

If the answer is yes, then we can safely close this issue :)

webbertakken commented 2 years ago

Consider that Jobs schedule Builds, where a Job represents a dockerRepoVersion-EditorVersion combination and a Build represents a baseOs-targetPlatform combination for a Job.

The way I ended up implementing this is that each Job schedules all Builds, that is, all targetPlatforms across all baseOses for that dockerRepoVersion and unityEditorVersion.

Together they share the same buildQueue and buildFailureCount. The settings currently look like this:

image

The retry logic itself was never really "broken". We just made Windows-based builds also get rescheduled and reported in, in #36 and #37.
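
For readers following along, here is a rough sketch of the Job/Build relationship described above. It is not the backend's actual code: the class names, the `MAX_FAILURES_PER_JOB` value, and the example target platforms are all assumptions made for illustration (the real settings were only shown in the screenshot). It just shows how one failure counter shared by every Build of a Job means repeated failures on one base OS use up the retry budget for the whole Job.

```python
from dataclasses import dataclass, field
from itertools import product

# Assumed value for illustration only; the real limit is not stated in this thread.
MAX_FAILURES_PER_JOB = 3


@dataclass
class Build:
    """One baseOs-targetPlatform combination belonging to a Job."""
    base_os: str
    target_platform: str


@dataclass
class Job:
    """One dockerRepoVersion-EditorVersion combination."""
    docker_repo_version: str
    editor_version: str
    builds: list = field(default_factory=list)
    failure_count: int = 0  # shared by every Build of this Job

    def schedule_all(self, base_oses, target_platforms):
        # Each Job schedules all target platforms across all base OSes.
        self.builds = [
            Build(base_os, target)
            for base_os, target in product(base_oses, target_platforms)
        ]

    def report_failure(self):
        # Any failing Build increments the counter shared by the whole Job.
        self.failure_count += 1

    def can_retry(self, max_failures=MAX_FAILURES_PER_JOB):
        return self.failure_count < max_failures
```

Usage, with hypothetical platform names: `job = Job("0.17.0", "2021.2.11f1")`, then `job.schedule_all(["ubuntu", "windows"], ["base", "android"])` creates four Builds. Calling `job.report_failure()` once per failed build, regardless of which base OS failed, eventually makes `job.can_retry()` return `False` for the Job as a whole.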

GabLeRoux commented 2 years ago

Thanks for the detailed explanation <3 Closing this as I don't think we need to change anything for now :)