Provide a unique, incremental build number for each build

mitchhentges commented 5 years ago

Right now, to uniquely identify produced artifacts for a build, we need to turn to external factors to get a unique identifier, such as:

Use the clock
- downside: two builds starting at the exact same time could end up with the same identifier
- downside: using full seconds (or milliseconds!) from epoch produces suuuper long ids, so we need to turn to fancy algorithms to produce short time-based ids
Use UUIDs
- downside: super long! 128 bits, and in some cases we need to fit in an int32
Use an external service (e.g. a microservice somewhere that atomically provides incremental ids)
- downside: complexity of maintaining an additional microservice
- downside: if multiple build types start having their own microservice, then the lowest-common-denominator is, arguably, taskcluster -> maybe taskcluster should provide that service -> this RFC :)

Feature

For each build, an incremental, unique build number is provided
- Note: Taskcluster is decentralized (-ish?), so this incremental build number would need to be synchronized
- Note: I'm not sure the best way to provide this to workers/build jobs: environment variable? I'm still learning taskcluster, so perhaps there's already a standard for providing information from TC to builds
- Note: my terminology might be wrong, but by "build", I'm referring to a full build (all tasks in a task group would have the same build number injected)

djmitche commented 5 years ago

Taskcluster doesn't really have a notion of "for a task group". Nor does the service itself inject anything into tasks -- what is passed to queue.createTask is what becomes the task. So this would need to be a value available to the process calling queue.createTask. Probably a decision task?
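
A minimal sketch of that idea, assuming a decision task that already holds a build number and a list of task definitions. The `BUILD_NUMBER` variable name and the `stamp_build_number` helper are hypothetical, and `payload.env` is assumed to be how the worker in use receives environment variables; nothing here calls Taskcluster itself.

```python
import copy

def stamp_build_number(task_defs, build_number):
    """Return copies of the task definitions with the build number in payload.env.

    BUILD_NUMBER is a hypothetical variable name, not a Taskcluster convention.
    """
    stamped = []
    for task_def in task_defs:
        task = copy.deepcopy(task_def)  # leave the caller's definitions untouched
        env = task.setdefault("payload", {}).setdefault("env", {})
        env["BUILD_NUMBER"] = str(build_number)  # environment values are strings
        stamped.append(task)
    return stamped

# Two toy task definitions; real ones carry many more fields.
tasks = [
    {"metadata": {"name": "build"}, "payload": {"env": {"FLAVOR": "release"}}},
    {"metadata": {"name": "sign"}, "payload": {}},
]
stamped_tasks = stamp_build_number(tasks, 42)
```

The decision task would then pass each stamped definition to queue.createTask, so every task it creates carries the same number.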

You also correctly identify issues around "decentralized". But our data stores do have atomic operations, so it's not infeasible to build a "give me an integer" microservice. It wouldn't be especially scalable, but that's a technical issue of about the same severity as the "chance" of two releases in the same second.
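
As a rough illustration of the "give me an integer" idea, the sketch below models named counters in memory. The `NamedCounters` class is hypothetical, and a `threading.Lock` stands in for the atomic operation a real data store would provide; it is not how any Taskcluster service works.

```python
import threading
from collections import defaultdict

class NamedCounters:
    """In-memory stand-in for a service that issues increasing integers per name."""

    def __init__(self):
        self._lock = threading.Lock()
        self._last = defaultdict(int)  # name -> last value issued (0 if none yet)

    def next(self, name):
        # The lock makes read-increment-write a single atomic step.
        with self._lock:
            self._last[name] += 1
            return self._last[name]

counters = NamedCounters()
issued = []

def worker():
    for _ in range(100):
        issued.append(counters.next("focus.builds"))

# Eight concurrent callers all asking for numbers from the same sequence.
threads = [threading.Thread(target=worker) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Every caller serializes on the same lock, which is the scalability limit mentioned above: throughput for one named sequence is bounded by how fast that single atomic step can run.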

Do the values need to be monotonically increasing? Or just unique?

Aside from the issues of "super long"-ness, I think revision identifiers and v4 UUIDs are the right choice (taskIds are what we refer to as "slugids" which are really just a 22-character representation of a UUID). Revision identifiers are stable and known to be unique. UUIDs have enough bits of randomness that we can safely assume they are unique. 32 bits is not really enough randomness for that assumption. So I would challenge the assumption that it has to fit into an i32 -- is this running on 32-bit embedded hardware or some situation where bits are at such a premium?
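
To put rough numbers on the "not enough randomness" point, here is a small sketch using the standard birthday-bound approximation. It compares 32 random bits with the 122 random bits of a v4 UUID; the function name is just for illustration.

```python
import math

def collision_probability(n_ids, bits):
    """Approximate chance that at least two of n_ids random values collide.

    Uses the birthday-bound approximation 1 - exp(-n(n-1) / 2^(bits+1)).
    """
    space = 2 ** bits
    # expm1 keeps precision when the probability is extremely small.
    return -math.expm1(-n_ids * (n_ids - 1) / (2 * space))

for n in (1_000, 10_000, 100_000):
    p32 = collision_probability(n, 32)
    p122 = collision_probability(n, 122)
    print(f"{n} ids: 32-bit {p32:.4f}, 122-bit {p122:.3e}")
```

With 32 bits the chance of a collision is already around 1% after 10,000 random ids and well over half after 100,000, while with 122 bits it stays negligible at those counts.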

The "fancy algorithm" you link to seems pretty simple, too. I feel like building a microservice for this is overengineering a pretty simple solution. I can see a few trivial modifications of that algorithm that might get you better collision-avoidance. For example, if you count tenths of a second since the first Focus release, you can enumerate 10 years in a u32, with very little chance of collision.
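
A sketch of the tenths-of-a-second variant, under stated assumptions: the `EPOCH` below is a placeholder, since the actual date of the first Focus release isn't given here, and `build_id` is a hypothetical helper name.

```python
from datetime import datetime, timedelta, timezone

U32_MAX = 2**32 - 1

# Placeholder epoch; substitute the real date of the first release.
EPOCH = datetime(2017, 1, 1, tzinfo=timezone.utc)

def build_id(now, epoch=EPOCH):
    """Whole tenths of a second elapsed between epoch and now."""
    elapsed = now - epoch
    if elapsed < timedelta(0):
        raise ValueError("now is before the epoch")
    ticks = elapsed // timedelta(milliseconds=100)  # timedelta // timedelta -> int
    if ticks > U32_MAX:
        raise OverflowError("id no longer fits in a u32")
    return ticks

# Ten years (3653 days, allowing for leap days) after the epoch still fits.
ten_years_later = EPOCH + timedelta(days=3653)
print(build_id(ten_years_later), "<=", U32_MAX)
```

Two builds that start within the same tenth of a second would still get the same id, so this reduces the collision window rather than eliminating it.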

All that said, a service to issue unique named integers would be pretty simple to implement. If you can find a bit more popular support for this, and especially if you're willing to implement it, I think it's worth writing up an RFC for the purpose and circulating it for discussion.

mitchhentges commented 5 years ago

Thanks for the insight! I'm definitely getting up-to-speed on Mozilla CI, so my grasp on the strengths and pitfalls of our architecture isn't 100% just yet :smile:

For some additional context on why such a build number would need to fit into i32 space for my use case: we're looking to use a unique value as the buildNumber for a Maven artifact. The Maven docs (Choose Snapshot in the bottom-left panel) show that buildNumber is stored in an int, and a Java int is a signed 32-bit value on every platform, so anything larger won't fit.
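
To make the constraint concrete, here is a small sketch, assuming buildNumber has to fit a signed 32-bit int, of how long a time-based counter lasts at different granularities before it overflows. The helper name is hypothetical.

```python
JAVA_INT_MAX = 2**31 - 1  # largest value a signed 32-bit int can hold
SECONDS_PER_YEAR = 365.25 * 24 * 3600

def years_until_overflow(ticks_per_second, max_value=JAVA_INT_MAX):
    """Years a counter ticking at the given rate runs before exceeding max_value."""
    return max_value / ticks_per_second / SECONDS_PER_YEAR

for label, rate in (("seconds", 1), ("tenths of a second", 10), ("milliseconds", 1000)):
    print(f"{label}: {years_until_overflow(rate):.2f} years")
```

Counting seconds lasts about 68 years, tenths of a second a little under 7 years, and milliseconds less than a month, which is why finer time granularity and a signed 32-bit field pull against each other.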

Don't get me wrong, having build configurations using (a function of) time to uniquely version their artifacts isn't an incredibly large burden (especially on a per-project basis). However, I'd argue that, to a degree, this problem of uniquely identifying builds is generalizable to most builds running within Taskcluster. By pushing this responsibility to the project maintainer, we might be introducing more complexity to the ecosystem (e.g. creating questions like: "why aren't artifact buildNumbers consistent between projects?" or "why aren't we sharing logic for producing per-build artifact identifiers?")

Also worth noting is that projects where it's possible to have multiple builds running in parallel will be further restricted in options, since a time-based solution might not be granular enough.

Finally, I'm specifically proposing an incrementing-from-0 build number as a solution due to the accessibility of the concept - for new contributors getting up to speed on projects, it may be easier to understand how and why a build number is being used in a CI build than to follow logic that manipulates time or unique identifiers. Perhaps I'm used to leaning on the abstractions used by other build systems like Teamcity, Travis and Jenkins :smile:

djmitche commented 5 years ago

The world of what people and organizations want to do in CI is more diverse than anyone can imagine. Most organizations think what they do is "normal", when to another organization it seems completely insane. The gecko CI system is objectively totally nuts, for example, although we all love it. But I've also had people tell me that "everyone" uses a different subversion repository for each subdirectory of their C++ project and thus needs a revision data structure of the form [(revision, repo-path), ..].

Teamcity, etc. impose a particular model, while Taskcluster is more of a collection of primitives from which anyone can build their own CI system. So the platform itself has no notion of a "revision", for example. Nor of a "build" or a "test". So the "why" questions you mentioned are higher-level than the platform is concerned with, and are better addressed to the people building the CI system on top of Taskcluster, rather than to TC itself.

A source of named monotonic sequences might be a good "primitive". I wonder if there's a slightly more general form of that service that could be useful for more than just generating build numbers? Or, is there a way that this could be combined with the functionality of the index service?

Perhaps the index service could provide a way to return the "next" key under a particular index path, in an atomic fashion? So every call to, say, index.uniqueSubpath("focus.builds", {revision, date}) would return a new subpath like focus.builds.8 with its data set to {revision, date} and also guarantee that any other call to index.uniqueSubpath("focus.builds", ..) would not return the same path. Then it's easy to parse out the index as an integer, and as a bonus you get an index path under which to put any generated artifacts.
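
The behavior described above can be modeled in a few lines. This is an in-memory toy, not the real index service: `ToyIndex`, `unique_subpath`, and `build_number` are hypothetical names, and a lock stands in for whatever atomic operation the backing store would offer.

```python
import threading

class ToyIndex:
    """In-memory model of the proposed behavior; not the real index service."""

    def __init__(self):
        self._lock = threading.Lock()
        self._entries = {}  # full path -> data
        self._last = {}     # parent path -> last integer issued

    def unique_subpath(self, parent, data):
        # Allocate the next integer and store the data in one atomic step,
        # so no two callers can receive the same path.
        with self._lock:
            number = self._last.get(parent, 0) + 1
            self._last[parent] = number
            path = f"{parent}.{number}"
            self._entries[path] = data
            return path

    def find(self, path):
        return self._entries[path]

def build_number(path):
    """Parse the trailing integer out of a path such as 'focus.builds.8'."""
    return int(path.rsplit(".", 1)[1])

index = ToyIndex()
first = index.unique_subpath("focus.builds", {"revision": "abc123", "date": "2018-01-01"})
second = index.unique_subpath("focus.builds", {"revision": "def456", "date": "2018-01-02"})
print(first, second, build_number(second))
```

The returned path doubles as the place to file the build's artifacts, and the trailing integer serves as the build number.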

mitchhentges commented 5 years ago

Ah, I think I understand! I suppose that if other CI solutions handle the stack from:

<...>
<build configuration/project (e.g. "Build $androidApp")>
<orchestrating worker agents>

then Taskcluster is specifically targeting the <orchestrating worker agents> part? I think it was this that I was misunderstanding :smile:

So, this feature request might be a bit misguided. I'll close this in the meantime.

JohanLorenzo commented 5 years ago

I like this idea of using the TC index. I don't think we need it anymore, but I'll keep it in mind!

taskcluster / taskcluster-rfcs

Provide a unique, incremental build number for each build #133