tmspzz / Rome

Carthage cache for S3, Minio, Ceph, Google Storage, Artifactory and many others
MIT License

Avoiding binary incompatibilities between codependent frameworks #200

Open elliottwilliams opened 4 years ago

elliottwilliams commented 4 years ago

Right now, Rome uses the resolved version of a framework to distinguish it inside the cache. This is nice because it parallels what Carthage writes to the resolved Cartfile, but it leads to potential binary incompatibilities when frameworks in a project depend on each other. If one framework is updated but its consumer isn't rebuilt, the consumer's cached build can fall out of sync and lead to runtime crashes in the project.

I'm curious whether Rome has ever considered hashing a framework's version together with its dependencies' resolved versions, and using that as the cache key. This would mean that resolving and building a new version of a framework's dependency would also cause the framework itself to be seen as missing and re-uploaded.
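To make the distinction concrete, here's a rough Python sketch of the two keying schemes. The path layout mirrors the cache listings further down; current_cache_path, proposed_cache_path, and dependency_hash are illustrative names, not Rome's actual API:

# Current scheme (illustrative): a framework is identified by name, platform,
# and resolved version alone, so two builds of node 1.2.3 made against
# different leaf versions map to the same cache entry.
def current_cache_path(name, platform, version):
    return f"{name}/{platform}/{name}.framework-{version}.zip"

# Proposed scheme (illustrative): fold in a hash of the resolved versions of
# the framework's dependencies, so those two builds get distinct entries.
def proposed_cache_path(name, platform, version, dependency_hash):
    return f"{name}/{platform}/{name}.framework-{version}-{dependency_hash}.zip"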

Steps which explain the enhancement

Consider the following Cartfiles:

Cartfile:
  git "../leaf" ~> 1.4.0
  git "../node" == 1.2.3

../node/Cartfile:
  git "../leaf" ~> 1.0.0

../leaf/Cartfile:
  # empty

Say leaf's most recent version is 1.4.0. Carthage resolves and builds leaf 1.4.0 and node 1.2.3. Rome uploads these, producing the following cache files:

leaf/iOS/leaf.framework-1.4.0.zip
leaf/iOS/.leaf.version-1.4.0
node/iOS/node.framework-1.2.3.zip
node/iOS/.node.version-1.2.3

Next, suppose leaf releases 1.5.0, which contains a binary-breaking change. (For instance, imagine a parameter with a default value is added to a function. The default value means that node's existing source code will still compile, but the function's symbol name will have changed.)

Locally, we update our Cartfile to require git "../leaf" ~> 1.5.0 and run carthage update leaf --cache-builds:

*** Invalid cache found for leaf, rebuilding with all downstream dependencies
*** Building scheme "leaf" in leaf.xcodeproj
*** Building scheme "node" in node.xcodeproj

Carthage understands the dependency relationship, and has conservatively rebuilt both leaf and node. This is great! It means that locally, we've got versions of both dependencies that must be binary-compatible.

Current and suggested behavior

However, when rome list --missing is used to detect missing dependencies, it only reports that leaf 1.5.0 is missing, since the cache already contains a build of node 1.2.3. If you're using the --cache-builds workflow, Rome only uploads leaf 1.5.0, meaning that the cache now contains:

leaf/iOS/leaf.framework-1.4.0.zip
leaf/iOS/.leaf.version-1.4.0
leaf/iOS/leaf.framework-1.5.0.zip
leaf/iOS/.leaf.version-1.5.0
node/iOS/node.framework-1.2.3.zip  # built using leaf 1.4.0
node/iOS/.node.version-1.2.3  # built using leaf 1.4.0

The next time someone else uses the cache, Rome will give them a bad version of node — one that only works with leaf 1.4.0. That build will crash on launch with an error from dyld.

Why would the enhancement be useful to most users

If Rome hashed frameworks using resolved versions of their dependencies, the cache could contain different builds of the same version. Given version hashes like:

leaf 1.4.0 = wwww
leaf 1.5.0 = xxxx
node 1.2.3 + leaf 1.4.0 = yyyy
node 1.2.3 + leaf 1.5.0 = zzzz

the cache would contain products like:

leaf/iOS/leaf.framework-1.4.0-wwww.zip
leaf/iOS/.leaf.version-1.4.0-wwww
leaf/iOS/leaf.framework-1.5.0-xxxx.zip
leaf/iOS/.leaf.version-1.5.0-xxxx
node/iOS/node.framework-1.2.3-yyyy.zip  # built using leaf 1.4.0
node/iOS/.node.version-1.2.3-yyyy  # built using leaf 1.4.0
node/iOS/node.framework-1.2.3-zzzz.zip  # built using leaf 1.5.0
node/iOS/.node.version-1.2.3-zzzz  # built using leaf 1.5.0

and Rome could download the correct build of node regardless of the pinned version of leaf.
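For illustration, here is one way those suffixes could be derived with plain sha1: leaf has no dependencies, so its suffix hashes only its own version, while node's suffix also folds in the hash of whichever leaf it was built against. The 8-character truncation and the exact concatenation are just assumptions for the example:

import hashlib

def h(text):
    return hashlib.sha1(text.encode()).hexdigest()[:8]

wwww = h("1.4.0")         # leaf 1.4.0
xxxx = h("1.5.0")         # leaf 1.5.0
yyyy = h("1.2.3" + wwww)  # node 1.2.3 built against leaf 1.4.0
zzzz = h("1.2.3" + xxxx)  # node 1.2.3 built against leaf 1.5.0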

Personally, I'm evaluating Rome in an organization that has a lot of tightly coupled dependencies—I can imagine this is less useful when you have few codependencies in a repo, but afaict this is a general purpose cache invalidation problem. Let me know if this makes sense! I'd be happy to try to dive into it if you think it's a useful enhancement.

Rome version: 0.23.1.61
OS and version: macOS 10.14.6

tmspzz commented 4 years ago

I think you’re holding it wrong (you can of course disagree with me).

If you update a leaf dependency from 1.4.0 to 1.5.0 with a breaking change, you’re violating semantic versioning rules.

On top of that, you should also bump node's version by at least a minor number, since you would have to update node's call sites of leaf's API.

Interesting idea. How would one map from Cartfile.resolved committish to the final hash?

elliottwilliams commented 4 years ago

Ah, I came back to agree with your stance on semver 😆 But I'm happy to discuss further.

If you update a leaf dependency from 1.4.0 to 1.5.0 with a breaking change, you’re violating semantic versioning rules.

I think this is a reasonable stance—the real point of contention is whether an ABI-breaking but source-compatible change is "breaking" in the context of semver. IMO, it probably is if you're thinking about frameworks as discrete binaries (similar to Apple's definition of "binary frameworks"). It probably isn't if you're thinking about frameworks as bundles of source code that are buildable on demand.

How would one map from Cartfile.resolved committish to the final hash?

In pseudocode, the version hash for a dependency is the hash of its own pinned version combined with the version hashes of its subdependencies, computed recursively:

version_hash_for_dependency(dep) = sha1(
    dep.pinned_version +
    map(version_hash_for_dependency, dependencies_of(dep))
)

This assumes that the dependencies are checked out, so that you can find a Cartfile for each dependency and build up a dependency graph.
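For what it's worth, a minimal runnable version of that recursion in Python might look like this; the Dependency shape and its dependencies list (which would be parsed from each checked-out dependency's Cartfile) are assumptions for the sketch, not Rome data structures:

import hashlib
from dataclasses import dataclass, field

@dataclass
class Dependency:
    name: str
    pinned_version: str  # from Cartfile.resolved
    dependencies: list = field(default_factory=list)  # from the checked-out Cartfile

def version_hash_for_dependency(dep):
    # Hash the dependency's own pinned version together with the version
    # hashes of its subdependencies, computed recursively, so a change
    # anywhere further down the graph also changes this hash.
    digest = hashlib.sha1(dep.pinned_version.encode())
    for sub in dep.dependencies:
        digest.update(version_hash_for_dependency(sub).encode())
    return digest.hexdigest()

leaf = Dependency("leaf", "1.5.0")
node = Dependency("node", "1.2.3", [leaf])
print(version_hash_for_dependency(node))  # changes whenever leaf's version does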

elliottwilliams commented 4 years ago

While I was messing around with this, I wrote an implementation of the above algorithm in Python, which might be relevant for discussion :) https://gist.github.com/elliottwilliams/fdf7730ef06809abeb88299a97d57ffa