We often merge multiple branches together when creating a simulation capability, and our current provenance is insufficient for recreating the code/env after such an operation.

For example, the automated timing procedure for lazy runs goes something like:

emirge install parallel-lazy
merge y1-production

We record the hashes for these branches in the timing script, but the hashes captured by the built-in provenance (i.e. version.sh --> sqlite) no longer identify the code that actually ran once y1-production has been merged.

Some possibilities for dealing with this issue:

Extend emirge to know about merged branches: the user provides one branch to install plus a list of branches to merge, and emirge records the hash of each.
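A minimal sketch of what "record hash for each" could look like, assuming the install step has already checked out the branch to install. `merge_and_record` and the provenance file format are hypothetical and not part of emirge; only standard git commands are used.

```shell
#!/bin/bash
# Sketch only: merge_and_record is a hypothetical helper, not part of emirge.
# Usage: merge_and_record <provenance-file> <branch> [<branch> ...]
merge_and_record() {
    local outfile="$1"
    shift
    local branch hash
    # Record the installed tip before any merging happens
    echo "installed $(git rev-parse HEAD)" > "$outfile"
    for branch in "$@"; do
        # Resolve the branch to a commit hash before merging it
        hash=$(git rev-parse --verify "$branch^{commit}") || return 1
        # Same merge convention the detection script below relies on
        git merge --no-ff -m "temp" "$branch" || return 1
        echo "merged $branch $hash" >> "$outfile"
    done
}
```

For the lazy timing case this would be called as `merge_and_record provenance.txt y1-production` after installing parallel-lazy.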

Enhance version.sh to recognize and report when merges are done:


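    #!/bin/bash
    # AK: This script might help version.sh discover merges
    # Requires: git merge <branch> -m "temp" --no-ff

    # FIXME Maybe loop looking for more temp commits?

    # A merge made as required above leaves HEAD with the subject "temp"
    if test "$(git show -s --format=%s HEAD)" = "temp"; then
        # %P lists the parent hashes of the merge commit
        for parent in $(git show -s --format=%P HEAD); do
            # Print "<hash> <symbolic name>" for each parent
            echo -n "$parent "
            git name-rev --name-only "$parent"
        done
    fi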
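One way the FIXME could be addressed is to follow the first-parent chain for as long as the commit subject is "temp". The sketch below assumes every merge was made with `git merge <branch> -m "temp" --no-ff`, as the script requires; `list_temp_merges` and its output format are hypothetical.

```shell
#!/bin/bash
# Sketch only: list_temp_merges is a hypothetical helper.
# Walks back from a commit (default HEAD) through consecutive "temp" merges.
list_temp_merges() {
    local commit="${1:-HEAD}"
    local parents parent
    while [ "$(git show -s --format=%s "$commit")" = "temp" ]; do
        # %P lists all parent hashes, first parent first
        read -r -a parents <<< "$(git show -s --format=%P "$commit")"
        # Stop if this "temp" commit is not actually a merge
        [ "${#parents[@]}" -ge 2 ] || break
        # Parents after the first are the tips that were merged in
        for parent in "${parents[@]:1}"; do
            echo "$parent $(git name-rev --name-only "$parent")"
        done
        # Continue along the first-parent chain
        commit="${parents[0]}"
    done
    # Where the walk stops is the tip that was installed before any merge
    echo "base $(git rev-parse "$commit")"
}
```

Run inside the merged checkout, `list_temp_merges` prints one line per merged tip, most recent merge first, followed by a `base` line for the commit the merges were applied to.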



* Going forward, we want to make sure that timings (or other production drivers) work off of branches that are checked in and emirge-compatible.

illinois-ceesd / emirge

Tracking provenance through merges #140
