gitter-lab / SINGE

Gene regulatory network reconstruction from pseudotemporal single-cell gene expression data
MIT License

Automatically match compiled binary to versions of the MATLAB files #35

Closed (agitter closed this issue 4 years ago)

agitter commented 4 years ago

We would like to automatically detect if the compiled SINGE binary is stale with respect to the MATLAB files in the repository. The intended behavior is that a Travis CI build fails if the code has been updated but the compiled binary has not.

agitter commented 4 years ago

One option is to have a script that wraps a few compile-related commands:

A command like git ls-tree -r HEAD --name-only | grep '.*\.m$' lists all the .m files that are currently tracked in the repository (see https://stackoverflow.com/questions/15606955/how-can-i-make-git-show-a-list-of-the-files-that-are-being-tracked/15606998 and https://stackoverflow.com/questions/13335837/how-to-grep-for-a-file-extension).

md5sum can operate on multiple files, so something like this might work:

$ md5sum $(git ls-tree -r HEAD --name-only | grep '.*\.m$') > code.md5
$ cat code.md5
4d8e5f1a3b2690421200d26c5fdf8a4a *SCINGE_Example.m
038bd18e4b3a560fedd3e4b45aa8458e *code/GLG_Instance.m
5bc04046efac900d6854629d3914853e *code/Modified_Borda_Aggregation.m
cbdf46f263280dd2584777ab51c8a8d9 *code/SCINGE.m
696a2beecd7b836683e63255bdc8c24a *code/adjmatrix2edgelist.m
f2f1c011c39ae9b384acdc1fca30858c *code/dropSamples.m
be22108cb8ae1c78d760ddcd29d012ef *code/dropZeroSamples.m
50b9195ecaaa43554ae95d139cd3fbbd *code/iLasso_for_SCINGE.m
5b0784250e07a454ea15d92e6fb85c14 *code/normalizePseudotime.m
98e1fa061a97975d1ba53b9b01e05460 *code/parseParams.m
820c0c779c4d56da95d11a7b5bc675fe *code/run_iLasso_row.m

The cat only illustrates the file contents; the script itself would not need it.

Then we can check that the files still match:

$ md5sum -c code.md5
SCINGE_Example.m: OK
code/GLG_Instance.m: OK
code/Modified_Borda_Aggregation.m: OK
code/SCINGE.m: OK
code/adjmatrix2edgelist.m: OK
code/dropSamples.m: OK
code/dropZeroSamples.m: OK
code/iLasso_for_SCINGE.m: OK
code/normalizePseudotime.m: OK
code/parseParams.m: OK
code/run_iLasso_row.m: OK

This check would miss some edge cases, such as newly added .m files, because md5sum -c only verifies the files listed in code.md5. Recomputing code.md5 during the Travis CI test and taking the diff against the stored copy may be more robust. We would also need to make sure this works for the released versions of the binary, which are hosted on GitHub.
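
The recompute-and-diff idea could look roughly like the sketch below. It is an illustration under stated assumptions, not the script SINGE uses: the function names are hypothetical, it assumes GNU md5sum and GNU xargs (for -r and -d), and it assumes file paths contain no newlines. Because md5sum can mark binary mode with a * in front of the file name (as in the listing above) and text mode with a space, the sketch normalizes that marker before comparing.

```shell
# Hypothetical helpers for illustration; none of these names exist in SINGE.

# List the tracked .m files (same idea as the command above).
tracked_m_files() {
    git ls-tree -r HEAD --name-only | grep '\.m$'
}

# Read md5sum output on stdin, replace the binary-mode "*" marker with a
# space, and sort by path so two manifests can be compared line by line.
normalize_manifest() {
    sed -E 's/^([0-9a-f]{32}) [ *]/\1  /' | LC_ALL=C sort -k2
}

# Read file paths on stdin (one per line) and print a normalized manifest.
md5_manifest() {
    xargs -r -d '\n' md5sum | normalize_manifest
}

# $1: stored manifest (for example code.md5); stdin: paths to verify.
# Returns 0 when the recomputed manifest is identical to the stored one,
# and 1 when a file changed, was added, or was removed.
check_manifest() {
    manifest_tmp=$(mktemp -d) || return 2
    normalize_manifest < "$1" > "$manifest_tmp/expected"
    md5_manifest > "$manifest_tmp/actual"
    diff "$manifest_tmp/expected" "$manifest_tmp/actual"
    manifest_status=$?
    rm -rf "$manifest_tmp"
    return "$manifest_status"
}
```

A CI step could then run tracked_m_files | check_manifest code.md5. Unlike md5sum -c, the diff also reports .m files that were added to or removed from the repository, and the nonzero exit status would fail the build.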

agitter commented 4 years ago

@aabaker99 suggested that we also check the md5 of the binary, which is a good idea. We can run md5sum on the source and the new binary immediately after it is compiled.
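
A minimal sketch of recording both sets of checksums right after compilation is shown below. The helper name and the build.md5 file name used afterwards are hypothetical, the binary path is passed in as an argument because the issue does not name it, and GNU md5sum and GNU xargs are assumed.

```shell
# Hypothetical helper for illustration; not part of SINGE.
# $1: path to the freshly compiled binary
# $2: manifest file to write
# stdin: source file paths, one per line
write_build_manifest() {
    build_binary="$1"
    build_out="$2"
    if [ ! -f "$build_binary" ]; then
        echo "missing binary: $build_binary" >&2
        return 1
    fi
    # Checksum the sources first, then the binary, into a single manifest.
    { cat; printf '%s\n' "$build_binary"; } \
        | xargs -r -d '\n' md5sum > "$build_out"
}
```

After compiling, this could be called as git ls-tree -r HEAD --name-only | grep '\.m$' | write_build_manifest path/to/binary build.md5, where path/to/binary is a placeholder. Running md5sum -c build.md5 later would then flag a change to either the source files or the binary.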