biocro / biocro

https://biocro.github.io/
Other
41 stars 19 forks source link

Prevent bogus error result from document quantities workflow #93

Closed gsrohde closed 7 months ago

gsrohde commented 7 months ago

This addresses issue #92.

eloch216 commented 7 months ago

I Googled the git diff --quiet HEAD; part to see what it does, and stumbled across this: https://stackoverflow.com/questions/28296130/what-does-this-git-diff-index-quiet-head-mean

Would it be safer to use git diff-index --quiet HEAD --; instead?

gsrohde commented 7 months ago

I Googled the git diff --quiet HEAD; part to see what it does, and stumbled across this: https://stackoverflow.com/questions/28296130/what-does-this-git-diff-index-quiet-head-mean

Would it be safer to use git diff-index --quiet HEAD --; instead?

Funny you should bring up git diff-index. That was the command I had originally before switching to git diff. (Not sure why—I copied it from a script in another project I was working on.) But I was getting really wonky behavior: the command was returning a non-zero exit code when nothing had changed, and then the "else" branch was erroring out when it tried to "git commit" zero changes.

But then I added a "git diff" call a few lines before the "if" test to try to debug this, and suddenly everything worked: that is, git diff-index now correctly returned an exit code of 0 and the "else" branch with the failing "git commit" call was avoided. At this point, I felt I was a victim of the observer effect, threw up my hands in my attempt to debug this, and then wondered why I was using git diff-index to begin with when I was really much more familiar with git diff, so I switched to it, and the problem went away. It's also my understanding that "porcelain" commands (git diff in this case) are considered more stable than "plumbing" commands (git diff-index), and so preferred.

(To see the effect of the addition of git diff, compare Run #20, which has the extra git diff call and succeeds, with Run #21, where I again removed it. The respective commit trees are tagged Run-20 and Run-21. By the way, I did manage to find a similar though clearly different bug reported about diff-index (see https://github.com/git/git/commit/6a044a20480a8ef56f7ddb8142f660ca01a3391e). I don't know, however, whether that bug is also subject to the observer effect.)

You do have a good point about adding -- at the end of the command, however. git diff --quiet HEAD may not give the result we want if there happens to be a file called HEAD at the top level.

I opted for a slightly different solution though. First, I moved the git add . call to before the if test. This corrects a possible problem which might occur if we ever changed the names of the files output by the script or ever changed the location of the documentation repository: if the files copied in step 11 are not being tracked yet, the git diff call won't pick up any differences, and the "nothing to commit" branch will be followed. git add . will ensure that every file under the current directory is being tracked and that any new or changed files are staged. Then I use git diff --quiet --cached to check whether the git add actually staged anything. (I could have used git diff --quiet HEAD -- just as easily in this context, since it detects if there are any tracked files in the working copy (staged or not) that differ from what's in HEAD. Or I could have used git diff --quiet --cached HEAD --, which is exactly equivalent to git diff --quiet --cached, since <commit> defaults to HEAD if not specified.)

gsrohde commented 7 months ago

I realized I didn't add anything to NEWS.md. But that brings up a question: Do we even want an item about this in the news file? These sorts of changes are hardly relevant, it would seem, to general users. The changed files, in fact, don't get included in even the source version of the BioCro R package. (I first became aware of this issue in reading through the NEWS.md file while assessing PR #88 and the imminent submission of the package to CRAN.)

eloch216 commented 7 months ago

I realized I didn't add anything to NEWS.md. But that brings up a question: Do we even want an item about this in the news file? These sorts of changes are hardly relevant, it would seem, to general users. The changed files, in fact, don't get included in even the source version of the BioCro R package. (I first became aware of this issue in reading through the NEWS.md file while assessing PR #88 and the imminent submission of the package to CRAN.)

My two cents: I think it's natural to treat NEWS.md as the changelog for the entire repository, not just the files included with the R package. Otherwise we would need to have separate changelogs for different parts of the repository, which seems like a lot of work for very little benefit. Things that aren't relevant to users can be included in a subheading for things that only developers would care about. At the same time, I think it's possible to have a "significance threshold" for NEWS.md, and we can exclude things that seem small. For example, one could probably argue that this PR, the associated issue, and the git blame will be enough of a record to explain what happened.

gsrohde commented 7 months ago

one could probably argue that this PR, the associated issue, and the git blame will be enough of a record to explain what happened.

I'm either persuaded by that argument or just being lazy. In any case, I'm merging this.