greenkeeperio / greenkeeper-lockfile

:lock: Your lockfile, up to date, all the time
https://greenkeeper.io
183 stars 73 forks source link

Why does the upload task check for firstPush? #136

Closed patkub closed 6 years ago

patkub commented 6 years ago

Over here, the upload task checks for firstPush and then uploadBuild. But, the update task already checks for firstPush. Since the upload task runs after the update task, why does the upload task also check firstPush? Is the uploadBuild check not enough?

Removing the firstPush check from the upload task makes it useful in workflows. This would let the user upload the lockfile at the end of a workflow. The firstPush check is needed in the update task so that the lockfile is only updated once, and so that a race condition does not occur in concurrent jobs in a workflow.

This is possibly a better fix for https://github.com/greenkeeperio/greenkeeper-lockfile/pull/118

nevir commented 6 years ago

Similarly, the main motivation behind #118 was to be able to upload the lockfile change only if the tests pass (which requires that greenkeeper-lockfile not run as the first task in the workflow)

nevir commented 6 years ago

why does the upload task also check firstPush? Is the uploadBuild check not enough?

I think the answer is that we don't want to commit additional lockfile updates if someone pushes more commits to the greenkeeper branch.

The more I dig into this, the more I think #118 (or something that uses git as the source of truth; but #118 as-is doesn't seem to actually work) is actually the right fix:

nevir commented 6 years ago

For now, I'm working around this by always uploading the lockfile during the first workflow step, regardless of the run's success.

It lets me inspect the exact set of dependencies that caused the error that way, too.

patkub commented 6 years ago

We have CIRCLE_JOB, or its alias CIRCLE_STAGE, environment variables for the current job in the workflow. Could we add new environment variables, GK_LOCK_UPDATE_JOB and GK_LOCK_UPLOAD_JOB, to specify the job to update/upload in?

In the current checks, uploadBuild is true by default in every job in a workflow. Every job in a workflow gets env.CIRCLE_NODE_INDEX === 0. ~CIRCLE_NODE_INDEX is used to set commands to run in parallel.~

Edit: That's the 1.0 docs, looks like parallelism in 2.0 does not use these environment variables, but they are still set.

Instead of

firstPush: !env.CIRCLE_PREVIOUS_BUILD_NUM,
uploadBuild: env.CIRCLE_NODE_INDEX === `${env.BUILD_LEADER_ID || 0}`

Could this work?

firstPush: checkFirstPush(),
uploadBuild: env.CIRCLE_NODE_INDEX === `${env.BUILD_LEADER_ID || 0}`

function checkFirstPush() {
  if (env.GL_LOCK_UPDATE_JOB !== NULL)
    // if GL_LOCK_UPDATE_JOB is set, check that it matches the current job, and make sure there is only 1 commit on branch
    return env.CIRCLE_JOB === env.GL_LOCK_UPDATE_JOB && gitHelpers.getNumberOfCommitsOnBranch(env.BITRISE_GIT_BRANCH) === 1
  // otherwise, check that CIRCLE_PREVIOUS_BUILD_NUM is not set
  return !env.CIRCLE_PREVIOUS_BUILD_NUM
}

For the job you want to upload in, set GL_LOCK_UPDATE_JOB = CIRCLE_JOB

nevir commented 6 years ago

Could we add new environment variables, GK_LOCK_UPDATE_JOB and GK_LOCK_UPLOAD_JOB, to specify the job to update/upload in?

I'm not sure how useful that is in the end; I don't think we need the extra complexity here, unless I'm misunderstanding something?

The developer is already being very explicit about which script is run in which step via the definition of the step, for a Circle 2.0 workflow. For example: https://github.com/convoyinc/apollo-cache-hermes/blob/master/.circleci/config.yml

Having to specify the step name, in addition to declaring the script run for only that step, seems redundant

patkub commented 6 years ago

You are right, this would just be redundant and not fix the issue. I was trying to re-work the firstPush check because #118 fails with "Only running on first push of a new branch".

patkub commented 6 years ago

Instead of the firstPush check, could we set an environment variable that is persistent across workflows on the new greenkeeper branch?

if !GK_WAS_UPLOADED:
    upload the lockfile
    GK_WAS_UPLOADED=true

Next workflow that runs on the greenkeeper branch will have GK_WAS_UPLOADED=true and not upload again.

patkub commented 6 years ago

@nevir Looks like firstPush is going away with monorepos in https://github.com/greenkeeperio/greenkeeper-lockfile/pull/140