dotnet / docker-tools

This is a repo to house some common tools for our various docker repos.
MIT License
124 stars 46 forks source link

Evaluate undefined service connection variables to empty strings at pipeline queue time #1319

Closed dagood closed 3 months ago

dagood commented 5 months ago

A task condition doesn't work to avoid serviceConnection: $(kusto.serviceConnectionName) because the validity of that type of auth is checked before the build starts (something like template-eval-time) rather than just before the task starts.

Note: I'm not actually sure if this works. It might just always disable the step. I only learned recently that variables work at all in template conditions, and I didn't find documentation that describes it very well so I don't know the edge cases. But I do expect that e.g. if a log command is used to change ingestKustoImageInfo from false to true, it wouldn't be able to react to that.

Example failed run: https://dev.azure.com/dnceng/internal/_build/results?buildId=2462622&view=results

There was a resource authorization issue: "The pipeline is not valid. Job Publish: Step AzureCLI3 input connectedServiceNameARM references service connection $(marStatus.serviceConnectionName) which could not be found. [...]"

dotnet-issue-labeler[bot] commented 5 months ago

I couldn't figure out the best area label to add to this PR. If you have write-permissions please help me learn by adding exactly one area label.

lbussell commented 5 months ago

I think this should work - I'll test it now and let you know.

lbussell commented 5 months ago

It works for excluding the steps, but doesn't seem to work when the variables are set to 'true': [internal link] https://dev.azure.com/dnceng/internal/_build/results?buildId=2463471&view=logs&s=380ab8c6-4139-55ff-15c4-fb48260f4dca

I'll check out the generated yaml to see if I can tell what's going on.

dagood commented 5 months ago

I see now that ingestKustoImageInfo is set at a pipeline level in this repo. I assume waitForIngestionEnabled is set similarly but just isn't in this repo.

Inside a job, pipeline variables don't show up (this seems to me to be the purpose of the internalProjectName: ${{ variables.internalProjectName }} passthrough in the pipeline yml), so it would make sense to me that a template expression can't see ingestKustoImageInfo either. I'm not sure if passthrough for this would work, but maybe worth a try.

lbussell commented 5 months ago

Yeah, that's my guess. In my experience the variables only show up after a couple of layers of pass-through as parameters.. it's not great. I'll try some experiments with it. First I'm just logging the value of the kusto variable before that step: https://dev.azure.com/dnceng/internal/_build/results?buildId=2463501&view=logs&s=380ab8c6-4139-55ff-15c4-fb48260f4dca

lbussell commented 5 months ago

Yep, as expected they resolve to empty strings

dagood commented 5 months ago

An alternative to just get this unblocked could be to go with "force this feature off" parameters that are passed all the way from the yml we control down into the common publish yml. 😄

Some background: for the (non-Docker) Go pipelines, I was fed up with matrices and variables and went essentially parameter and template-only, because they're more predictable and faster to iterate. (Errors tend to show up at eval time rather than only at runtime, and inspecting evaluated yml is more useful.) That might leave me with a blind spot on how to debug odd variable behavior like this or some other alternatives that might exist.

lbussell commented 5 months ago

I'm thinking that maybe the syntax serviceConnection: $[ variables['kusto.serviceConnectionName'] ] could cause it to be evaluated to an empty string, which would then disable trying to use the auth due to the condition in the run-imagebuilder template. Need to test that.

lbussell commented 5 months ago

@dagood I found a solution that works. I tested it by setting ingestKustoImageInfo and waitForIngestionEnabled to false, and then replacing the service connection variable names with variables that don't exist (so that they'd fail if we tried to use the service connections).

dagood commented 5 months ago

replacing the service connection variable names with variables that don't exist (so that they'd fail if we tried to use the service connections).

Assuming it works for the positive case: I suppose this would have to be because kusto.serviceConnectionName is in a scope where it can be resolved when the template is evaluated, but waitForIngestionEnabled wasn't? (Variable groups are "more global" than variables defined in the pipeline, or something?)

Not to say that everything in Azure Pipelines needs to be reasoned out (or can be), but thought it might be worth hypothesizing. 😄

lbussell commented 5 months ago

Just want to confirm that the positive case still works where this is enabled for our .NET builds?

Shoot, yeah, this doesn't actually work.

My intention was to get the service connections to resolve as empty strings, so that the run-imagebuilder.yml template wouldn't try to use them, and the steps would still be skipped by our condition: ... setting (I like seeing the skipped steps for these items even when they don't run).

It doesn't work because ${{ variables['kusto.serviceConnectionName'] }} is evaluated at compile time. Since our service connection names are in a variable group, it isn't populated until runtime (after the templates are generated, then it knows which variable groups to ask for permission to use).

I've been trying a few other things with no success:

  - template: /eng/common/templates/steps/run-imagebuilder.yml@self
    parameters:
      displayName: Ingest Kusto Image Info
      ${{ if eq(variables['ingestKustoImageInfo'], 'true') }}:
        serviceConnection: $(kusto.serviceConnectionName)
      internalProjectName: ${{ parameters.internalProjectName }}
      condition: and(succeeded(), eq(variables['ingestKustoImageInfo'], 'true'))
      args: > ...

^ This example never resolves the correct value for ingestKustoImageInfo for some reason. I even tried to set some default values for these so that the variable would hopefully be available at queue-time:

- ${{ if eq(variables['ingestKustoImageInfo'], '') }}:
  - name: ingestKustoImageInfo
    value: true
- ${{ if eq(variables['waitForIngestionEnabled'], '') }}:
  - name: waitForIngestionEnabled
    value: false

However I think that solution doesn't work because these variables themselves are not populated until after templates are generated, since they depend on the conditions inside the templates... yeah. If we remove the ${{ if eq(variables['ingestKustoImageInfo'], '') }}, will we still be able to override the variables from the pipeline queue GUI?

I'm open to other suggestions too.

mthalman commented 5 months ago

What about changing the implementation of run-imagebuilder.yml so that it doesn't include the step if serviceConnection is not set?

lbussell commented 5 months ago

@mthalman it already does that, is this what you mean?

https://github.com/dotnet/docker-tools/blob/c6e6d20f0fc2fda3eafc50fe3a111706e1fd11c6/eng/common/templates/steps/run-imagebuilder.yml#L28

The issue is getting the service connection string to...

  1. Resolve as the actual service connection name when the variable is set.
  2. Resolve as an empty string when it's not set (as opposed to trying to look for the service connection called $(foo.serviceConnectionName)).

I think the next thing to try is passing in the service connection variables as parameters to the publish stage. Elevating variables like that, in my experience, has allowed them to be accessible at pipeline queue time. That's how the internal/public project names are made available in the template expression on that same line (even though they're also variables).

mthalman commented 5 months ago

Resolve as an empty string when it's not set (as opposed to trying to look for the service connection called $(foo.serviceConnectionName)).

That's the part I was missing.

This should be able to be defaulted in a variables template to an empty string. Then it can be overridden via a variable group.

lbussell commented 5 months ago

OK, I just pushed some changes that should get this working. Here's some test runs for our different scenarios:

dagood commented 3 months ago

I tried to update to the latest eng/common today and hit similar issues with https://github.com/dotnet/docker-tools/commit/36550ca1167dcac578a27b3a7612c1de6df8e527 introducing more variables that are expected to be set at template-eval-time. But that made me realize something about this PR and this problem in general: if it works to add a variable with value '' to our config, we can simply do that in our repo/group to avoid maintenance pain here. I tried putting these in eng/pipeline/variables/common.yml, and that worked for the new variables and also kusto.serviceConnectionName.

lbussell commented 3 months ago

I tried to update to the latest eng/common today and hit similar issues with 36550ca introducing more variables that are expected to be set at template-eval-time. But that made me realize something about this PR and this problem in general: if it works to add a variable with value '' to our config, we can simply do that in our repo/group to avoid maintenance pain here. I tried putting these in eng/pipeline/variables/common.yml, and that worked for the new variables and also kusto.serviceConnectionName.

@dagood did this also work for any issues you had with marStatus.serviceConnectionName? Apologies for dropping the ball on this set of changes. I lost track of it among servicing + being away for a couple of weeks.

dagood commented 3 months ago

Yep, setting marStatus.serviceConnectionName to '' in Go's eng/pipeline/variables/common.yml worked fine in my dev/ build. (I expect defining it without a value in the variable group would work fine too, this just happened to be the way I tested it.) I see the step skipped during publish and there was no eval error. Using eng/common from 4c5996456d without modification.

No worries, Go builds are happy to chug along with a manual change for a bit as long as we don't urgently/frequently need to update. I wish I'd realized sooner that we could simply define the variables on the Go side and skip the scoping mess. 😄

dagood commented 3 months ago

So: I'm happy to close this PR. For other potential users of .NET Docker tools in the future there could be value in adding empty string defaults here, but I'm not sure that's worth the maintenance effort at the moment. Up to you. 🙂

lbussell commented 3 months ago

Great. Thanks for the update!