pulumi / pulumi-dotnet

.NET support for Pulumi
Apache License 2.0
27 stars 24 forks source link

Pulumi.LogException during Integration tests #365

Open michalromanowskiap opened 2 weeks ago

michalromanowskiap commented 2 weeks ago

What happened?

Hi,

We are doing xUnit integration tests with Pulumi and we often get Pulumi.LogException:

Pulumi.LogException : Error occurred during logging
   Error Message:
    Pulumi.LogException : Error occurred during logging
 ---- Grpc.Core.RpcException : Status(StatusCode="Unavailable", Detail="Error connecting to subchannel.", DebugException="System.Net.Sockets.SocketException: Connection refused")
 -------- System.Net.Sockets.SocketException : Connection refused
   Stack Trace:
      at Pulumi.Deployment.EngineLogger.LogAsync(LogSeverity severity, String message, Resource resource, Nullable`1 streamId, Nullable`1 ephemeral)
    at Pulumi.Deployment.Runner.LogExceptionToErrorStream(Exception exception)
    at Pulumi.Deployment.Runner.HandleExceptionsAsync(IEnumerable`1 exceptions)
    at Pulumi.Deployment.Runner.WhileRunningAsync()
    at Pulumi.Deployment.<>c__DisplayClass80_0.<<RunInlineAsync>b__1>d.MoveNext()
 --- End of stack trace from previous location ---
    at Pulumi.Automation.WorkspaceStack.UpAsync(UpOptions options, CancellationToken cancellationToken)
    at Pulumi.Automation.WorkspaceStack.UpAsync(UpOptions options, CancellationToken cancellationToken)
    at AllegroPay.IaC.Azure.IntegrationTests.EphemeralInfrastructureFixture.InitializeStackAsync[TStack](String module, String projectName, Action`1 configure, Action`1 action) in /_/tests/AllegroPay.IaC.Azure.IntegrationTests/EphemeralInfrastructureFixture.cs:line 86
    at AllegroPay.IaC.Azure.IntegrationTests.EphemeralInfrastructureFixture.InitializeAsync() in /_/tests/AllegroPay.IaC.Azure.IntegrationTests/EphemeralInfrastructureFixture.cs:line 29
 ----- Inner Stack Trace -----
    at Grpc.Net.Client.Balancer.Internal.ConnectionManager.PickAsync(PickContext context, Boolean waitForReady, CancellationToken cancellationToken)
    at Grpc.Net.Client.Balancer.Internal.BalancerHttpHandler.SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
    at Grpc.Net.Client.Internal.GrpcCall`2.RunCall(HttpRequestMessage request, Nullable`1 timeout)
    at Pulumi.GrpcEngine.LogAsync(LogRequest request)
    at Pulumi.Deployment.EngineLogger.LogAsync(LogSeverity severity, String message, Resource resource, Nullable`1 streamId, Nullable`1 ephemeral)
 ----- Inner Stack Trace -----
    at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.ThrowException(SocketError error, CancellationToken cancellationToken)
    at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.System.Threading.Tasks.Sources.IValueTaskSource.GetResult(Int16 token)
    at System.Net.Sockets.Socket.<ConnectAsync>g__WaitForConnectWithCancellation|285_0(AwaitableSocketAsyncEventArgs saea, ValueTask connectTask, CancellationToken cancellationToken)
    at Grpc.Net.Client.Balancer.Internal.SocketConnectivitySubchannelTransport.TryConnectAsync(ConnectContext context, Int32 attempt)

All tests fails at once. I suspect that the SDK is sending log requests to the engine which shuts down prematurely. How to print engine logs?

Example

The error is not persistent. It usually happens in CI pipeline. Our tests are set up as follows:

[Collection(nameof(EphemeralInfrastructureTestCollection))]
public class EphemeralInfrastructureFixtureTests : StackTestFixture
{
    private readonly EphemeralInfrastructureFixture _fixture;

    public EphemeralInfrastructureFixtureTests(EphemeralInfrastructureFixture fixture, ITestOutputHelper output) : base(output)
    {
        _fixture = fixture ?? throw new ArgumentNullException(nameof(fixture));

        _fixture.StackStatus.ForEach(
            stack =>
            {
                output.WriteLine($"{stack.Key} : {Enum.GetName(typeof(UpdateState), stack.Value)}");
            });

        _fixture.StandardOutput.ForEach(
            stack =>
            {
                output.WriteLine($"STDOUT {stack.Key} : {stack.Value}");
            });

        _fixture.StandardError.ForEach(
            stack =>
            {
                output.WriteLine($"STDERR {stack.Key} : {stack.Value}");
            });
    }

    [Fact]
    public void GivenEphemeralNetworkFixture_WhenInitialized_ThenFixtureNameIsNotNull()
    {
        // Assert
        _fixture.Name.Should().NotBeNullOrEmpty();
        Output.WriteLine($"Fixture: {_fixture.Name} created");
    }
}

and we set up the fixture as:

    private async Task InitializeStackAsync<TStack>(
        string module,
        string? projectName = null,
        Action<IConfigurationBuilder>? configure = null,
        Action<IImmutableDictionary<string, OutputValue>>? action = null)
        where TStack : Stack
    {
        var stack = await WorkspaceStackFactory.Create<TStack>(
            module: module,
            stackName: Name,
            projectName: projectName,
            configure: configure);
        var result = await stack.UpAsync();
    }

where WorkspaceStackFactory.Create is:

        <...>
        var program = PulumiFn.Create<TStack>(
            ServiceProviderFactory.Build(
                cb =>
                {
                    configure?.Invoke(cb);
                }));
        var inlineProgramArgs = new InlineProgramArgs(
            projectName: pn,
            stackName: stackName,
            program: program)
        { EnvironmentVariables = Constants.Environment.Concat(envVars ?? new Dictionary<string, string?>()).ToDictionary() };

        var stack = await LocalWorkspace.CreateStackAsync(inlineProgramArgs);
        <...>

Output of pulumi about

CLI          
Version      3.112.0
Go Version   go1.22.1
Go Compiler  gc

Additional context

No response

Contributing

Vote on this issue by adding a 👍 reaction. To contribute a fix for this issue, leave a comment (and link to your pull request, if you've opened one already).

justinvp commented 1 week ago

Your pulumi about output shows 3.112.0 of the CLI is being used. Is this version being used in your CI pipeline? It's quite old (latest version is 3.137.0). Not sure if it'll matter in this case, but do you still hit this when using the latest version of the CLI?