EventStore / replicator

Real-time replication tool
https://replicator.eventstore.org
Apache License 2.0

[DEV-120] Replicator can get stuck due to network issues (despite runContinuously) #81

Closed Lougarou closed 1 year ago

Lougarou commented 1 year ago

Describe the bug Sometimes the replicator does not restart after network issues. For example:

To Reproduce Steps to reproduce the behavior:

  1. Set up two nodes or clusters and the replicator
  2. Introduce network issues. On Windows you can use Clumsy. Introduce Drop, Lag, Out of Order Packets and Tampering (corrupt checksum), each within a certain percentage.
  3. Let the replicator run for a while
  4. Stop Clumsy
  5. The replicator should resume, but it does not
  6. If the problem does not reproduce, try again. One of the captured errors that makes the replicator fail is "The decryption operation failed, see inner exception"

Expected behavior Replicator should restart

Actual behavior Replicator just stops replicating
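
To make the expected behavior concrete, here is a minimal sketch of a reader loop that restarts from its last checkpoint after a recoverable error. This is not the replicator's code. `run_continuously`, `TransientError`, `read_from` and `write` are hypothetical names used only for illustration, and positions are plain integers to keep the example short.

```python
import time
from typing import Callable, Iterator, Optional


class TransientError(Exception):
    """Hypothetical stand-in for recoverable failures such as
    DeadlineExceeded or an aborted TLS read."""


def run_continuously(
    read_from: Callable[[int], Iterator[int]],
    write: Callable[[int], None],
    start: int = 0,
    max_restarts: Optional[int] = None,
    delay_seconds: float = 0.0,
) -> int:
    """Copy positions from the source to the sink.

    After a TransientError the reader is restarted from the last
    checkpoint instead of stopping. Returns the next position to read.
    """
    checkpoint = start
    restarts = 0
    while True:
        try:
            for position in read_from(checkpoint):
                write(position)
                # Advance only after the write succeeded.
                checkpoint = position + 1
            return checkpoint  # source exhausted without error
        except TransientError:
            restarts += 1
            if max_restarts is not None and restarts > max_restarts:
                raise  # give up loudly rather than stall silently
            time.sleep(delay_seconds)
```

With `max_restarts=None` the loop keeps retrying indefinitely, which is roughly what I would expect from a setting named runContinuously. Whether the replicator is meant to work this way internally is an assumption on my side.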

Config/Logs/Screenshots repl-replicator | {"@t":"2023-09-06T12:44:32.2151840Z","@m":"Reader stopped","@i":"49a9146f","SourceContext":"EventStore.Replicator.Read.ReaderPipe"} repl-replicator | {"@t":"2023-09-06T12:44:32.2153694Z","@m":"Error occured in the \"ReaderContext\" pipe: \"Status(StatusCode=\\"DeadlineExceeded\\", Detail=\\"\\")\"","@i":"c9acd0a5","@l":"Error","@x":"Grpc.Core.RpcException: Status(StatusCode=\"DeadlineExceeded\", Detail=\"\")\n at EventStore.Client.Interceptors.TypedExceptionInterceptor.AsyncStreamReader1.MoveNext(CancellationToken cancellationToken)\n at Grpc.Core.AsyncStreamReaderExtensions.ReadAllAsyncCore[T](IAsyncStreamReader1 streamReader, CancellationToken cancellationToken)+MoveNext()\n at System.Linq.AsyncEnumerable.SelectEnumerableAsyncIterator2.MoveNextCore() in /_/Ix.NET/Source/System.Linq.Async/System/Linq/Operators/Select.cs:line 210\n at System.Linq.AsyncIteratorBase1.MoveNextAsync() in //Ix.NET/Source/System.Linq.Async/System/Linq/AsyncIterator.cs:line 77\n at System.Linq.AsyncIteratorBase`1.MoveNextAsync() in //Ix.NET/Source/System.Linq.Async/System/Linq/AsyncIterator.cs:line 77\n at EventStore.Client.EventStoreClient.ReadInternal(ReadReq request, EventStoreClientOperationOptions operationOptions, UserCredentials userCredentials, CancellationToken cancellationToken)+MoveNext()\n at EventStore.Client.EventStoreClient.ReadInternal(ReadReq request, EventStoreClientOperationOptions operationOptions, UserCredentials userCredentials, CancellationToken cancellationToken)+MoveNext()\n at EventStore.Client.EventStoreClient.ReadAllAsync(Direction direction, Position position, Int64 maxCount, EventStoreClientOperationOptions operationOptions, Boolean resolveLinkTos, UserCredentials userCredentials, CancellationToken cancellationToken)+MoveNext()\n at EventStore.Client.EventStoreClient.ReadAllAsync(Direction direction, Position position, Int64 maxCount, EventStoreClientOperationOptions operationOptions, Boolean resolveLinkTos, 
UserCredentials userCredentials, CancellationToken cancellationToken)+MoveNext()\n at Ubiquitous.Metrics.Metrics.Measure[T](Func1 action, IHistogramMetric metric, ICountMetric errorCount, String[] labels, Int32 count)\n at EventStore.Replicator.Esdb.Grpc.GrpcEventReader.ReadEvents(Position fromPosition, Func2 next, CancellationToken cancellationToken) in /app/src/EventStore.Replicator.Esdb.Grpc/GrpcEventReader.cs:line 111\n at EventStore.Replicator.Read.ReaderPipe.<>cDisplayClass1_0.<<-ctor>gReader|1>d.MoveNext() in /app/src/EventStore.Replicator/Read/ReaderPipe.cs:line 61\n--- End of stack trace from previous location ---\n at GreenPipes.Filters.AsyncDelegateFilter`1.<>cDisplayClass3_0.<gSendAsync|0>d.MoveNext()\n--- End of stack trace from previous location ---\n at EventStore.Replicator.LoggingFilter1.Send(T context, IPipe1 next) in /app/src/EventStore.Replicator/Logging.cs:line 19","Type":"ReaderContext","Message":"Status(StatusCode=\"DeadlineExceeded\", Detail=\"\")","SourceContext":"EventStore.Replicator.LoggingFilter1[T]"} repl-replicator | {"@t":"2023-09-06T12:44:32.2164463Z","@m":"Error: \"The request was aborted.\", will fail","@i":"02639a3a","@l":"Error","@x":"System.IO.IOException: The request was aborted.\n ---> System.IO.IOException: The decryption operation failed, see inner exception.\n ---> Interop+OpenSsl+SslException: Decrypt failed with OpenSSL error - SSL_ERROR_SSL.\n ---> Interop+Crypto+OpenSslCryptographicException: error:1408F119:SSL routines:ssl3_get_record:decryption failed or bad record mac\n --- End of inner exception stack trace ---\n at Interop.OpenSsl.Decrypt(SafeSslHandle context, Byte[] outBuffer, Int32 offset, Int32 count, SslErrorCode& errorCode)\n at System.Net.Security.SslStreamPal.EncryptDecryptHelper(SafeDeleteContext securityContext, ReadOnlyMemory1 input, Int32 offset, Int32 size, Boolean encrypt, Byte[]& output, Int32& resultSize)\n --- End of inner exception stack trace ---\n at 
System.Net.Security.SslStream.ReadAsyncInternal[TIOAdapter](TIOAdapter adapter, Memory1 buffer)\n at System.Net.Http.Http2Connection.ProcessIncomingFramesAsync()\n --- End of inner exception stack trace ---\n at System.Net.Http.Http2Connection.ThrowRequestAborted(Exception innerException)\n at System.Net.Http.Http2Connection.Http2Stream.CheckResponseBodyState()\n at System.Net.Http.Http2Connection.Http2Stream.TryReadFromBuffer(Span1 buffer, Boolean partOfSyncRead)\n at System.Net.Http.Http2Connection.Http2Stream.ReadDataAsync(Memory1 buffer, HttpResponseMessage responseMessage, CancellationToken cancellationToken)\n at Grpc.Net.Client.StreamExtensions.ReadMessageAsync[TResponse](Stream responseStream, DefaultDeserializationContext deserializationContext, ILogger logger, Func2 deserializer, String grpcEncoding, Nullable1 maximumMessageSize, Dictionary2 compressionProviders, Boolean singleMessage, CancellationToken cancellationToken)\n at Grpc.Net.Client.Internal.HttpContentClientStreamReader2.MoveNextCore(CancellationToken cancellationToken)\n at EventStore.Client.Interceptors.TypedExceptionInterceptor.AsyncStreamReader1.MoveNext(CancellationToken cancellationToken)\n at Grpc.Core.AsyncStreamReaderExtensions.ReadAllAsyncCore[T](IAsyncStreamReader1 streamReader, CancellationToken cancellationToken)+MoveNext()\n at System.Linq.AsyncEnumerable.SelectEnumerableAsyncIterator2.MoveNextCore() in //Ix.NET/Source/System.Linq.Async/System/Linq/Operators/Select.cs:line 210\n at System.Linq.AsyncIteratorBase`1.MoveNextAsync() in //Ix.NET/Source/System.Linq.Async/System/Linq/AsyncIterator.cs:line 77\n at System.Linq.AsyncIteratorBase1.MoveNextAsync() in /_/Ix.NET/Source/System.Linq.Async/System/Linq/AsyncIterator.cs:line 77\n at EventStore.Client.EventStoreClient.ReadInternal(ReadReq request, EventStoreClientOperationOptions operationOptions, UserCredentials userCredentials, CancellationToken cancellationToken)+MoveNext()\n at 
EventStore.Client.EventStoreClient.ReadInternal(ReadReq request, EventStoreClientOperationOptions operationOptions, UserCredentials userCredentials, CancellationToken cancellationToken)+MoveNext()\n at EventStore.Client.EventStoreClient.ReadAllAsync(Direction direction, Position position, Int64 maxCount, EventStoreClientOperationOptions operationOptions, Boolean resolveLinkTos, UserCredentials userCredentials, CancellationToken cancellationToken)+MoveNext()\n at EventStore.Client.EventStoreClient.ReadAllAsync(Direction direction, Position position, Int64 maxCount, EventStoreClientOperationOptions operationOptions, Boolean resolveLinkTos, UserCredentials userCredentials, CancellationToken cancellationToken)+MoveNext()\n at EventStore.Client.EventStoreClient.ReadAllAsync(Direction direction, Position position, Int64 maxCount, EventStoreClientOperationOptions operationOptions, Boolean resolveLinkTos, UserCredentials userCredentials, CancellationToken cancellationToken)+System.Threading.Tasks.Sources.IValueTaskSource<System.Boolean>.GetResult()\n at Ubiquitous.Metrics.Metrics.Measure[T](Func1 action, IHistogramMetric metric, ICountMetric errorCount, String[] labels, Int32 count)\n at EventStore.Replicator.Esdb.Grpc.GrpcEventReader.ReadEvents(Position fromPosition, Func2 next, CancellationToken cancellationToken) in /app/src/EventStore.Replicator.Esdb.Grpc/GrpcEventReader.cs:line 111\n at EventStore.Replicator.Read.ReaderPipe.<>c__DisplayClass1_0.<<-ctor>g__Reader|1>d.MoveNext() in /app/src/EventStore.Replicator/Read/ReaderPipe.cs:line 45\n--- End of stack trace from previous location ---\n at GreenPipes.Filters.AsyncDelegateFilter1.<>c__DisplayClass3_0.<g__SendAsync|0>d.MoveNext()\n--- End of stack trace from previous location ---\n at EventStore.Replicator.LoggingFilter1.Send(T context, IPipe1 next) in /app/src/EventStore.Replicator/Logging.cs:line 19\n at GreenPipes.Filters.RetryFilter1.GreenPipes.IFilter<TContext>.Send(TContext context, IPipe1 next)","Error":"The 
request was aborted.","SourceContext":"EventStore.Replicator.Observers.LoggingRetryObserver"}
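
For anyone triaging a dump like the one above, a small script can pull out the timestamp, level, message and source of each entry. This is only a sketch under some assumptions: each entry sits on its own line (the paste above has them run together), the container prefix is separated from the JSON by " | ", and entries without an "@l" field are at Information level, as in Serilog's compact JSON format. `parse_compose_logs` and `LogEvent` are names made up for this example.

```python
import json
from typing import Iterable, Iterator, NamedTuple


class LogEvent(NamedTuple):
    timestamp: str
    level: str
    message: str
    source: str


def parse_compose_logs(lines: Iterable[str]) -> Iterator[LogEvent]:
    """Yield one LogEvent per 'container | {json}' line, skipping anything else."""
    for line in lines:
        # Split off the container-name prefix, e.g. "repl-replicator | ".
        _, separator, payload = line.partition(" | ")
        if not separator:
            continue
        try:
            event = json.loads(payload)
        except json.JSONDecodeError:
            continue
        if not isinstance(event, dict):
            continue
        yield LogEvent(
            timestamp=event.get("@t", ""),
            level=event.get("@l", "Information"),  # assumed default level
            message=event.get("@m", ""),
            source=event.get("SourceContext", ""),
        )


if __name__ == "__main__":
    sample = [
        'repl-replicator | {"@t":"2023-09-06T12:44:32.2151840Z","@m":"Reader stopped",'
        '"@i":"49a9146f","SourceContext":"EventStore.Replicator.Read.ReaderPipe"}',
    ]
    for entry in parse_compose_logs(sample):
        print(entry.timestamp, entry.level, entry.message, entry.source)
```

Filtering the result on `level == "Error"` gives the two failures shown in the log without the stack traces.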

EventStore details