buchgr / bazel-remote

A remote cache for Bazel
https://bazel.build
Apache License 2.0
595 stars 154 forks source link

Intermittent 403 errors reported by bazel client #584

Closed chris-codaio closed 2 years ago

chris-codaio commented 2 years ago

We're seeing intermittent 403 errors reported by the bazel client; works 99% of the time. We have bazel-remote setup to use htpasswd auth in a kubernetes deployment (single pod).

Flags are:

      --dir=/cache
      --max_size=100
      --s3.key_version=2
      --s3.endpoint=s3.us-west-2.amazonaws.com
      --s3.bucket=<redacted>
      --s3.prefix=bazel-cache
      --s3.auth_method=iam_role
      --htpasswd_file=<redacted>
      --access_log_level=none
      --enable_endpoint_metrics
INFO: Analysis succeeded for only 57 of 58 top-level targets
INFO: Analyzed 58 targets (0 packages loaded, 0 targets configured).
INFO: Found 29 targets and 28 test targets...
WARNING: Remote Cache: 403 Forbidden
<html>
<head><title>403 Forbidden</title></head>
<body>
<center><h1>403 Forbidden</h1></center>
</body>
</html>

com.google.devtools.build.lib.remote.common.BulkTransferException: 403 Forbidden
<html>
<head><title>403 Forbidden</title></head>
<body>
<center><h1>403 Forbidden</h1></center>
</body>
</html>

    at com.google.devtools.build.lib.remote.util.RxUtils$BulkTransferExceptionCollector.onResult(RxUtils.java:91)
    at io.reactivex.rxjava3.internal.operators.flowable.FlowableCollectSingle$CollectSubscriber.onNext(FlowableCollectSingle.java:94)
    at io.reactivex.rxjava3.internal.operators.flowable.FlowableFlatMapSingle$FlatMapSingleSubscriber.innerSuccess(FlowableFlatMapSingle.java:173)
    at io.reactivex.rxjava3.internal.operators.flowable.FlowableFlatMapSingle$FlatMapSingleSubscriber$InnerObserver.onSuccess(FlowableFlatMapSingle.java:342)
    at io.reactivex.rxjava3.internal.operators.single.SingleDoFinally$DoFinallyObserver.onSuccess(SingleDoFinally.java:73)
    at io.reactivex.rxjava3.internal.observers.ResumeSingleObserver.onSuccess(ResumeSingleObserver.java:46)
    at io.reactivex.rxjava3.internal.operators.single.SingleJust.subscribeActual(SingleJust.java:30)
    at io.reactivex.rxjava3.core.Single.subscribe(Single.java:4855)
    at io.reactivex.rxjava3.internal.operators.single.SingleResumeNext$ResumeMainSingleObserver.onError(SingleResumeNext.java:80)
    at io.reactivex.rxjava3.internal.operators.completable.CompletableToSingle$ToSingle.onError(CompletableToSingle.java:73)
    at io.reactivex.rxjava3.internal.operators.completable.CompletableCreate$Emitter.tryOnError(CompletableCreate.java:91)
    at io.reactivex.rxjava3.internal.operators.completable.CompletableCreate$Emitter.onError(CompletableCreate.java:77)
    at com.google.devtools.build.lib.remote.util.RxFutures$OnceCompletableOnSubscribe$1.onFailure(RxFutures.java:102)
    at com.google.common.util.concurrent.Futures$CallbackListener.run(Futures.java:1074)
    at com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:30)
    at com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:1213)
    at com.google.common.util.concurrent.AbstractFuture.complete(AbstractFuture.java:983)
    at com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:771)
    at com.google.devtools.build.lib.remote.util.RxFutures$CompletableFuture.setException(RxFutures.java:276)
    at com.google.devtools.build.lib.remote.util.RxFutures$1.onError(RxFutures.java:210)
    at io.reactivex.rxjava3.internal.operators.completable.CompletableFromSingle$CompletableFromSingleObserver.onError(CompletableFromSingle.java:41)
    at io.reactivex.rxjava3.internal.operators.single.SingleCreate$Emitter.tryOnError(SingleCreate.java:95)
    at io.reactivex.rxjava3.internal.operators.single.SingleCreate$Emitter.onError(SingleCreate.java:81)
    at com.google.devtools.build.lib.remote.util.AsyncTaskCache$1.onError(AsyncTaskCache.java:330)
    at com.google.devtools.build.lib.remote.util.AsyncTaskCache$Execution.onError(AsyncTaskCache.java:198)
    at io.reactivex.rxjava3.internal.operators.completable.CompletableToSingle$ToSingle.onError(CompletableToSingle.java:73)
    at io.reactivex.rxjava3.internal.operators.completable.CompletableCreate$Emitter.tryOnError(CompletableCreate.java:91)
    at io.reactivex.rxjava3.internal.operators.completable.CompletableCreate$Emitter.onError(CompletableCreate.java:77)
    at com.google.devtools.build.lib.remote.util.RxFutures$OnceCompletableOnSubscribe$1.onFailure(RxFutures.java:102)
    at com.google.common.util.concurrent.Futures$CallbackListener.run(Futures.java:1066)
    at com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:30)
    at com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:1213)
    at com.google.common.util.concurrent.AbstractFuture.complete(AbstractFuture.java:983)
    at com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:771)
    at com.google.common.util.concurrent.SettableFuture.setException(SettableFuture.java:53)
    at com.google.devtools.build.lib.remote.http.HttpCacheClient.lambda$uploadAsync$10(HttpCacheClient.java:631)
    at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:578)
    at io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:552)
    at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:491)
    at io.netty.util.concurrent.DefaultPromise.setValue0(DefaultPromise.java:616)
    at io.netty.util.concurrent.DefaultPromise.setFailure0(DefaultPromise.java:609)
    at io.netty.util.concurrent.DefaultPromise.setFailure(DefaultPromise.java:109)
    at io.netty.channel.DefaultChannelPromise.setFailure(DefaultChannelPromise.java:89)
    at com.google.devtools.build.lib.remote.http.HttpUploadHandler.channelRead0(HttpUploadHandler.java:81)
    at com.google.devtools.build.lib.remote.http.HttpUploadHandler.channelRead0(HttpUploadHandler.java:40)
    at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:99)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
    at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
    at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:324)
    at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:296)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
    at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1372)
    at io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1235)
    at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1284)
    at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:507)
    at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:446)
    at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:276)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
    at io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:286)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
    at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
    at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
    at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166)
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:719)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:655)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:581)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493)
    at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986)
    at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
    at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
    at java.base/java.lang.Thread.run(Unknown Source)
    Suppressed: com.google.devtools.build.lib.remote.http.HttpException: 403 Forbidden
<html>
<head><title>403 Forbidden</title></head>
<body>
<center><h1>403 Forbidden</h1></center>
</body>
</html>

        ... 41 **more**
mostynb commented 2 years ago

Hi, which version of bazel-remote are you running? And is your bazel client talking to bazel-remote using http or grpc?

chris-codaio commented 2 years ago

Sorry for not including those above

Version: buchgr/bazel-remote-cache:v2.3.9 Using HTTP

mostynb commented 2 years ago

I assume there's a proxy in the picture here somewhere? I have a feeling that the problem may be with a proxy because I can't find any usage of the 403 error code (http.StatusForbidden in go code) in either bazel-remote's source nor in the http basic authentication library that bazel-remote uses (https://github.com/abbot/go-http-auth).

chris-codaio commented 2 years ago

Could be. I'll poke around in our ALB logs and see if there's anything meaningful in there. Will report back.

chris-codaio commented 2 years ago

Looks like an overzealous WAF rule in our AWS configs was indeed the culprit. Kind of unfortunate that the bazel client doesn't tell you which file it's trying to write when the error happens - that would have been really useful to track this down.

mostynb commented 2 years ago

Thanks for following up on this. Perhaps you could report this logging change as a feature request over in https://github.com/bazelbuild/bazel ?

chris-codaio commented 1 year ago

Good call. I've filed a feature request for this here: https://github.com/bazelbuild/bazel/issues/16454