gaul / s3proxy

Access other storage backends via the S3 API
Apache License 2.0
1.79k stars 233 forks source link

Exceptions thrown from s3proxy when performing Azure blob operation #709

Open nsimha73 opened 3 weeks ago

nsimha73 commented 3 weeks ago

We are using s3proxy from click house where click house database uses its own s3 driver to do data tiering using s3 API talking to s3proxy and s3proxy is configured to use Azure blob storage, we are not able to get any more details from s3proxy but azure storage is logging that delete operation on blob are failing.

S3proxy version: 2.3.0

**** - the below command represents dights/characters

we are running clickhouse and s3proxy in Kubernetes using AKS

[s3proxy] W 10-29 19:43:19.894 S3Proxy-Jetty-27 o.j.a.s.h.AzureStorageClientErrorRetryHandler:74 |::] Cannot retry after server error, command is not replayable: [method=org.jclouds.azureblob.AzureBlobClient.public abstract void org.jclouds.azureblob.AzureBlobClient.putBlock(java.lang.String,java.lang.String,java.lang.String,org.jclouds.io.Payload)[-clickhouse-cold-ch-NNNN-x, --NNNN--*--x/clickhouse-data-0/disks/disk_cold/hsu/rpwczpurapkxcppvybsezuebwkpwb, AAK_IA==, [content=true, contentMetadata=[cacheControl=null, contentDisposition=null, contentEncoding=null, contentLanguage=null, contentLength=17825792, contentMD5=null, contentType=application/unknown, expires=null], written=false, isSensitive=false]], request=PUT https://********************.blob.core.windows.net/**-clickhouse-cold-ch-NNNN-x/**-****-NNNN-****-**-*-x/clickhouse-data-0/disks/disk_cold/hsu/rpwczpurapkxcppvybsezuebwkpwb?comp=block&blockid=AAK_IA%3D%3D HTTP/1.1]

gaul commented 3 weeks ago

Can you run S3Proxy with the -DLOG_LEVEL=trace flag? Information before this log line should clarify the issue. Also you are using the older azureblob provider. You might consider testing the azureblob-sdk provider which is in master (although multipart uploads are currently not implemented).

nsimha73 commented 3 weeks ago

Thank you @gaul for the response, we can't switch to azure-sdk provider as in this case clickhouse uses multipart uploads.

One more question "Do s3proxy keep's any state when it is performing file upload or for that matter any operations it is performing any where ?

after enabling the trace we are seeing the below issues

s3proxy-6bf8b86c4c-ldkbl s3proxy [s3proxy] D 10-31 15:13:04.532 S3Proxy-Jetty-193 o.j.http.internal.HttpWire:56 |::] over limit 535475/262144: wrote temp file                                                                      [110/1967]
s3proxy-6bf8b86c4c-ldkbl s3proxy [s3proxy] D 10-31 15:13:05.035 S3Proxy-Jetty-146 o.g.s.S3ProxyHandlerJetty:91 |::] HttpResponseException without HttpResponse:                                                                               
s3proxy-6bf8b86c4c-ldkbl s3proxy        at com.sun.proxy.$Proxy72.putBlob(Unknown Source)                                                                                                                                                     
s3proxy-6bf8b86c4c-ldkbl s3proxy        at jdk.internal.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)                                                                                                                               
s3proxy-6bf8b86c4c-ldkbl s3proxy        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)                                                                                           
s3proxy-6bf8b86c4c-ldkbl s3proxy        at java.base/java.lang.reflect.Method.invoke(Method.java:566)                                                                                                                                         
s3proxy-6bf8b86c4c-ldkbl s3proxy        at com.google.inject.internal.DelegatingInvocationHandler.invoke(DelegatingInvocationHandler.java:50)                                                                                                 
s3proxy-6bf8b86c4c-ldkbl s3proxy        at com.sun.proxy.$Proxy73.putBlob(Unknown Source)                                                                                                                                                     
s3proxy-6bf8b86c4c-ldkbl s3proxy        at org.gaul.s3proxy.S3ProxyHandler.handlePutBlob(S3ProxyHandler.java:2020)                                                                                                                            
s3proxy-6bf8b86c4c-ldkbl s3proxy        at org.gaul.s3proxy.S3ProxyHandler.doHandle(S3ProxyHandler.java:762)                                                                                                                                  
s3proxy-6bf8b86c4c-ldkbl s3proxy        at org.gaul.s3proxy.S3ProxyHandlerJetty.handle(S3ProxyHandlerJetty.java:80)                                                                                                                           
s3proxy-6bf8b86c4c-ldkbl s3proxy        at org.gaul.shaded.org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:122)                                                                                                    
s3proxy-6bf8b86c4c-ldkbl s3proxy        at org.gaul.shaded.org.eclipse.jetty.server.Server.handle(Server.java:563)                                                                                                                            
s3proxy-6bf8b86c4c-ldkbl s3proxy        at org.gaul.shaded.org.eclipse.jetty.server.HttpChannel$RequestDispatchable.dispatch(HttpChannel.java:1598)                                                                                           
s3proxy-6bf8b86c4c-ldkbl s3proxy        at org.gaul.shaded.org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:753)                                                                                                                
s3proxy-6bf8b86c4c-ldkbl s3proxy        at org.gaul.shaded.org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:501)                                                                                                                  
s3proxy-6bf8b86c4c-ldkbl s3proxy        at org.gaul.shaded.org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:287)                                                                                                        
s3proxy-6bf8b86c4c-ldkbl s3proxy        at org.gaul.shaded.org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:314)                                                                                        
s3proxy-6bf8b86c4c-ldkbl s3proxy        at org.gaul.shaded.org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:100)                                                                                                                  
s3proxy-6bf8b86c4c-ldkbl s3proxy        at org.gaul.shaded.org.eclipse.jetty.io.SelectableChannelEndPoint$1.run(SelectableChannelEndPoint.java:53)                                                                                            
s3proxy-6bf8b86c4c-ldkbl s3proxy        at org.gaul.shaded.org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:969)                                                                                                   
s3proxy-6bf8b86c4c-ldkbl s3proxy        at org.gaul.shaded.org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.doRunJob(QueuedThreadPool.java:1194)                                                                                         
s3proxy-6bf8b86c4c-ldkbl s3proxy        at org.gaul.shaded.org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1149)                                                                                              
s3proxy-6bf8b86c4c-ldkbl s3proxy        at java.base/java.lang.Thread.run(Thread.java:829)                                                                                                                                                    
s3proxy-6bf8b86c4c-ldkbl s3proxy Caused by: java.lang.RuntimeException: Error tapping line                                                                                                                                                    
s3proxy-6bf8b86c4c-ldkbl s3proxy        ... 28 common frames omitted                                                                                                                                                                          
s3proxy-6bf8b86c4c-ldkbl s3proxy Caused by: org.gaul.shaded.org.eclipse.jetty.io.EofException: Early EOF                                                                                                                                      
s3proxy-6bf8b86c4c-ldkbl s3proxy        at org.gaul.shaded.org.eclipse.jetty.server.HttpChannelOverHttp.markEarlyEOF(HttpChannelOverHttp.java:287)                                                                                            
s3proxy-6bf8b86c4c-ldkbl s3proxy        at org.gaul.shaded.org.eclipse.jetty.server.HttpChannelOverHttp.earlyEOF(HttpChannelOverHttp.java:266)                                                                                                
s3proxy-6bf8b86c4c-ldkbl s3proxy        at org.gaul.shaded.org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:1634)                                                                                                                  
s3proxy-6bf8b86c4c-ldkbl s3proxy        at org.gaul.shaded.org.eclipse.jetty.server.HttpConnection.parseRequestBuffer(HttpConnection.java:414)                                                                                                
s3proxy-6bf8b86c4c-ldkbl s3proxy        at org.gaul.shaded.org.eclipse.jetty.server.HttpConnection.parseAndFillForContent(HttpConnection.java:348)                                                                                            
s3proxy-6bf8b86c4c-ldkbl s3proxy        at org.gaul.shaded.org.eclipse.jetty.server.HttpChannelOverHttp.parseAndFillForContent(HttpChannelOverHttp.java:129)                                                                                  
s3proxy-6bf8b86c4c-ldkbl s3proxy        at org.gaul.shaded.org.eclipse.jetty.server.HttpChannelOverHttp.produceContent(HttpChannelOverHttp.java:115)                                                                                          
s3proxy-6bf8b86c4c-ldkbl s3proxy        at org.gaul.shaded.org.eclipse.jetty.server.AsyncContentProducer.produceRawContent(AsyncContentProducer.java:470)                                                                                     
s3proxy-6bf8b86c4c-ldkbl s3proxy        at org.gaul.shaded.org.eclipse.jetty.server.AsyncContentProducer.nextTransformedContent(AsyncContentProducer.java:357)                                                                                
s3proxy-6bf8b86c4c-ldkbl s3proxy        at org.gaul.shaded.org.eclipse.jetty.server.AsyncContentProducer.isReady(AsyncContentProducer.java:277)                                                                                               
s3proxy-6bf8b86c4c-ldkbl s3proxy        at org.gaul.shaded.org.eclipse.jetty.server.BlockingContentProducer.nextContent(BlockingContentProducer.java:111)                                                                                     
s3proxy-6bf8b86c4c-ldkbl s3proxy        at org.gaul.shaded.org.eclipse.jetty.server.HttpInput.read(HttpInput.java:287)                                                                                                                        
s3proxy-6bf8b86c4c-ldkbl s3proxy        at org.gaul.shaded.org.eclipse.jetty.server.HttpInput.read(HttpInput.java:272)                                                                                                                        
s3proxy-6bf8b86c4c-ldkbl s3proxy        at java.base/java.io.InputStream.read(InputStream.java:205)   
s3proxy-6bf8b86c4c-ldkbl s3proxy [s3proxy] T 10-31 15:52:17.530 S3Proxy-Jetty-122 o.j.r.i.InvokeHttpMethod:47 |::] << converted DeleteBlob to DELETE https://XXXXXXXXXXXX.blob.core.w/
indows.net/ci-clickhouse-cold-ch-lake-0002-XXDD-XX-D-x-qs01-us-3/ch-lake-0002-XXDD-XX-D-x-qs01-us-3/clickhouse-data-2/disks/disk_cold/xcb/ovaoeafxvmgekkavffffpcpngnrdb HTTP/1.1              
s3proxy-6bf8b86c4c-ldkbl s3proxy [s3proxy] D 10-31 15:52:17.632 S3Proxy-Jetty-194 o.g.s.S3ProxyHandlerJetty:91 |::] HttpResponseException without HttpResponse:                               
s3proxy-6bf8b86c4c-ldkbl s3proxy        at com.sun.proxy.$Proxy72.putBlob(Unknown Source)                                                                                                     
s3proxy-6bf8b86c4c-ldkbl s3proxy        at jdk.internal.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)                                                                               
s3proxy-6bf8b86c4c-ldkbl s3proxy        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)                                           
s3proxy-6bf8b86c4c-ldkbl s3proxy        at java.base/java.lang.reflect.Method.invoke(Method.java:566)                                                                                         
s3proxy-6bf8b86c4c-ldkbl s3proxy        at com.google.inject.internal.DelegatingInvocationHandler.invoke(DelegatingInvocationHandler.java:50)                                                 
s3proxy-6bf8b86c4c-ldkbl s3proxy        at com.sun.proxy.$Proxy73.putBlob(Unknown Source)                                                                                                     
s3proxy-6bf8b86c4c-ldkbl s3proxy        at org.gaul.s3proxy.S3ProxyHandler.handlePutBlob(S3ProxyHandler.java:2020)                                                                            
s3proxy-6bf8b86c4c-ldkbl s3proxy        at org.gaul.s3proxy.S3ProxyHandler.doHandle(S3ProxyHandler.java:762)                                                                                  
s3proxy-6bf8b86c4c-ldkbl s3proxy        at org.gaul.s3proxy.S3ProxyHandlerJetty.handle(S3ProxyHandlerJetty.java:80)                                                                           
s3proxy-6bf8b86c4c-ldkbl s3proxy        at org.gaul.shaded.org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:122)                                                    
s3proxy-6bf8b86c4c-ldkbl s3proxy        at org.gaul.shaded.org.eclipse.jetty.server.Server.handle(Server.java:563)                                                                            
s3proxy-6bf8b86c4c-ldkbl s3proxy        at org.gaul.shaded.org.eclipse.jetty.server.HttpChannel$RequestDispatchable.dispatch(HttpChannel.java:1598)
s3proxy-6bf8b86c4c-ldkbl s3proxy        at org.gaul.shaded.org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:753)
s3proxy-6bf8b86c4c-ldkbl s3proxy        at org.gaul.shaded.org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:501)
s3proxy-6bf8b86c4c-ldkbl s3proxy        at org.gaul.shaded.org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:287)
s3proxy-6bf8b86c4c-ldkbl s3proxy        at org.gaul.shaded.org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:314)
s3proxy-6bf8b86c4c-ldkbl s3proxy        at org.gaul.shaded.org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:100)
s3proxy-6bf8b86c4c-ldkbl s3proxy        at org.gaul.shaded.org.eclipse.jetty.io.SelectableChannelEndPoint$1.run(SelectableChannelEndPoint.java:53)
s3proxy-6bf8b86c4c-ldkbl s3proxy        at org.gaul.shaded.org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:969)
s3proxy-6bf8b86c4c-ldkbl s3proxy        at org.gaul.shaded.org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.doRunJob(QueuedThreadPool.java:1194)
s3proxy-6bf8b86c4c-ldkbl s3proxy        at org.gaul.shaded.org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1149)
s3proxy-6bf8b86c4c-ldkbl s3proxy        at java.base/java.lang.Thread.run(Thread.java:829)                                                                                                    
s3proxy-6bf8b86c4c-ldkbl s3proxy Caused by: java.lang.RuntimeException: Error tapping line                                                                                                    
s3proxy-6bf8b86c4c-ldkbl s3proxy        ... 28 common frames omitted                                                                                                                          
s3proxy-6bf8b86c4c-ldkbl s3proxy Caused by: org.gaul.shaded.org.eclipse.jetty.io.EofException: Early EOF                                                                                      
s3proxy-6bf8b86c4c-ldkbl s3proxy        at org.gaul.shaded.org.eclipse.jetty.server.HttpChannelOverHttp.markEarlyEOF(HttpChannelOverHttp.java:287)
s3proxy-6bf8b86c4c-ldkbl s3proxy        at org.gaul.shaded.org.eclipse.jetty.server.HttpChannelOverHttp.earlyEOF(HttpChannelOverHttp.java:266)
s3proxy-6bf8b86c4c-ldkbl s3proxy        at org.gaul.shaded.org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:1634)
s3proxy-84dfc7d954-5qh24 s3proxy [s3proxy] D 10-31 17:45:15.535 S3Proxy-Jetty-177 o.j.h.i.JavaUrlHttpCommandExecutorService:56 |::] Caught a protocol exception on a 100-continue PUT request. Attempting to retry.                           
s3proxy-84dfc7d954-5qh24 s3proxy [s3proxy] W 10-31 17:45:15.535 S3Proxy-Jetty-177 o.j.a.s.h.AzureStorageClientErrorRetryHandler:74 |::] Cannot retry after server error, command is not replayable: [method=org.jclouds.azureblob.AzureBlobCli
ent.public abstract java.lang.String org.jclouds.azureblob.AzureBlobClient.putBlob(java.lang.String,org.jclouds.azureblob.domain.AzureBlob)[ci-clickhouse-cold-ch-lake-DDDD-XXDD-us-3-x-XXDD-us-D, [properties=[name=ch-lake-0002-qc01-us-3-x-
qs01-us-3/clickhouse-data-0/disks/disk_cold/pub/aqsnjebvuhseyviccqlzfedohgpjq, container=null, url=null, contentMetadata=[cacheControl=null, contentDisposition=null, contentEncoding=null, contentLanguage=null, contentLength=1373389, conte
ntMD5=774965f8c19576530d4d65c04bc0373e, contentType=binary/octet-stream, expires=null], eTag=null, lastModified=null, leaseStatus=Unlocked, metadata={}, type=BlockBlob]]], request=PUT https://XXXXXXXXXXXXXXXX.blob.core.windows.net/ci-clickhouse-cold-ch-lake-DDDD-XXDD-us-3-x-XXDD-us-D/ci-clickhouse-cold-ch-lake-DDDD-XXDD-us-3-x-XXDD-us-D/clickhouse-data-0/disks/disk_cold/pub/aqsnjebvuhseyviccqlzfedohgpjq HTTP/1.1]
s3proxy-84dfc7d954-5qh24 s3proxy [s3proxy] D 10-31 17:45:15.535 S3Proxy-Jetty-177 o.g.s.S3ProxyHandlerJetty:91 |::] HttpResponseException without HttpResponse:
s3proxy-84dfc7d954-5qh24 s3proxy org.jclouds.http.HttpResponseException: Server rejected operation connecting to PUT https://XXXXXXXXXXXXXXXX.blob.core.windows.net/ci-clickhouse-cold-ch-lake-DDDD-XXDD-us-3-x-XXDD-us-D/cci-clickhouse-cold-ch-lake-DDDD-XXDD-us-3-x-XXDD-us-D/clickhouse-data-0/disks/disk_cold/pub/aqsnjebvuhseyviccqlzfedohgpjq HTTP/1.1
s3proxy-84dfc7d954-5qh24 s3proxy        at org.jclouds.http.internal.BaseHttpCommandExecutorService.invoke(BaseHttpCommandExecutorService.java:120)
s3proxy-84dfc7d954-5qh24 s3proxy        at org.jclouds.rest.internal.InvokeHttpMethod.invoke(InvokeHttpMethod.java:91)
s3proxy-84dfc7d954-5qh24 s3proxy        at org.jclouds.rest.internal.InvokeHttpMethod.apply(InvokeHttpMethod.java:74)
s3proxy-84dfc7d954-5qh24 s3proxy        at org.jclouds.rest.internal.InvokeHttpMethod.apply(InvokeHttpMethod.java:45)
s3proxy-84dfc7d954-5qh24 s3proxy        at org.jclouds.rest.internal.DelegatesToInvocationFunction.handle(DelegatesToInvocationFunction.java:156)
s3proxy-84dfc7d954-5qh24 s3proxy        at org.jclouds.rest.internal.DelegatesToInvocationFunction.invoke(DelegatesToInvocationFunction.java:123)
s3proxy-84dfc7d954-5qh24 s3proxy        at com.sun.proxy.$Proxy72.putBlob(Unknown Source)
s3proxy-84dfc7d954-5qh24 s3proxy        at org.jclouds.azureblob.blobstore.AzureBlobStore.putBlob(AzureBlobStore.java:240)
gaul commented 3 weeks ago

It looks like the second set of messages is an error in logging itself which is spooling to a file since it is so large. Generally S3Proxy does not use local storage and instead just streams data to and from the server and client. Is it possible that you are using very large user metadata fields that Azure is rejecting? Or some field names that don't comply with the restricted set of characters allowed? Unfortunately there is not much for me to go on here. You could try -DLOG_LEVEL=debug which might avoid the over long logging while still giving some insight into the original problem.

The new azureblob-sdk is definitely the way forward since I won't support azureblob much longer. You might try configuring your client to avoid multipart upload for now to experiment with it but I plan to address this shortcoming in the near term.