gaul / s3proxy

Access other storage backends via the S3 API
Apache License 2.0

azure ListBlobs with file after not supported #400

Open hcheng2002cn opened 2 years ago

hcheng2002cn commented 2 years ago

When using S3Proxy to connect to Azure, our S3 source connector tries to list the available files after a certain file and gets the attached exception. After we switched to the MinIO Azure gateway, the same connector works fine. Would you please take a look? The S3 source connector we are using is https://github.com/lensesio/stream-reactor/tree/master/kafka-connect-aws-s3. Thanks for your help.

[s3proxy] D 01-16 08:18:51.079 S3Proxy-Jetty-16 o.j.r.i.InvokeHttpMethod:56 |::] >> invoking ListBlobs
[s3proxy] D 01-16 08:18:51.080 S3Proxy-Jetty-16 jclouds.signature:56 |::] >> GET https://cdcevents.blob.core.windows.net/events?restype=container&comp=list&prefix=restore&marker=restore/0/778.avro&maxresults=1000&include=metadata HTTP/1.1
[s3proxy] D 01-16 08:18:51.081 S3Proxy-Jetty-16 jclouds.signature:56 |::] >> x-ms-version: 2017-11-09
[s3proxy] D 01-16 08:18:51.082 S3Proxy-Jetty-16 jclouds.signature:56 |::] >> Date: Sun, 16 Jan 2022 08:18:51 GMT
[s3proxy] D 01-16 08:18:51.082 S3Proxy-Jetty-16 jclouds.signature:56 |::] >> "GET[\n]"
[s3proxy] D 01-16 08:18:51.083 S3Proxy-Jetty-16 jclouds.signature:56 |::] >> "[\n]"
[s3proxy] D 01-16 08:18:51.083 S3Proxy-Jetty-16 jclouds.signature:56 |::] >> "[\n]"
[s3proxy] D 01-16 08:18:51.084 S3Proxy-Jetty-16 jclouds.signature:56 |::] >> "Sun, 16 Jan 2022 08:18:51 GMT[\n]"
[s3proxy] D 01-16 08:18:51.086 S3Proxy-Jetty-16 jclouds.signature:56 |::] >> "x-ms-version:2017-11-09[\n]"
[s3proxy] D 01-16 08:18:51.087 S3Proxy-Jetty-16 jclouds.signature:56 |::] >> "/cdcevents/events?comp=list"
[s3proxy] D 01-16 08:18:51.089 S3Proxy-Jetty-16 jclouds.signature:56 |::] << "RvpzcK1Iz5Ed9dD5smRHaRQAz76zltRbIYNmkFdc="
[s3proxy] D 01-16 08:18:51.091 S3Proxy-Jetty-16 jclouds.signature:56 |::] << GET https://cdcevents.blob.core.windows.net/events?restype=container&comp=list&prefix=restore&marker=restore/0/778.avro&maxresults=1000&include=metadata HTTP/1.1
[s3proxy] D 01-16 08:18:51.091 S3Proxy-Jetty-16 jclouds.signature:56 |::] << x-ms-version: 2017-11-09
[s3proxy] D 01-16 08:18:51.092 S3Proxy-Jetty-16 jclouds.signature:56 |::] << Date: Sun, 16 Jan 2022 08:18:51 GMT
[s3proxy] D 01-16 08:18:51.093 S3Proxy-Jetty-16 jclouds.signature:56 |::] << Authorization: SharedKeyLite cdcevents:RvpzcK1Iz5Ed9dD5smRHaRQAz0pv76zltRbIYNmkFdc=
[s3proxy] D 01-16 08:18:51.093 S3Proxy-Jetty-16 o.j.h.i.JavaUrlHttpCommandExecutorService:56 |::] Sending request -959409936: GET https://cdcevents.blob.core.windows.net/events?restype=container&comp=list&prefix=restore&marker=restore/0/778.avro&maxresults=1000&include=metadata HTTP/1.1
[s3proxy] D 01-16 08:18:51.093 S3Proxy-Jetty-16 jclouds.headers:56 |::] >> GET https://cdcevents.blob.core.windows.net/events?restype=container&comp=list&prefix=restore&marker=restore/0/778.avro&maxresults=1000&include=metadata HTTP/1.1
[s3proxy] D 01-16 08:18:51.093 S3Proxy-Jetty-16 jclouds.headers:56 |::] >> x-ms-version: 2017-11-09
[s3proxy] D 01-16 08:18:51.095 S3Proxy-Jetty-16 jclouds.headers:56 |::] >> Date: Sun, 16 Jan 2022 08:18:51 GMT
[s3proxy] D 01-16 08:18:51.096 S3Proxy-Jetty-16 jclouds.headers:56 |::] >> Authorization: SharedKeyLite cdcevents:RvpzcK1Iz5Ed9dD5smRHaRQAz0pv76zltRbIYNmkFdc=
[s3proxy] D 01-16 08:18:51.174 S3Proxy-Jetty-16 o.j.h.i.JavaUrlHttpCommandExecutorService:56 |::] Receiving response -959409936: HTTP/1.1 400 Value for one of the query parameters specified in the request URI is invalid.
[s3proxy] D 01-16 08:18:51.175 S3Proxy-Jetty-16 jclouds.headers:56 |::] << HTTP/1.1 400 Value for one of the query parameters specified in the request URI is invalid.
[s3proxy] D 01-16 08:18:51.175 S3Proxy-Jetty-16 jclouds.headers:56 |::] << x-ms-version: 2017-11-09
[s3proxy] D 01-16 08:18:51.176 S3Proxy-Jetty-16 jclouds.headers:56 |::] << Server: Windows-Azure-Blob/1.0 Microsoft-HTTPAPI/2.0
[s3proxy] D 01-16 08:18:51.176 S3Proxy-Jetty-16 jclouds.headers:56 |::] << x-ms-error-code: InvalidQueryParameterValue
[s3proxy] D 01-16 08:18:51.177 S3Proxy-Jetty-16 jclouds.headers:56 |::] << x-ms-request-id: 95be86fa-e01e-0061-6db1-0af8e6000000
[s3proxy] D 01-16 08:18:51.177 S3Proxy-Jetty-16 jclouds.headers:56 |::] << Date: Sun, 16 Jan 2022 08:18:50 GMT
[s3proxy] D 01-16 08:18:51.177 S3Proxy-Jetty-16 jclouds.headers:56 |::] << Content-Type: application/xml
[s3proxy] D 01-16 08:18:51.178 S3Proxy-Jetty-16 jclouds.headers:56 |::] << Content-Length: 423
[s3proxy] D 01-16 08:18:51.178 S3Proxy-Jetty-16 jclouds.wire:56 |::] << "[0xef][0xbb][0xbf]<?xml version="1.0" encoding="utf-8"?><Error><Code>InvalidQueryParameterValue</Code><Message>Value for one of the query parameters specified in the request URI is invalid.[\n]"
[s3proxy] D 01-16 08:18:51.179 S3Proxy-Jetty-16 jclouds.wire:56 |::] << "RequestId:95be86fa-e01e-0061-6db1-0af8e6000000[\n]"
[s3proxy] D 01-16 08:18:51.180 S3Proxy-Jetty-16 jclouds.wire:56 |::] << "Time:2022-01-16T08:18:51.0345598Z</Message><QueryParameterName>marker</QueryParameterName><QueryParameterValue>restore/0/778.avro</QueryParameterValue><Reason>Invalid ListBlobs marker.</Reason></Error>"
[s3proxy] D 01-16 08:18:51.183 S3Proxy-Jetty-16 o.gaul.s3proxy.S3ProxyHandler:2923 |::] sendSimpleErrorResponse: 400 BadDigest Bad Request {}
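
For readers skimming the wire log, the Azure error body can be pulled apart to see which query parameter was rejected. The following is a minimal Python sketch, assuming the three jclouds.wire lines above are joined into one XML document (the leading byte-order mark is dropped here); the variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

# Error body reassembled from the jclouds.wire lines in the log
# (assumption: the three lines form one XML document; BOM omitted).
body = (
    '<?xml version="1.0" encoding="utf-8"?><Error>'
    '<Code>InvalidQueryParameterValue</Code>'
    '<Message>Value for one of the query parameters specified in the '
    'request URI is invalid.\n'
    'RequestId:95be86fa-e01e-0061-6db1-0af8e6000000\n'
    'Time:2022-01-16T08:18:51.0345598Z</Message>'
    '<QueryParameterName>marker</QueryParameterName>'
    '<QueryParameterValue>restore/0/778.avro</QueryParameterValue>'
    '<Reason>Invalid ListBlobs marker.</Reason></Error>'
).encode("utf-8")

error = ET.fromstring(body)

# Pull out the fields that identify the rejected parameter.
code = error.findtext("Code")
param = error.findtext("QueryParameterName")
value = error.findtext("QueryParameterValue")
reason = error.findtext("Reason")

print(f"{code}: {param}={value} ({reason})")
```

The rejected parameter is `marker` with the value `restore/0/778.avro`, which is the key the connector asked to list after.
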
gaul commented 2 years ago

I wonder if the &marker=restore/0/778.avro query parameter should be URL-encoded? This would change / to %2F. Can you check the Azure docs? The fix for this would be changing jclouds AzureBlobClient.listBlobs and adding urlEncode = true, similar to apache/jclouds@7ebf12bf3867c7cdead772b8c80675c8a41bf7fa.
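
To illustrate what URL-encoding the marker would change, here is a minimal Python sketch that rebuilds the query string from the logged request with `/` percent-encoded. It only shows the encoding itself; whether Azure then accepts the value as a ListBlobs marker is not confirmed in this thread, and the actual change would live in jclouds (Java), not in this snippet.

```python
from urllib.parse import quote, urlencode

# Marker value copied from the failing request in the log.
marker = "restore/0/778.avro"

# safe="" makes quote() escape "/" too; by default "/" is left as-is.
encoded_marker = quote(marker, safe="")

# Query parameters as they appear in the logged ListBlobs request.
params = {
    "restype": "container",
    "comp": "list",
    "prefix": "restore",
    "marker": marker,
    "maxresults": "1000",
    "include": "metadata",
}

# quote_via=quote with safe="" percent-encodes "/" in every value.
query = urlencode(params, safe="", quote_via=quote)
url = "https://cdcevents.blob.core.windows.net/events?" + query

print(encoded_marker)
print(url)
```
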

hcheng2002cn commented 2 years ago

@gaul Thanks, that makes sense. I will test it once I have time.

gianklug commented 1 year ago

Hey there, what's the status here?

I'm currently running into the same issue when running s3 sync against an s3proxy instance on k8s.

gaul commented 1 month ago

Could you try testing with the new azureblob-sdk provider from #606?