Azure / Azurite

A lightweight server clone of Azure Storage that simulates most of the commands supported by it with minimal dependencies
MIT License
1.74k stars 309 forks source link

Batch blob requests fail when multipart boundary contains '=' #2413

Open teamnimbus opened 1 week ago

teamnimbus commented 1 week ago

Which service(blob, file, queue, table) does this issue concern?

Blob

Which version of the Azurite was used?

3.30.0

Where do you get Azurite? (npm, DockerHub, NuGet, Visual Studio Code Extension)

DockerHub

What's the Node.js version?

v20.10.0

What problem was encountered?

When parsing the Content-Type header of a multipart request, the boundary parameter is not correctly identified when it includes the = character. Since the Python SDK uses the = character in boundaries, this breaks all operations for which the SDK generates a batch request -- for example, blob deletion. Note that using the = character as part of the boundary is allowed by RFC-1341.

Steps to reproduce the issue?

Reproducing the issue is a little convoluted since we need to use production-style urls to avoid hitting issue #1809.

Start an Azurite container:

docker run -d --rm -p 10000:10000 --name azurite -e AZURITE_ACCOUNTS=account1:AAAA mcr.microsoft.com/azure-storage/azurite:3.30.0 azurite --blobHost 0.0.0.0 --debug /tmp/azurite-debug.log

Reproducer:

docker run -it --rm --add-host=account1.blob.localhost=127.0.0.1 --net=host python:3.9-bullseye python3 -m asyncio

# The following happens in the Python shell of the container
import sys, subprocess
subprocess.run([sys.executable, '-m', 'pip', 'install', 'azure-storage-blob==12.12.0', 'aiohttp'])

from azure.storage.blob.aio import BlobServiceClient
connection_string = "DefaultEndpointsProtocol=http;AccountName=account1;AccountKey=AAAA;BlobEndpoint=http://account1.blob.localhost:10000;"
blob_service_client = BlobServiceClient.from_connection_string(connection_string)
container_client = await blob_service_client.create_container('testcontainer')
await container_client.upload_blob(name='foo', data='')
await container_client.delete_blobs('foo')

The delete_blobs call fails with:

Traceback (most recent call last):
  File "/usr/local/lib/python3.9/concurrent/futures/_base.py", line 446, in result
    return self.__get_result()
  File "/usr/local/lib/python3.9/concurrent/futures/_base.py", line 391, in __get_result
    raise self._exception
  File "<console>", line 1, in <module>
  File "/usr/local/lib/python3.9/site-packages/azure/core/tracing/decorator_async.py", line 94, in wrapper_use_tracer
    return await func(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/azure/storage/blob/aio/_container_client_async.py", line 1107, in delete_blobs
    return await self._batch_send(*reqs, **options)
  File "/usr/local/lib/python3.9/site-packages/azure/storage/blob/_shared/base_client_async.py", line 167, in _batch_send
    process_storage_error(error)
  File "/usr/local/lib/python3.9/site-packages/azure/storage/blob/_shared/response_handlers.py", line 181, in process_storage_error
    exec("raise error from None")   # pylint: disable=exec-used # nosec
  File "<string>", line 1, in <module>
  File "/usr/local/lib/python3.9/site-packages/azure/storage/blob/_shared/base_client_async.py", line 163, in _batch_send
    raise error
azure.storage.blob._shared.response_handlers.PartialBatchErrorException: There is a partial failure in the batch operation.
ErrorCode:None
Content: --
Content-Type: application/http

HTTP/1.1 400 One of the request inputs is not valid.
x-ms-error-code: InvalidInput
x-ms-request-id: 215d36a6-96e0-47f1-a57e-9af371a0d7b8
content-type: application/xml

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Error>
  <Code>InvalidInput</Code>
  <Message>One of the request inputs is not valid.
RequestId:215d36a6-96e0-47f1-a57e-9af371a0d7b8
Time:2024-06-18T08:53:06.200Z</Message>
</Error>

while the debug log contains:

...
2024-06-18T08:53:06.196Z 215d36a6-96e0-47f1-a57e-9af371a0d7b8 info: BlobStorageContextMiddleware: RequestMethod=POST RequestURL=http://account1.blob.localhost/testcontainer?restype=container&comp=batch RequestHeaders:{"host":"account1.blob.localhost:10000","x-ms-version":"2021-06-08","content-length":"391","content-type":"multipart/mixed; boundary================5306085128869334238==","x-ms-date":"Tue, 18 Jun 2024 08:53:06 GMT","x-ms-client-request-id":"2dde134a-2d50-11ef-9f0f-00155dd86730","user-agent":"azsdk-python-storage-blob/12.12.0 Python/3.9.19 (Linux-5.15.153.1-microsoft-standard-WSL2-x86_64-with-glibc2.31)","authorization":"SharedKey account1:4mORpeBuY9gNV0YyuAzmmK+xK/2Ryo6NkdYX3BXF7dg=","accept":"*/*","accept-encoding":"gzip, deflate"} ClientIP=10.10.0.1 Protocol=http HTTPVersion=1.1
...
2024-06-18T08:53:06.197Z 215d36a6-96e0-47f1-a57e-9af371a0d7b8 verbose: DeserializerMiddleware: Start deserializing...
2024-06-18T08:53:06.198Z 215d36a6-96e0-47f1-a57e-9af371a0d7b8 info: HandlerMiddleware: DeserializedParameters={"options":{"requestId":"2dde134a-2d50-11ef-9f0f-00155dd86730"},"restype":"container","comp":"batch","contentLength":391,"multipartContentType":"multipart/mixed; boundary================5306085128869334238==","version":"2021-06-08","body":"ReadableStream"}
2024-06-18T08:53:06.200Z 215d36a6-96e0-47f1-a57e-9af371a0d7b8 error: BlobBatchHandler: One of the request inputs is not valid.
2024-06-18T08:53:06.201Z         error: ErrorMiddleware: Received a MiddlewareError, fill error information to HTTP response
2024-06-18T08:53:06.201Z         error: ErrorMiddleware: ErrorName=StorageError ErrorMessage=One of the request inputs is not valid.  ErrorHTTPStatusCode=400 ErrorHTTPStatusMessage=One of the request inputs is not valid. ErrorHTTPHeaders={"x-ms-error-code":"InvalidInput","x-ms-request-id":"215d36a6-96e0-47f1-a57e-9af371a0d7b8"} ErrorHTTPBody="<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>\n<Error>\n  <Code>InvalidInput</Code>\n  <Message>One of the request inputs is not valid.\nRequestId:215d36a6-96e0-47f1-a57e-9af371a0d7b8\nTime:2024-06-18T08:53:06.200Z</Message>\n</Error>" ErrorStack="StorageError: One of the request inputs is not valid.\n    at BlobBatchHandler.submitBatch (/opt/azurite/dist/src/blob/handlers/BlobBatchHandler.js:282:25)\n    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)\n    at async ContainerHandler.submitBatch (/opt/azurite/dist/src/blob/handlers/ContainerHandler.js:232:36)"
2024-06-18T08:53:06.201Z         error: ErrorMiddleware: Set HTTP code: 400
2024-06-18T08:53:06.201Z         error: ErrorMiddleware: Set HTTP status message: One of the request inputs is not valid.
2024-06-18T08:53:06.201Z         error: ErrorMiddleware: Set HTTP Header: x-ms-error-code=InvalidInput
2024-06-18T08:53:06.201Z         error: ErrorMiddleware: Set HTTP Header: x-ms-request-id=215d36a6-96e0-47f1-a57e-9af371a0d7b8
2024-06-18T08:53:06.201Z         error: ErrorMiddleware: Set content type: application/xml
2024-06-18T08:53:06.201Z         error: ErrorMiddleware: Set HTTP body: "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>\n<Error>\n  <Code>InvalidInput</Code>\n  <Message>One of the request inputs is not valid.\nRequestId:215d36a6-96e0-47f1-a57e-9af371a0d7b8\nTime:2024-06-18T08:53:06.200Z</Message>\n</Error>"
2024-06-18T08:53:06.202Z 215d36a6-96e0-47f1-a57e-9af371a0d7b8 verbose: SerializerMiddleware: Start serializing...
2024-06-18T08:53:06.202Z 215d36a6-96e0-47f1-a57e-9af371a0d7b8 info: Serializer: Start returning stream body.
2024-06-18T08:53:06.204Z 215d36a6-96e0-47f1-a57e-9af371a0d7b8 info: EndMiddleware: End response. TotalTimeInMS=8 StatusCode=202 StatusMessage=Accepted Headers={"server":"Azurite-Blob/3.30.0","content-type":"multipart/mixed; boundary=","x-ms-request-id":"215d36a6-96e0-47f1-a57e-9af371a0d7b8","x-ms-version":"2024-05-04"}
...
Full debug log ``` 2024-06-18T08:51:59.691Z info: Azurite Blob service is starting on 0.0.0.0:10000 2024-06-18T08:51:59.693Z info: AccountDataStore:init() Refresh accounts from environment variable AZURITE_ACCOUNTS with value ***** 2024-06-18T08:51:59.705Z info: BlobGCManager:start() Starting BlobGCManager. Set status to Initializing. 2024-06-18T08:51:59.706Z info: BlobGCManager:start() Trigger mark and sweep loop. Set status to Running. 2024-06-18T08:51:59.706Z info: BlobGCManager:markSweepLoop() Start next mark and sweep. 2024-06-18T08:51:59.706Z info: BlobGCManager:markSweep() Get all extents. 2024-06-18T08:51:59.707Z info: BlobGCManager:start() BlobGCManager successfully started. 2024-06-18T08:51:59.709Z info: BlobGCManager:markSweep() Got 0 extents. 2024-06-18T08:51:59.710Z info: BlobGCManager:markSweep() Get referred extents. 2024-06-18T08:51:59.710Z info: BlobGCManager:markSweep() Got referred extents, unreferenced extents count is 0. 2024-06-18T08:51:59.710Z info: BlobGCManager:markSweepLoop() Mark and sweep finished, taken 4ms. 2024-06-18T08:51:59.711Z info: BlobGCManager:markSweepLoop() Sleep for 600000ms. 2024-06-18T08:51:59.712Z info: Azurite Blob service successfully listens on http://0.0.0.0:10000 2024-06-18T08:51:59.713Z info: Azurite Queue service is starting on 127.0.0.1:10001 2024-06-18T08:51:59.713Z info: AccountDataStore:init() Refresh accounts from environment variable AZURITE_ACCOUNTS with value ***** 2024-06-18T08:51:59.718Z info: QueueGCManager:start() Starting QueueGCManager, set status to Initializing 2024-06-18T08:51:59.719Z info: QueueGCManager:start() Trigger mark and sweep loop, set status to Running. 2024-06-18T08:51:59.719Z info: QueueGCManager:markSweepLoop() Start new mark and sweep. 2024-06-18T08:51:59.719Z info: QueueGCManger:markSweep() Get all extents. 2024-06-18T08:51:59.719Z info: QueueGCManager:start() QueueGCManager successfully started. 2024-06-18T08:51:59.719Z info: QueueGCManager:marksweep() Get 0 extents. 2024-06-18T08:51:59.720Z info: QueueGCManager:markSweep() Get referred extents, then remove from allExtents. 2024-06-18T08:51:59.720Z info: QueueGCManager:markSweep() Got referred extents, unreferenced extents count is 0. 2024-06-18T08:51:59.720Z info: QueueGCManager:markSweepLoop() Mark and sweep finished, take 1ms. 2024-06-18T08:51:59.720Z info: QueueGCManager:markSweepLoop() Sleep for 60000 2024-06-18T08:51:59.720Z info: Azurite Queue service successfully listens on http://127.0.0.1:10001 2024-06-18T08:51:59.720Z info: Azurite Table service is starting on 127.0.0.1:10002 2024-06-18T08:51:59.721Z info: AccountDataStore:init() Refresh accounts from environment variable AZURITE_ACCOUNTS with value ***** 2024-06-18T08:51:59.723Z info: Azurite Table service successfully listens on http://127.0.0.1:10002 2024-06-18T08:52:25.866Z 8abe45a0-5e85-4641-9c40-19576e62007f info: BlobStorageContextMiddleware: RequestMethod=PUT RequestURL=http://account1.blob.localhost/testcontainer?restype=container RequestHeaders:{"host":"account1.blob.localhost:10000","x-ms-version":"2021-06-08","accept":"application/xml","x-ms-date":"Tue, 18 Jun 2024 08:52:25 GMT","x-ms-client-request-id":"15d100a0-2d50-11ef-9f0f-00155dd86730","user-agent":"azsdk-python-storage-blob/12.12.0 Python/3.9.19 (Linux-5.15.153.1-microsoft-standard-WSL2-x86_64-with-glibc2.31)","authorization":"SharedKey account1:pEw9gywuJlaZ4mVNbsk3kwoHNiMMLBdHOIkGSLKuDmM=","accept-encoding":"gzip, deflate","content-length":"0"} ClientIP=10.10.0.1 Protocol=http HTTPVersion=1.1 2024-06-18T08:52:25.866Z 8abe45a0-5e85-4641-9c40-19576e62007f info: BlobStorageContextMiddleware: Account=account1 Container=testcontainer Blob= 2024-06-18T08:52:25.867Z 8abe45a0-5e85-4641-9c40-19576e62007f verbose: DispatchMiddleware: Dispatching request... 2024-06-18T08:52:25.869Z 8abe45a0-5e85-4641-9c40-19576e62007f info: DispatchMiddleware: Operation=Container_Create 2024-06-18T08:52:25.870Z 8abe45a0-5e85-4641-9c40-19576e62007f verbose: AuthenticationMiddlewareFactory:createAuthenticationMiddleware() Validating authentications. 2024-06-18T08:52:25.871Z 8abe45a0-5e85-4641-9c40-19576e62007f info: PublicAccessAuthenticator:validate() Start validation against public access. 2024-06-18T08:52:25.871Z 8abe45a0-5e85-4641-9c40-19576e62007f debug: PublicAccessAuthenticator:validate() Getting account properties... 2024-06-18T08:52:25.871Z 8abe45a0-5e85-4641-9c40-19576e62007f debug: PublicAccessAuthenticator:validate() Retrieved account name from context: account1, container: testcontainer, blob: 2024-06-18T08:52:25.875Z 8abe45a0-5e85-4641-9c40-19576e62007f debug: PublicAccessAuthenticator:validate() Skip public access authentication. Cannot get public access type for container testcontainer 2024-06-18T08:52:25.876Z 8abe45a0-5e85-4641-9c40-19576e62007f info: BlobSharedKeyAuthenticator:validate() Start validation against account shared key authentication. 2024-06-18T08:52:25.889Z 8abe45a0-5e85-4641-9c40-19576e62007f info: BlobSharedKeyAuthenticator:validate() [STRING TO SIGN]:"PUT\n\n\n\n\n\n\n\n\n\n\n\nx-ms-client-request-id:15d100a0-2d50-11ef-9f0f-00155dd86730\nx-ms-date:Tue, 18 Jun 2024 08:52:25 GMT\nx-ms-version:2021-06-08\n/account1/testcontainer\nrestype:container" 2024-06-18T08:52:25.890Z 8abe45a0-5e85-4641-9c40-19576e62007f info: BlobSharedKeyAuthenticator:validate() Calculated authentication header based on key1: SharedKey account1:pEw9gywuJlaZ4mVNbsk3kwoHNiMMLBdHOIkGSLKuDmM= 2024-06-18T08:52:25.890Z 8abe45a0-5e85-4641-9c40-19576e62007f info: BlobSharedKeyAuthenticator:validate() Signature 1 matched. 2024-06-18T08:52:25.891Z 8abe45a0-5e85-4641-9c40-19576e62007f verbose: DeserializerMiddleware: Start deserializing... 2024-06-18T08:52:25.893Z 8abe45a0-5e85-4641-9c40-19576e62007f info: HandlerMiddleware: DeserializedParameters={"options":{"metadata":{},"requestId":"15d100a0-2d50-11ef-9f0f-00155dd86730","containerCpkScopeInfo":{}},"restype":"container","version":"2021-06-08"} 2024-06-18T08:52:25.894Z 8abe45a0-5e85-4641-9c40-19576e62007f verbose: SerializerMiddleware: Start serializing... 2024-06-18T08:52:25.895Z 8abe45a0-5e85-4641-9c40-19576e62007f info: EndMiddleware: End response. TotalTimeInMS=29 StatusCode=201 StatusMessage=undefined Headers={"server":"Azurite-Blob/3.30.0","etag":"\"0x245C74B74FA7E40\"","last-modified":"Tue, 18 Jun 2024 08:52:25 GMT","x-ms-client-request-id":"15d100a0-2d50-11ef-9f0f-00155dd86730","x-ms-request-id":"8abe45a0-5e85-4641-9c40-19576e62007f","x-ms-version":"2024-05-04"} 2024-06-18T08:52:51.047Z 2fa88f93-99ce-422e-92d5-3f61edd76765 info: BlobStorageContextMiddleware: RequestMethod=PUT RequestURL=http://account1.blob.localhost/testcontainer/foo RequestHeaders:{"host":"account1.blob.localhost:10000","x-ms-blob-type":"BlockBlob","content-length":"0","if-none-match":"*","x-ms-version":"2021-06-08","content-type":"application/octet-stream","accept":"application/xml","x-ms-date":"Tue, 18 Jun 2024 08:52:51 GMT","x-ms-client-request-id":"24d6288c-2d50-11ef-9f0f-00155dd86730","user-agent":"azsdk-python-storage-blob/12.12.0 Python/3.9.19 (Linux-5.15.153.1-microsoft-standard-WSL2-x86_64-with-glibc2.31)","authorization":"SharedKey account1:+jYccn4bBMDo4SgeoAbukzSHJ53VvzigJjiPpiEv7O8=","accept-encoding":"gzip, deflate"} ClientIP=10.10.0.1 Protocol=http HTTPVersion=1.1 2024-06-18T08:52:51.047Z 2fa88f93-99ce-422e-92d5-3f61edd76765 info: BlobStorageContextMiddleware: Account=account1 Container=testcontainer Blob=foo 2024-06-18T08:52:51.047Z 2fa88f93-99ce-422e-92d5-3f61edd76765 verbose: DispatchMiddleware: Dispatching request... 2024-06-18T08:52:51.049Z 2fa88f93-99ce-422e-92d5-3f61edd76765 info: DispatchMiddleware: Operation=BlockBlob_Upload 2024-06-18T08:52:51.050Z 2fa88f93-99ce-422e-92d5-3f61edd76765 verbose: AuthenticationMiddlewareFactory:createAuthenticationMiddleware() Validating authentications. 2024-06-18T08:52:51.050Z 2fa88f93-99ce-422e-92d5-3f61edd76765 info: PublicAccessAuthenticator:validate() Start validation against public access. 2024-06-18T08:52:51.050Z 2fa88f93-99ce-422e-92d5-3f61edd76765 debug: PublicAccessAuthenticator:validate() Getting account properties... 2024-06-18T08:52:51.050Z 2fa88f93-99ce-422e-92d5-3f61edd76765 debug: PublicAccessAuthenticator:validate() Retrieved account name from context: account1, container: testcontainer, blob: foo 2024-06-18T08:52:51.053Z 2fa88f93-99ce-422e-92d5-3f61edd76765 debug: PublicAccessAuthenticator:validate() Skip public access authentication. Cannot get public access type for container testcontainer 2024-06-18T08:52:51.053Z 2fa88f93-99ce-422e-92d5-3f61edd76765 info: BlobSharedKeyAuthenticator:validate() Start validation against account shared key authentication. 2024-06-18T08:52:51.054Z 2fa88f93-99ce-422e-92d5-3f61edd76765 info: BlobSharedKeyAuthenticator:validate() [STRING TO SIGN]:"PUT\n\n\n\n\napplication/octet-stream\n\n\n\n*\n\n\nx-ms-blob-type:BlockBlob\nx-ms-client-request-id:24d6288c-2d50-11ef-9f0f-00155dd86730\nx-ms-date:Tue, 18 Jun 2024 08:52:51 GMT\nx-ms-version:2021-06-08\n/account1/testcontainer/foo" 2024-06-18T08:52:51.054Z 2fa88f93-99ce-422e-92d5-3f61edd76765 info: BlobSharedKeyAuthenticator:validate() Calculated authentication header based on key1: SharedKey account1:+jYccn4bBMDo4SgeoAbukzSHJ53VvzigJjiPpiEv7O8= 2024-06-18T08:52:51.054Z 2fa88f93-99ce-422e-92d5-3f61edd76765 info: BlobSharedKeyAuthenticator:validate() Signature 1 matched. 2024-06-18T08:52:51.054Z 2fa88f93-99ce-422e-92d5-3f61edd76765 verbose: DeserializerMiddleware: Start deserializing... 2024-06-18T08:52:51.054Z 2fa88f93-99ce-422e-92d5-3f61edd76765 info: HandlerMiddleware: DeserializedParameters={"options":{"metadata":{},"requestId":"24d6288c-2d50-11ef-9f0f-00155dd86730","blobHTTPHeaders":{},"leaseAccessConditions":{},"cpkInfo":{},"cpkScopeInfo":{},"modifiedAccessConditions":{"ifNoneMatch":"*"}},"contentLength":0,"version":"2021-06-08","blobType":"BlockBlob","body":"ReadableStream"} 2024-06-18T08:52:51.056Z 2fa88f93-99ce-422e-92d5-3f61edd76765 debug: OperationQueue.operate() Schedule incoming job 881910da-c7f7-4835-877f-3b299d8e0ace 2024-06-18T08:52:51.056Z 2fa88f93-99ce-422e-92d5-3f61edd76765 debug: OperationQueue:execute() Current runningConcurrency:0 maxConcurrency:50 operations.length:1 2024-06-18T08:52:51.056Z 2fa88f93-99ce-422e-92d5-3f61edd76765 info: FSExtentStore:appendExtent() Select extent from idle location for extent append operation. LocationId:1 extentId:84267ce0-9fb1-418f-b311-7311dececc33 offset:0 MAX_EXTENT_SIZE:67108864 2024-06-18T08:52:51.057Z 2fa88f93-99ce-422e-92d5-3f61edd76765 debug: FSExtentStore:appendExtent() Get fd:undefined for extent:84267ce0-9fb1-418f-b311-7311dececc33 from cache. 2024-06-18T08:52:51.057Z 2fa88f93-99ce-422e-92d5-3f61edd76765 debug: FSExtentStore:appendExtent() Open file:/opt/azurite/__blobstorage__/84267ce0-9fb1-418f-b311-7311dececc33 for extent:84267ce0-9fb1-418f-b311-7311dececc33, get new fd:24 2024-06-18T08:52:51.059Z 2fa88f93-99ce-422e-92d5-3f61edd76765 debug: FSExtentStore:appendExtent() Created write stream for fd:24 2024-06-18T08:52:51.059Z 2fa88f93-99ce-422e-92d5-3f61edd76765 debug: FSExtentStore:appendExtent() Start writing to extent 84267ce0-9fb1-418f-b311-7311dececc33 2024-06-18T08:52:51.060Z 2fa88f93-99ce-422e-92d5-3f61edd76765 debug: FSExtentStore:streamPipe() Start piping data to write stream 2024-06-18T08:52:51.060Z 2fa88f93-99ce-422e-92d5-3f61edd76765 debug: FSExtentStore:streamPipe() Readable stream triggers close event, 0 bytes piped 2024-06-18T08:52:51.060Z 2fa88f93-99ce-422e-92d5-3f61edd76765 debug: FSExtentStore:streamPipe() Invoke write stream end() 2024-06-18T08:52:51.061Z 2fa88f93-99ce-422e-92d5-3f61edd76765 debug: FSExtentStore:streamPipe() Writable stream triggers finish event, after 0 bytes piped. Flush data to fd:24. 2024-06-18T08:52:51.061Z 2fa88f93-99ce-422e-92d5-3f61edd76765 debug: FSExtentStore:streamPipe() Readable stream triggers close event, 0 bytes piped 2024-06-18T08:52:51.064Z 2fa88f93-99ce-422e-92d5-3f61edd76765 debug: FSExtentStore:streamPipe() Flush data to fd:24 successfully. Resolve streamPipe(). 2024-06-18T08:52:51.064Z 2fa88f93-99ce-422e-92d5-3f61edd76765 debug: FSExtentStore:appendExtent() Write finish, start updating extent metadata. extent:{"id":"84267ce0-9fb1-418f-b311-7311dececc33","locationId":"Default","path":"84267ce0-9fb1-418f-b311-7311dececc33","size":0,"lastModifiedInMS":1718700771064} 2024-06-18T08:52:51.065Z 2fa88f93-99ce-422e-92d5-3f61edd76765 debug: FSExtentStore:appendExtent() Update extent metadata done. Resolve() 2024-06-18T08:52:51.065Z 2fa88f93-99ce-422e-92d5-3f61edd76765 debug: OperationQueue.operate() Job 881910da-c7f7-4835-877f-3b299d8e0ace completes callback, resolve. 2024-06-18T08:52:51.066Z 2fa88f93-99ce-422e-92d5-3f61edd76765 debug: OperationQueue:execute() Current runningConcurrency:0 maxConcurrency:50 operations.length:0 2024-06-18T08:52:51.066Z 2fa88f93-99ce-422e-92d5-3f61edd76765 debug: OperationQueue:execute() return. Operation.length === 0 2024-06-18T08:52:51.068Z 2fa88f93-99ce-422e-92d5-3f61edd76765 verbose: SerializerMiddleware: Start serializing... 2024-06-18T08:52:51.069Z 2fa88f93-99ce-422e-92d5-3f61edd76765 info: EndMiddleware: End response. TotalTimeInMS=22 StatusCode=201 StatusMessage=undefined Headers={"server":"Azurite-Blob/3.30.0","etag":"\"0x1E85E3D6248BBC0\"","last-modified":"Tue, 18 Jun 2024 08:52:51 GMT","content-md5":"1B2M2Y8AsgTpgAmY7PhCfg==","x-ms-client-request-id":"24d6288c-2d50-11ef-9f0f-00155dd86730","x-ms-request-id":"2fa88f93-99ce-422e-92d5-3f61edd76765","x-ms-version":"2024-05-04","date":"Tue, 18 Jun 2024 08:52:51 GMT","x-ms-request-server-encrypted":"true"} 2024-06-18T08:52:59.698Z info: AccountDataStore:init() Refresh accounts from environment variable AZURITE_ACCOUNTS with value ***** 2024-06-18T08:52:59.713Z info: AccountDataStore:init() Refresh accounts from environment variable AZURITE_ACCOUNTS with value ***** 2024-06-18T08:52:59.721Z info: QueueGCManager:markSweepLoop() Start new mark and sweep. 2024-06-18T08:52:59.721Z info: QueueGCManger:markSweep() Get all extents. 2024-06-18T08:52:59.722Z info: QueueGCManager:marksweep() Get 0 extents. 2024-06-18T08:52:59.722Z info: QueueGCManager:markSweep() Get referred extents, then remove from allExtents. 2024-06-18T08:52:59.722Z info: QueueGCManager:markSweep() Got referred extents, unreferenced extents count is 0. 2024-06-18T08:52:59.722Z info: QueueGCManager:markSweepLoop() Mark and sweep finished, take 1ms. 2024-06-18T08:52:59.722Z info: QueueGCManager:markSweepLoop() Sleep for 60000 2024-06-18T08:52:59.723Z info: AccountDataStore:init() Refresh accounts from environment variable AZURITE_ACCOUNTS with value ***** 2024-06-18T08:53:06.196Z 215d36a6-96e0-47f1-a57e-9af371a0d7b8 info: BlobStorageContextMiddleware: RequestMethod=POST RequestURL=http://account1.blob.localhost/testcontainer?restype=container&comp=batch RequestHeaders:{"host":"account1.blob.localhost:10000","x-ms-version":"2021-06-08","content-length":"391","content-type":"multipart/mixed; boundary================5306085128869334238==","x-ms-date":"Tue, 18 Jun 2024 08:53:06 GMT","x-ms-client-request-id":"2dde134a-2d50-11ef-9f0f-00155dd86730","user-agent":"azsdk-python-storage-blob/12.12.0 Python/3.9.19 (Linux-5.15.153.1-microsoft-standard-WSL2-x86_64-with-glibc2.31)","authorization":"SharedKey account1:4mORpeBuY9gNV0YyuAzmmK+xK/2Ryo6NkdYX3BXF7dg=","accept":"*/*","accept-encoding":"gzip, deflate"} ClientIP=10.10.0.1 Protocol=http HTTPVersion=1.1 2024-06-18T08:53:06.196Z 215d36a6-96e0-47f1-a57e-9af371a0d7b8 info: BlobStorageContextMiddleware: Account=account1 Container=testcontainer Blob= 2024-06-18T08:53:06.196Z 215d36a6-96e0-47f1-a57e-9af371a0d7b8 verbose: DispatchMiddleware: Dispatching request... 2024-06-18T08:53:06.197Z 215d36a6-96e0-47f1-a57e-9af371a0d7b8 info: DispatchMiddleware: Operation=Container_SubmitBatch 2024-06-18T08:53:06.197Z 215d36a6-96e0-47f1-a57e-9af371a0d7b8 verbose: AuthenticationMiddlewareFactory:createAuthenticationMiddleware() Validating authentications. 2024-06-18T08:53:06.197Z 215d36a6-96e0-47f1-a57e-9af371a0d7b8 info: PublicAccessAuthenticator:validate() Start validation against public access. 2024-06-18T08:53:06.197Z 215d36a6-96e0-47f1-a57e-9af371a0d7b8 debug: PublicAccessAuthenticator:validate() Getting account properties... 2024-06-18T08:53:06.197Z 215d36a6-96e0-47f1-a57e-9af371a0d7b8 debug: PublicAccessAuthenticator:validate() Retrieved account name from context: account1, container: testcontainer, blob: 2024-06-18T08:53:06.197Z 215d36a6-96e0-47f1-a57e-9af371a0d7b8 debug: PublicAccessAuthenticator:validate() Skip public access authentication. Cannot get public access type for container testcontainer 2024-06-18T08:53:06.197Z 215d36a6-96e0-47f1-a57e-9af371a0d7b8 info: BlobSharedKeyAuthenticator:validate() Start validation against account shared key authentication. 2024-06-18T08:53:06.197Z 215d36a6-96e0-47f1-a57e-9af371a0d7b8 info: BlobSharedKeyAuthenticator:validate() [STRING TO SIGN]:"POST\n\n\n391\n\nmultipart/mixed; boundary================5306085128869334238==\n\n\n\n\n\n\nx-ms-client-request-id:2dde134a-2d50-11ef-9f0f-00155dd86730\nx-ms-date:Tue, 18 Jun 2024 08:53:06 GMT\nx-ms-version:2021-06-08\n/account1/testcontainer\ncomp:batch\nrestype:container" 2024-06-18T08:53:06.197Z 215d36a6-96e0-47f1-a57e-9af371a0d7b8 info: BlobSharedKeyAuthenticator:validate() Calculated authentication header based on key1: SharedKey account1:4mORpeBuY9gNV0YyuAzmmK+xK/2Ryo6NkdYX3BXF7dg= 2024-06-18T08:53:06.197Z 215d36a6-96e0-47f1-a57e-9af371a0d7b8 info: BlobSharedKeyAuthenticator:validate() Signature 1 matched. 2024-06-18T08:53:06.197Z 215d36a6-96e0-47f1-a57e-9af371a0d7b8 verbose: DeserializerMiddleware: Start deserializing... 2024-06-18T08:53:06.198Z 215d36a6-96e0-47f1-a57e-9af371a0d7b8 info: HandlerMiddleware: DeserializedParameters={"options":{"requestId":"2dde134a-2d50-11ef-9f0f-00155dd86730"},"restype":"container","comp":"batch","contentLength":391,"multipartContentType":"multipart/mixed; boundary================5306085128869334238==","version":"2021-06-08","body":"ReadableStream"} 2024-06-18T08:53:06.200Z 215d36a6-96e0-47f1-a57e-9af371a0d7b8 error: BlobBatchHandler: One of the request inputs is not valid. 2024-06-18T08:53:06.201Z error: ErrorMiddleware: Received a MiddlewareError, fill error information to HTTP response 2024-06-18T08:53:06.201Z error: ErrorMiddleware: ErrorName=StorageError ErrorMessage=One of the request inputs is not valid. ErrorHTTPStatusCode=400 ErrorHTTPStatusMessage=One of the request inputs is not valid. ErrorHTTPHeaders={"x-ms-error-code":"InvalidInput","x-ms-request-id":"215d36a6-96e0-47f1-a57e-9af371a0d7b8"} ErrorHTTPBody="\n\n InvalidInput\n One of the request inputs is not valid.\nRequestId:215d36a6-96e0-47f1-a57e-9af371a0d7b8\nTime:2024-06-18T08:53:06.200Z\n" ErrorStack="StorageError: One of the request inputs is not valid.\n at BlobBatchHandler.submitBatch (/opt/azurite/dist/src/blob/handlers/BlobBatchHandler.js:282:25)\n at process.processTicksAndRejections (node:internal/process/task_queues:95:5)\n at async ContainerHandler.submitBatch (/opt/azurite/dist/src/blob/handlers/ContainerHandler.js:232:36)" 2024-06-18T08:53:06.201Z error: ErrorMiddleware: Set HTTP code: 400 2024-06-18T08:53:06.201Z error: ErrorMiddleware: Set HTTP status message: One of the request inputs is not valid. 2024-06-18T08:53:06.201Z error: ErrorMiddleware: Set HTTP Header: x-ms-error-code=InvalidInput 2024-06-18T08:53:06.201Z error: ErrorMiddleware: Set HTTP Header: x-ms-request-id=215d36a6-96e0-47f1-a57e-9af371a0d7b8 2024-06-18T08:53:06.201Z error: ErrorMiddleware: Set content type: application/xml 2024-06-18T08:53:06.201Z error: ErrorMiddleware: Set HTTP body: "\n\n InvalidInput\n One of the request inputs is not valid.\nRequestId:215d36a6-96e0-47f1-a57e-9af371a0d7b8\nTime:2024-06-18T08:53:06.200Z\n" 2024-06-18T08:53:06.202Z 215d36a6-96e0-47f1-a57e-9af371a0d7b8 verbose: SerializerMiddleware: Start serializing... 2024-06-18T08:53:06.202Z 215d36a6-96e0-47f1-a57e-9af371a0d7b8 info: Serializer: Start returning stream body. 2024-06-18T08:53:06.204Z 215d36a6-96e0-47f1-a57e-9af371a0d7b8 info: EndMiddleware: End response. TotalTimeInMS=8 StatusCode=202 StatusMessage=Accepted Headers={"server":"Azurite-Blob/3.30.0","content-type":"multipart/mixed; boundary=","x-ms-request-id":"215d36a6-96e0-47f1-a57e-9af371a0d7b8","x-ms-version":"2024-05-04"} ```

Note: we used azure-storage-blob 12.12.0, which is somewhat old, but the same behavior is observed with the latest version (12.20.0).

Have you found a mitigation/solution?

These commands replace the azurite container with a patched one:

docker stop azurite
cat <<EOF | docker build --tag azurite-patched -
FROM mcr.microsoft.com/azure-storage/azurite:3.30.0
RUN sed -i 's/getHeader("content-type").split("=")\[1\]/getHeader("content-type").match(\/boundary=(.*)\/)[1].trim()/g' /opt/azurite/dist/src/blob/handlers/ContainerHandler.js
RUN sed -i 's/getHeader("content-type").split("=")\[1\]/getHeader("content-type").match(\/boundary=(.*)\/)[1].trim()/g' /opt/azurite/dist/src/blob/handlers/ServiceHandler.js
EOF
docker run -d --rm -p 10000:10000 --name azurite -e AZURITE_ACCOUNTS=account1:AAAA azurite-patched azurite --blobHost 0.0.0.0 --debug /tmp/azurite-debug.log

This container makes the "Reproducer" above work, the blob gets deleted without errors.

The patch replaces getHeader("content-type").split("=")[1] with getHeader("content-type").match(/boundary=(.*)/)[1].trim() in the submitBatch functions of both ContainerHandler.js and ServiceHandler.js.

In the source code the faulty lines are:

blueww commented 1 week ago

@EmmaZhu

Would you please help to look at the blob batch related issue?

teamnimbus commented 3 days ago

@EmmaZhu, @blueww: we see the issue is labeled "question" but we weren't asked anything. If you need further info feel free to ask.

blueww commented 2 days ago

@teamnimbus We will change it to "Bug" label after confirm this is an Azurite bug.