Azure / Azurite

A lightweight server clone of Azure Storage that simulates most of the commands supported by it with minimal dependencies
MIT License
1.83k stars 324 forks source link

Error Message 500. Request failed to complete in 15000ms #1154

Open omerkMM opened 3 years ago

omerkMM commented 3 years ago

Which service(blob, file, queue, table) does this issue concern?

Blob

Which version of the Azurite was used?

3.14.0

Where do you get Azurite? (npm, DockerHub, NuGet, Visual Studio Code Extension)

Github

What problem was encountered?

As part of our automation tests. The Azurite is getting in high usage (Around 10-15 requests per second). some of the requests return with HTTP 500 errors.

Steps to reproduce the issue?

Send load of requests in a short time.

Log example from debug.log:

2021-11-02T15:43:21.361Z e51892db-d397-4e58-8978-74c4c86c2838 info: BlobStorageContextMiddleware: RequestMethod=PUT RequestURL=http://azurite-service.qa.svc.cluster.local/devstoreaccount1/qac/propagation_assortments%2F18208_d2839409-a6f1-44a3-992f-fa9102819b70?comp=properties RequestHeaders:{"host":"azurite-service.qa.svc.cluster.local:10000","x-ms-version":"2020-02-10","x-ms-blob-content-type":"application/json","x-ms-blob-content-encoding":"gzip","x-ms-client-request-id":"42a3d8dd-da27-4b56-b6d5-581d5c709abc","x-ms-return-client-request-id":"true","user-agent":"azsdk-net-Storage.Blobs/12.7.0 (.NET Core 3.1.20; Linux 5.4.0-88-generic #99-Ubuntu SMP Thu Sep 23 17:29:00 UTC 2021)","x-ms-date":"Tue, 02 Nov 2021 15:43:21 GMT","authorization":"SharedKey devstoreaccount1:KE9nAPOalOZrzD480tvVfBKUYfLy4MCbvad/Jkgv4Gw=","traceparent":"00-7cd69414f413694997525ee7d74109a6-49bb3c623983634b-00","uber-trace-id":"4819fc6a0c629a34%3A860fb0e737460e0d%3Aa9258b41d44499f7%3A0","content-length":"0"} ClientIP=10.1.142.7 Protocol=http HTTPVersion=1.1
2021-11-02T15:43:21.362Z e51892db-d397-4e58-8978-74c4c86c2838 info: BlobStorageContextMiddleware: Account=devstoreaccount1 Container=qac Blob=propagation_assortments/18208_d2839409-a6f1-44a3-992f-fa9102819b70
2021-11-02T15:43:21.362Z e51892db-d397-4e58-8978-74c4c86c2838 verbose: DispatchMiddleware: Dispatching request...
2021-11-02T15:43:21.362Z e51892db-d397-4e58-8978-74c4c86c2838 info: DispatchMiddleware: Operation=Blob_SetHTTPHeaders
2021-11-02T15:43:21.362Z e51892db-d397-4e58-8978-74c4c86c2838 verbose: AuthenticationMiddlewareFactory:createAuthenticationMiddleware() Validating authentications.
2021-11-02T15:43:21.362Z e51892db-d397-4e58-8978-74c4c86c2838 info: PublicAccessAuthenticator:validate() Start validation against public access.
2021-11-02T15:43:21.362Z e51892db-d397-4e58-8978-74c4c86c2838 debug: PublicAccessAuthenticator:validate() Getting account properties...
2021-11-02T15:43:21.362Z e51892db-d397-4e58-8978-74c4c86c2838 debug: PublicAccessAuthenticator:validate() Retrieved account name from context: devstoreaccount1, container: qac, blob: propagation_assortments/18208_d2839409-a6f1-44a3-992f-fa9102819b70
2021-11-02T15:43:21.399Z e51892db-d397-4e58-8978-74c4c86c2838 debug: PublicAccessAuthenticator:validate() Skip public access authentication. Cannot get public access type for container qac
2021-11-02T15:43:21.399Z e51892db-d397-4e58-8978-74c4c86c2838 info: BlobSharedKeyAuthenticator:validate() Start validation against account shared key authentication.
2021-11-02T15:43:21.400Z e51892db-d397-4e58-8978-74c4c86c2838 info: BlobSharedKeyAuthenticator:validate() [STRING TO SIGN]:"PUT\n\n\n\n\n\n\n\n\n\n\n\nx-ms-blob-content-encoding:gzip\nx-ms-blob-content-type:application/json\nx-ms-client-request-id:42a3d8dd-da27-4b56-b6d5-581d5c709abc\nx-ms-date:Tue, 02 Nov 2021 15:43:21 GMT\nx-ms-return-client-request-id:true\nx-ms-version:2020-02-10\n/devstoreaccount1/devstoreaccount1/qac/propagation_assortments%2F18208_d2839409-a6f1-44a3-992f-fa9102819b70\ncomp:properties"
2021-11-02T15:43:21.400Z e51892db-d397-4e58-8978-74c4c86c2838 info: BlobSharedKeyAuthenticator:validate() Calculated authentication header based on key1: SharedKey devstoreaccount1:KEY
2021-11-02T15:43:21.400Z e51892db-d397-4e58-8978-74c4c86c2838 info: BlobSharedKeyAuthenticator:validate() Signature 1 matched.
2021-11-02T15:43:21.400Z e51892db-d397-4e58-8978-74c4c86c2838 verbose: DeserializerMiddleware: Start deserializing...
2021-11-02T15:43:21.400Z e51892db-d397-4e58-8978-74c4c86c2838 info: HandlerMiddleware: DeserializedParameters={"options":{"requestId":"42a3d8dd-da27-4b56-b6d5-581d5c709abc","blobHTTPHeaders":{"blobContentType":"application/json","blobContentEncoding":"gzip"},"leaseAccessConditions":{},"modifiedAccessConditions":{}},"comp":"properties","version":"2020-02-10"}
2021-11-02T15:43:36.478Z e51892db-d397-4e58-8978-74c4c86c2838 error: ErrorMiddleware: Received an error, fill error information to HTTP response
2021-11-02T15:43:36.478Z e51892db-d397-4e58-8978-74c4c86c2838 error: ErrorMiddleware: ErrorName=SequelizeDatabaseError ErrorMessage=Timeout: Request failed to complete in 15000ms ErrorStack="SequelizeDatabaseError: Timeout: Request failed to complete in 15000ms\n    at Query.formatError (/opt/azurite/node_modules/sequelize/lib/dialects/mssql/query.js:314:12)\n    at Query._run (/opt/azurite/node_modules/sequelize/lib/dialects/mssql/query.js:93:18)\n    at runMicrotasks (<anonymous>)\n    at processTicksAndRejections (internal/process/task_queues.js:95:5)\n    at async /opt/azurite/node_modules/sequelize/lib/sequelize.js:619:16\n    at async MSSqlQueryInterface.bulkUpdate (/opt/azurite/node_modules/sequelize/lib/dialects/abstract/query-interface.js:882:12)\n    at async Function.update (/opt/azurite/node_modules/sequelize/lib/model.js:3188:28)\n    at async /opt/azurite/dist/src/blob/persistence/SqlBlobMetadataStore.js:1360:13\n    at async /opt/azurite/node_modules/sequelize/lib/sequelize.js:1090:24\n    at async BlobHandler.setHTTPHeaders (/opt/azurite/dist/src/blob/handlers/BlobHandler.js:156:19)"
2021-11-02T15:43:36.478Z e51892db-d397-4e58-8978-74c4c86c2838 error: ErrorMiddleware: Set HTTP code: 500
2021-11-02T15:43:36.478Z e51892db-d397-4e58-8978-74c4c86c2838 info: EndMiddleware: End response. TotalTimeInMS=15117 StatusCode=500 StatusMessage=undefined Headers={"server":"Azurite-Blob/3.14.0"}

Notes:

Azurite is running on a single pod. We are using an external DB located in the same network as the Azurite. The issue doesn't happen when running the requests separately. There is no resource usage issue with withe pod/cluster.

blueww commented 3 years ago

@omerkMM

As we see, the error is "ErrorName=SequelizeDatabaseError ErrorMessage=Timeout: Request failed to complete in 15000ms". This looks like the error is caused by slow network or responds from sql. If you looks into the sql server log, you might can find more hint, by see if the request from azurite is received and returned by sql server in time.

Besides that, "10-15 requests per second" looks is not too heavy. Can you still repro the issue with azurite run with loki, instead of sql?

omerkMM commented 3 years ago

Thanks for the reply, it helped We removed the DB connection (which defaults to loki) and things look stable

After further investigation on our end for the root cause of the issue when using a persistent DB connection, we found out the following

Azurite uses sequelize upsert commands Sequelize is translating this command a MERGE INTO statement for the SQL Server MERGE INTO statements is blocking records on the page level so it’s blocking a lot of update statements (that appear on the same page in the DB) This is what’s causing records to be locked for a long time If there are a lot of MERGE INTO statements, they are waiting on each other (each is blocking a whole page and waits for the previous one to finish) A more efficient approach would be to stop using upset and instead to use get and then insert/update

blueww commented 2 years ago

@omerkMM

Good to know the issue resolved on your side with Loki! We will look into the SQL issue you mentioned.

devax commented 2 years ago

I have also tested to use Azurite with an external DB (MSSQL). Idea was/is, to use multiple Azurite instances behind a load balancer to achieve better performance. My finding is, that even when using only one Azurite instance with an external MSSQL DB, I see multiple locks on the DB when Azurite is accessed in parallel (multiple threads reading/writing to Azurite). The locks finally lead to timeouts and in consequence to exceptions being thrown. This renders the external DB feature pretty useless in current state (Azurite 3.15.0) in my opinion. At least with MSSQL.

blueww commented 2 years ago

@devax Thanks for the information! Azurite SQL support is still in preview, we will investigate it when bandwidth available.

However, Azurite welcome contribution! All the SQL related API are called in same file in Azurite https://github.com/Azure/Azurite/blob/main/src/blob/persistence/SqlBlobMetadataStore.ts, like this line. It would be great if you could help to change the SQL call (per @omerkMM 's comment, change the SQL call might can fix it ), and test if it works for your scenario. And if it works for you scenarios, you can raise a PR to fix this.

stale[bot] commented 2 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

lixaotec commented 1 year ago

+1 im also having this issue