Azure / Azurite

A lightweight server clone of Azure Storage that simulates most of the commands supported by it with minimal dependencies

Frequent "internal server error" in CloudQueue.GetMessage() / GetMessageAsync() #347

Closed ManfredLange closed 4 years ago

ManfredLange commented 4 years ago

Which service (blob, file, queue, table) does this issue concern?

queue

Which version of Azurite was used?

V3 via latest docker container image, sha256:10f72c1ff978851171b3130371a6885beb888c4ce5691b49aa305968fa1404e9

Where do you get Azurite? (npm, DockerHub, NuGet, Visual Studio Code Extension)

DockerHub using docker pull mcr.microsoft.com/azure-storage/azurite

What's the Node.js version?

Whatever is in that image

What problem was encountered?

Calling CloudQueue.GetMessage() or CloudQueue.GetMessageAsync() fails with the exception "Internal Server Error" and no further details.

Steps to reproduce the issue?

Just put enough load on it, e.g. using multiple threads and/or multiple processes.
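To make the load pattern concrete, here is a minimal repro sketch, not the actual test suite from this report. It assumes the classic WindowsAzure.Storage client library and the development-storage connection string; the queue name and the worker/iteration counts are made up for illustration.

```csharp
using System.Linq;
using System.Threading.Tasks;
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Queue;

class QueueLoadRepro
{
    static async Task Main()
    {
        // "UseDevelopmentStorage=true" resolves to the local emulator endpoints
        // (127.0.0.1:10000-10002), which is where Azurite listens by default.
        var account = CloudStorageAccount.Parse("UseDevelopmentStorage=true");
        var queue = account.CreateCloudQueueClient().GetQueueReference("load-test");
        await queue.CreateIfNotExistsAsync();

        // Hammer the queue endpoint from several concurrent workers. Under
        // sustained load, GetMessageAsync eventually starts throwing a
        // StorageException whose message is just "Internal Server Error".
        var workers = Enumerable.Range(0, 16).Select(async worker =>
        {
            for (var i = 0; i < 1000; i++)
            {
                await queue.AddMessageAsync(new CloudQueueMessage($"worker {worker}, message {i}"));
                var message = await queue.GetMessageAsync();
                if (message != null)
                    await queue.DeleteMessageAsync(message);
            }
        });
        await Task.WhenAll(workers);
    }
}
```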

Have you found a mitigation/solution?

Restarting the emulator seems to be the only thing that gets the emulator unstuck.

It works fine for a while. After running a few thousand tests that hammer the emulator from multiple processes and threads, once the first internal server error occurs, restarting is the only option to get it working again.

It might also help to share this observation: once the queue API is stuck, the blob API still appears to be operational.

Update: It appears this is only an issue when running Azurite from the container. Running the most recent commit (df5bf1d) locally, or even under a debugger, doesn't show the same problem.

Could this be a case of the Docker image not having been updated with the most recent commit? I see that the most recent image was created on 11 Nov 2019, 1440 Z. The most recent commits are dated 12 Nov 2019, 0332 GMT (+13).
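For reference, a quick way to check the build date of the locally pulled image (assuming the Docker CLI is available on the host):

```
docker image inspect --format '{{ .Created }}' mcr.microsoft.com/azure-storage/azurite
```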

Alternatively this could also be a case of an alpine-based image misbehaving on a Windows host. I've seen this in the past with other containers but have no evidence in this case at this time.

I intend to try setting up the debugger in a container to see if I can reproduce the observed problem.

XiaoningLiu commented 4 years ago

@ManfredLange I recently fixed a race condition reported in issue https://github.com/Azure/Azurite/issues/288. Please try with the latest dev branch commit to see if it resolves your issue.
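In case it helps, here is a minimal sketch of one way to do that with Docker, assuming the image is built from the repository's own Dockerfile; the image tag azurite-dev is just an example.

```
git clone https://github.com/Azure/Azurite.git
cd Azurite
git checkout dev
docker build -t azurite-dev .
# Map the default blob and queue ports
docker run -p 10000:10000 -p 10001:10001 azurite-dev
```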

ManfredLange commented 4 years ago

Thank you, @XiaoningLiu!

What I tried is this: I set up a dev container, then took the latest commit from master. I added a .gitattributes file to ensure line endings are chosen correctly (Windows as host, Linux in Docker), and I switched to a non-alpine base image. Locally I also added a .dockerignore file to ensure only the required folders/files are sent to the Docker engine when building the image.

What I observe now is this:

Next I switched back to the alpine-based image and rebuilt the Azurite container. The resulting image was about 3 times the size of mcr.microsoft.com/azure-storage/azurite (750 MB vs 250 MB). I didn't investigate what causes the size difference; using the same npm packages and the same Dockerfile, I would expect the sizes to be fairly similar. Interestingly, running this container didn't show any of the problems.

Consequently, I then switched back to the image mcr.microsoft.com/azure-storage/azurite. I've now run this multiple times. For some reason I haven't seen the original problem I reported again.

Bottom line, I think we can close this issue. However, I'd suggest the following improvements for the code base:

  1. Add .gitattributes file to ensure correct line endings regardless of the host operating system. This change makes it easier for the community to contribute.
  2. Add .dockerignore file to ensure only the minimal set of files are sent to the docker engine to create the docker image.
  3. Replace npm install with npm ci in the Dockerfile. npm install is not deterministic: depending on the point in time, the versions of indirect dependencies may be updated, resulting in a different Docker image even for the same commit. Generally, you want the exact same versions of all npm packages and you want to lock that down, which you already do by using and versioning package-lock.json. npm ci will install exactly the versions recorded in package-lock.json and will not change that file; npm install, on the other hand, may update package-lock.json, which you don't want if you want predictable builds. (A sketch of all three changes follows this list.)
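For illustration, here is a rough sketch of what those three changes could look like. The exact file contents are assumptions and would need to be adapted to the repository's actual layout and build steps.

```
# .gitattributes (sketch): normalize line endings regardless of the host OS
* text=auto
*.sh text eol=lf
```

```
# .dockerignore (sketch): keep the build context sent to the Docker engine minimal
node_modules
.git
```

```
# Dockerfile (sketch): install exactly what package-lock.json records
# RUN npm install   <- may re-resolve and update package-lock.json
RUN npm ci
```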

These recommendations for improvement are derived from my experience working with many clients. I intend to create a pull request (PR) for just those changes. The decision is then with you.