Azure / Azurite

A lightweight server clone of Azure Storage that simulates most of the commands supported by it with minimal dependencies
MIT License

Azurite occasionally gets very slow performance with multiple containers #719

Open GeeWee opened 3 years ago

GeeWee commented 3 years ago

Which service(blob, file, queue, table) does this issue concern?

Blob

Which version of the Azurite was used?

I have pulled down the latest docker image from mcr.microsoft.com/azure-storage/azurite

Where do you get Azurite? (npm, DockerHub, NuGet, Visual Studio Code Extension)

DockerHub via Docker on Fedora 33.

What's the Node.js version?

I would imagine that doesn't matter for the docker version.

What problem was encountered?

I'm having a hard time pinning down the exact reproduction case, but it goes something like this. Azurite is configured through the following docker-compose block:

  azurite:
    image: "mcr.microsoft.com/azure-storage/azurite"
    container_name: aip-azurite
    restart: unless-stopped
    ports:
      - 10000:10000 # Storage
      - 10001:10001 # Queues
    volumes:
    - azurite:/data
    environment:
#      Have azurite pretend to be multiple storage accounts so we can use it for the entire local setup
#      The last storage account is the "default" one for the emulator so anything relying on the default account existing will still work.
#      Zm9v is "foo" base64encoded
      - "AZURITE_ACCOUNTS=stoperational001:Zm9v;stingest001:Zm9v;stcontext001:Zm9v;stcoord001:Zm9v;devstoreaccount1:Eby8vdM02xNOcqFlqUwJPLlmEtlCDXJ1OUzFT50uSRZ6IFsuFq2UVErCz4I6tq/K1SZFPTOtr/KBHBeksoGMGw=="
    command: "azurite --blobHost 0.0.0.0 --queueHost 0.0.0.0 --loose -l /data"

As you can see, there are multiple containers and multiple hosts, and the data path is mounted on a Docker volume.
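For anyone trying to reproduce this setup, here is a minimal sketch of how per-account connection strings could be derived from the AZURITE_ACCOUNTS value above. The helper names (parse_azurite_accounts, connection_string, decode_key) are hypothetical, and the host and ports are assumptions taken from the port mappings in the compose block. The path-style endpoint (account name appended to the URL) is how Azurite addresses accounts by default.

```python
import base64


def parse_azurite_accounts(value: str) -> dict[str, str]:
    """Split 'name1:key1;name2:key2' into {name: first key}.

    Hypothetical helper: an optional second key per account is ignored.
    """
    accounts = {}
    for entry in value.split(";"):
        if not entry:
            continue
        parts = entry.split(":")
        accounts[parts[0]] = parts[1]
    return accounts


def decode_key(key: str) -> bytes:
    """Account keys are base64 encoded; 'Zm9v' decodes to b'foo'."""
    return base64.b64decode(key)


def connection_string(
    name: str,
    key: str,
    host: str = "127.0.0.1",   # assumption: Azurite reachable on localhost
    blob_port: int = 10000,    # matches the 10000:10000 mapping above
    queue_port: int = 10001,   # matches the 10001:10001 mapping above
) -> str:
    """Build a path-style connection string for one emulated account."""
    return (
        "DefaultEndpointsProtocol=http;"
        f"AccountName={name};AccountKey={key};"
        f"BlobEndpoint=http://{host}:{blob_port}/{name};"
        f"QueueEndpoint=http://{host}:{queue_port}/{name};"
    )
```

Calling connection_string for each entry returned by parse_azurite_accounts gives one string per emulated account, which makes it easier to drive load against two or more accounts at once when trying to reproduce the slowdown.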

At some point after use (the Docker volume shows 500 MB+ used), Azurite just starts to.. hang. Response times go from very snappy to multiple seconds, and as a result services start timing out.

I can reproduce this semi-reliably by putting load on two or more containers. You'd think this would just be a normal performance limitation, but even after restarting Azurite, the performance is still so degraded that it's not really usable.

I would love to try to pin down the issue further, but I'm unsure what logging switches I can dial to get some performance characteristics.
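In the absence of a known server-side switch, one option is to time requests on the client side. The following is only a sketch: time_calls and summarize are hypothetical helpers, and the operation passed in stands for whatever storage call is being investigated (for example, listing blobs in one container). The one-second threshold is an arbitrary assumption.

```python
import statistics
import time
from typing import Callable, Dict, List


def time_calls(operation: Callable[[], object], repeats: int = 20) -> List[float]:
    """Run operation() repeatedly and return each call's duration in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        operation()
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations: List[float], threshold: float = 1.0) -> Dict[str, float]:
    """Report the median, the maximum, and how many calls exceeded threshold."""
    return {
        "median": statistics.median(durations),
        "max": max(durations),
        "slow_calls": sum(1 for d in durations if d > threshold),
    }
```

Running this before and after the slowdown appears (or against each storage account in turn) would at least give numbers to compare, even if it does not show which operation inside Azurite is slow.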

Have you found a mitigation/solution?

Clearing docker volume and restarting everything.

blueww commented 3 years ago

@GeeWee

Would you please share the Azurite debug log from when the issue happens? Then we can see which operation is slowing down.

GeeWee commented 3 years ago

I will. Do you simply want me to collect them with the --debug switch?

blueww commented 3 years ago

@GeeWee

You can collect the debug log with the --debug parameter. As you are on Docker, you also need to map the debug log folder to a local folder, like: docker run -p 10000:10000 -p 10001:10001 -v C:\workspace:/workspace mcr.microsoft.com/azure-storage/azurite azurite -l /workspace -d /workspace/debug.log --blobHost 0.0.0.0 --queueHost 0.0.0.0, and then get the debug log from the local folder C:\workspace

GeeWee commented 3 years ago

I no longer seem to be able to reproduce this. Sorry for the trouble. Closing.

GeeWee commented 3 years ago

But my colleague has just encountered the same bug, so I'm reopening, even though I seem unable to pin it down. When debug logging is enabled the problem seems to go away, so unfortunately I can't provide any more info.

GeeWee commented 3 years ago

We managed to get a debug log with this issue:

The last couple of thousand lines are in this gist: https://gist.github.com/GeeWee/3b01e866fb83af9d03ff48b762c15cdd

Let me know if you need the full log, but it is pretty long (0.5 GB).
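For reference, trimming a log of that size down to its last few thousand lines can be done without loading the whole file into memory. This is a minimal sketch; tail_lines and write_tail are hypothetical helper names, and the default of 2000 lines is only an assumption matching "the last couple of thousand lines".

```python
from collections import deque
from pathlib import Path
from typing import List


def tail_lines(path: str, count: int = 2000) -> List[str]:
    """Return the last `count` lines of a text file, reading it as a stream."""
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        # A bounded deque keeps only the most recent `count` lines in memory.
        return list(deque(handle, maxlen=count))


def write_tail(source: str, target: str, count: int = 2000) -> int:
    """Copy the last `count` lines of source into target; return lines written."""
    lines = tail_lines(source, count)
    Path(target).write_text("".join(lines), encoding="utf-8")
    return len(lines)
```

The file is still read from start to end, so this takes a moment on a multi-hundred-megabyte log, but memory use stays proportional to the number of lines kept.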

Some extra information that might be relevant:

XiaoningLiu commented 3 years ago

The attached log only includes traffic to the Queue service. Can you please attach the full log, or at least the ending logs including the requests to the Blob service?

GeeWee commented 3 years ago

Apologies. I have attached more logs to the following gist

I have also uploaded the full log to OneDrive. Note, however, that it's 250 MB, so reasonably large.