Azure / Azurite

A lightweight server clone of Azure Storage that simulates most of the commands supported by it with minimal dependencies
MIT License

Allow "startup commands" #1665

Open dkarlovi opened 2 years ago

dkarlovi commented 2 years ago

Which service(blob, file, queue, table) does this issue concern?

Blob.

Which version of the Azurite was used?

3.19.0

Where do you get Azurite? (npm, DockerHub, NuGet, Visual Studio Code Extension)

DockerHub

What's the Node.js version?

In Docker image.

What problem was encountered?

When trying to set up Azurite for my app's functional tests (app uses Azure blob storage), I need to have a few containers created in the storage. The real containers are typically created with Terraform on deployment, the app itself doesn't create the containers.

It would make sense to be able to do a similar thing with Azurite: being able to have a basic "startup commands" feature where the user can list some pre-requisites and Azurite itself creates them.

Steps to reproduce the issue?

Run the Azurite container and use the container thumbnails. The container doesn't exist and I can't make Azurite create it when starting up.

Have you found a mitigation/solution?

Two ways:

  1. the app can use its SDK and connection string to create the required containers, but since it otherwise expects them to be pre-existing, this feature would be built for tests only, cluttering the application codebase with unrelated infrastructure code
  2. use an additional sideloaded container with a standalone SDK client (or the az CLI tool) to run the commands; this is tricky to coordinate and quite convoluted overall, but it seems like the direction I'll probably take

dkarlovi commented 2 years ago

Full example of how to create storage containers with approach 2:


services:
    app:
        image: my/app:version
        depends_on:
            storage_init:
                condition: service_completed_successfully
    # Azure Blob Storage stuff here
    storage:
        image: mcr.microsoft.com/azure-storage/azurite:3.19.0
        # required, see https://github.com/Azure/Azurite/issues/1666
        healthcheck:
            test: nc 127.0.0.1 10000 -z
            interval: 1s
            retries: 30
    storage_init:
        image: mcr.microsoft.com/azure-cli:latest
        command:
            - /bin/sh
            - -c
            - |
                az storage container create --name version
        depends_on:
            storage:
                condition: service_healthy
        environment:
            # https://github.com/Azure/Azurite/blob/main/README.md#usage-with-azure-storage-sdks-or-tools
            AZURE_STORAGE_CONNECTION_STRING: DefaultEndpointsProtocol=http;AccountName=devstoreaccount1;AccountKey=Eby8vdM02xNOcqFlqUwJPLlmEtlCDXJ1OUzFT50uSRZ6IFsuFq2UVErCz4I6tq/K1SZFPTOtr/KBHBeksoGMGw==;BlobEndpoint=http://storage:10000/devstoreaccount1;
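
The `nc 127.0.0.1 10000 -z` healthcheck above only probes whether the blob port accepts TCP connections. Where `nc` is not available, roughly the same probe can be sketched in Python. The function name `wait_for_port` and its defaults are assumptions chosen to mirror the compose values (30 retries, 1s interval); this is a sketch, not something Azurite ships.

```python
import socket
import time


def wait_for_port(host: str, port: int, retries: int = 30, interval: float = 1.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False if every retry fails."""
    for attempt in range(retries):
        try:
            # Same idea as `nc -z`: open a connection and close it without sending data.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Not listening yet; pause before the next probe, but not after the last one.
            if attempt < retries - 1:
                time.sleep(interval)
    return False
```

An init script could call `wait_for_port("storage", 10000)` before running the container creation commands.
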
blueww commented 2 years ago

@dkarlovi Thanks for raising the issue!

It looks like you have already found a way to resolve this.

Feel free to reach out if you need any further assistance with Azurite.

dkarlovi commented 2 years ago

Hey @blueww, you mistakenly tagged both of the issues I created as questions. These are feature suggestions; there's no question asked here.

blueww commented 2 years ago

@dkarlovi

I have re-tagged the issue as an enhancement.

Azurite currently doesn't support pre-creating data at startup. You need to add your own code/config to create it. Besides that, if the pre-created test data is the same for every test run, you can first make sure the containers are created, then save the data in the workspace (the workspace is specified with -l when starting Azurite), copy the saved workspace to a new location, and start Azurite with the new workspace copy.
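
The workspace-copy approach could be sketched as below. The helper names `fresh_workspace` and `azurite_command` are hypothetical, and the sketch assumes the seed workspace was produced by the same Azurite version that will read the copy; only the `-l` option comes from the comment above.

```python
import shutil
import tempfile
from pathlib import Path


def fresh_workspace(seed_dir: str) -> Path:
    """Copy a pre-seeded Azurite workspace into a new temporary directory."""
    target = Path(tempfile.mkdtemp(prefix="azurite-ws-")) / "workspace"
    # copytree requires that the target does not exist yet, so every run starts clean.
    shutil.copytree(seed_dir, target)
    return target


def azurite_command(workspace: Path) -> list[str]:
    """Build the argv for starting Azurite on the copied workspace (-l sets the location)."""
    return ["azurite", "-l", str(workspace)]
```

A test harness might then run `subprocess.run(azurite_command(fresh_workspace("./seed-workspace")))`, where `./seed-workspace` is a placeholder path.
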

dkarlovi commented 2 years ago

@blueww

Azurite currently doesn't support pre-creating data at startup. You need to add your own code/config to create it.

I understand it doesn't do it now, this is why I've created this feature suggestion.

if the pre-created test data is the same for every test run, you can first make sure the containers are created, then save the data in the workspace (the workspace is specified with -l when starting Azurite), copy the saved workspace to a new location, and start Azurite with the new workspace copy.

This is an interesting proposal which I've also considered, but dismissed then because it relied on the workspace storage format being backward / forward compatible, maybe I was wrong. :thinking:

Do you think Azurite could add a feature to facilitate this in some way? Something like a "dump workspace / restore workspace" command? Maybe the direction could be to allow the user to deserialize into the workspace, so the user's dump doesn't need to be exactly the same as the workspace format (avoiding the compatibility issue). It could be a much simpler format which Azurite converts into the current storage format.

dkarlovi commented 2 years ago

Let's say I have a file like this:

blob:
    containers:
        files:
            access: public
    blobs:
        files/aaa.jpeg: ~

which Azurite takes and applies to its storage on startup. This would mean Azurite's internal storage format can evolve as required and the dump file would still work, as opposed to using the storage file directly.
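
To illustrate how such a file could be decoupled from the storage format, here is a minimal sketch that turns the description into an ordered list of operations. Everything in it is hypothetical: the `SEED` dict mirrors the file above after parsing, and `plan_operations` is an invented helper, not an Azurite feature. A real implementation would execute each step through the REST API or an SDK.

```python
from typing import Any

# Mirrors the hypothetical seed file above; "~" in YAML becomes None once parsed.
SEED: dict[str, Any] = {
    "blob": {
        "containers": {"files": {"access": "public"}},
        "blobs": {"files/aaa.jpeg": None},
    }
}


def plan_operations(seed: dict[str, Any]) -> list[tuple[str, str, Any]]:
    """Translate the seed description into ordered (action, target, detail) steps."""
    blob = seed.get("blob", {})
    steps: list[tuple[str, str, Any]] = []
    # Containers first, since a blob can only be uploaded into an existing container.
    for name, options in blob.get("containers", {}).items():
        steps.append(("create_container", name, options or {}))
    for path in blob.get("blobs", {}):
        # "files/aaa.jpeg" splits into container "files" and blob name "aaa.jpeg".
        container, _, blob_name = path.partition("/")
        steps.append(("upload_blob", container, blob_name))
    return steps
```

Because the plan only names containers and blobs, it stays valid however the internal workspace layout changes.
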

blueww commented 2 years ago

@dkarlovi

The data structure Azurite uses to save blobs/containers is very different from the blob/container structure you see. So currently we don't see an easy way to convert between the formats to prepare the test data; instead you need to call the REST API on Azurite to create them.

dimaqq commented 2 years ago

☝🏼 sure; It would still be awesome if a list of containers could be supplied from command line. Another option would be to ingest a tarball on startup. 🙏🏼

The reason is that in production, containers are pre-created by admins (or terraform) and the client (my code) should not attempt to create containers.

Meanwhile in local dev and CI, azurite forces my client code to create containers.

This mismatch forces 2 code paths (test, prod), which almost bit us in production.
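
Until something like a command-line container list exists, the idea can be approximated outside Azurite. The sketch below is a hypothetical wrapper, not an Azurite option: the `--container` flag belongs to this script only, and it merely prints the `az storage container create` commands that an init step (like the sideloaded container earlier in this thread) would run.

```python
import argparse
import shlex


def build_commands(argv: list[str]) -> list[str]:
    """Turn repeated --container flags into az CLI container-creation commands."""
    parser = argparse.ArgumentParser(description="Hypothetical pre-create helper")
    parser.add_argument("--container", action="append", default=[],
                        help="name of a container to create (repeatable)")
    args = parser.parse_args(argv)
    # shlex.quote keeps unusual names from being misread by the shell.
    return [f"az storage container create --name {shlex.quote(name)}"
            for name in args.container]


if __name__ == "__main__":
    import sys
    print("\n".join(build_commands(sys.argv[1:])))
```

This keeps container creation out of the client code, so test and production share one code path.
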

oleg-andreyev commented 2 years ago

Absolutely required feature. <3

blueww commented 2 years ago

@dkarlovi, @dimaqq , @oleg-andreyev,

Thanks for the suggestion!

But currently we don't see an easy way to implement it, since the data structure Azurite uses to save blobs/containers is very different from the blob/container (or local folder) structure you see.

There are also requests to pre-populate Azurite from different sources, such as a local folder, Azure Blob Storage, or a defined container/blob structure, so there is no simple way to fulfill all of them.

So for now the more appropriate way to prepare the data is still to use Azure tools, such as AzCopy or a PowerShell script ...

If the data to prepare is always the same for every Azurite instance, an alternative is to prepare it once and back up the Azurite workspace folder, then start Azurite with a new copy of that backup. Please note that we make no commitment to backward compatibility of the workspace data structure. (However, in the last 1- years, we haven't had a breaking change to it.)

dkarlovi commented 2 years ago

@blueww maybe Azurite could bundle the client SDK and call it on itself? Then it's just a matter of configuring the client in some way.

blueww commented 2 years ago

@dkarlovi

Azurite is an Azure Storage server emulator. We prefer not to bundle the client SDK into Azurite to prepare data, since that would bring many issues, such as increased package size and dependency problems (breaking changes, deprecations ...). It's also quite possible that not all customers want to use the SDK to generate data.

Currently the preferred way is for customers to prepare the data with Azure tools, such as AzCopy or a PowerShell script ...

dkarlovi commented 2 years ago

I understand, it was just an idea to keep the separation clean.

Currently the preferred way is for customers to prepare the data with Azure tools, such as AzCopy or a PowerShell script ...

The way I'm reading this is: we don't want to add the feature to support putting Azurite into a known state.

That's a reasonable stance since you get to decide where to spend the resources :+1:, but since you have at least 3 Azure customers in this thread asking about it, the feature aligns with what they want to use Azurite for (testing their apps, which involves putting the storage in a known state in some way), I think it's reasonable to at least consider it seriously before rejecting it.

blueww commented 2 years ago

@dkarlovi Thanks for the suggestion!

We will keep the issue open to track this request.

Azurite welcomes contributions. It would be great if you have a good idea for implementing this and raise a PR to add it to Azurite.

Slach commented 1 year ago

@blueww @dkarlovi I tried to implement approach 2 from https://github.com/Azure/Azurite/issues/1665#issuecomment-1236796659

but the Azure CLI got an error:

PUT /devstoreaccount1/azure-backup-disk?restype=container HTTP/1.1
Host: azure:10000
User-Agent: AZURECLI/2.50.0 (DOCKER) azsdk-python-storage-blob/12.16.0 Python/3.10.12 (Linux-5.15.90.1-microsoft-standard-WSL2-x86_64-with)
Accept-Encoding: gzip, deflate
Accept: application/xml
Connection: keep-alive
x-ms-version: 2022-11-02
x-ms-client-request-id: ae136b24-1ef5-11ee-a175-0242ac160009
CommandName: storage container create
ParameterSetName: --debug --name
x-ms-date: Mon, 10 Jul 2023 07:45:03 GMT
Authorization: SharedKey devstoreaccount1:j7QVZSnHQ9zSPFXLWxHxwucCgSIrlN+xHwECNZMRgUk=
Content-Length: 0

HTTP/1.1 400 The value for one of the HTTP headers is not in the correct format.
Server: Azurite-Blob/3.16.0
x-ms-error-code: InvalidHeaderValue
x-ms-request-id: 3cb15891-4c55-4444-a7c3-7fd79a41aec1
content-type: application/xml
Date: Mon, 10 Jul 2023 07:45:03 GMT
Connection: keep-alive
Keep-Alive: timeout=5
Transfer-Encoding: chunked

160
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Error>
  <Code>InvalidHeaderValue</Code>
  <Message>The value for one of the HTTP headers is not in the correct format.
RequestId:3cb15891-4c55-4444-a7c3-7fd79a41aec1
Time:2023-07-10T07:45:03.141Z</Message>
  <HeaderName>x-ms-version</HeaderName>
  <HeaderValue>2022-11-02</HeaderValue>
</Error>
0

Is there any workaround for it? Should I create a separate issue?

blueww commented 1 year ago

Answered in https://github.com/Azure/Azurite/issues/2044#issuecomment-1630546800. It looks like you are not using the latest Azurite Docker image. Please retry after deleting the locally cached Azurite latest image.
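
The trace above is consistent with that diagnosis: the response comes from `Azurite-Blob/3.16.0`, while the CLI sends `x-ms-version: 2022-11-02`. A small sketch for pulling the relevant fields out of such an error body follows. The `describe_error` helper is hypothetical, and the sample body is the one from the trace with the multi-line message shortened.

```python
import xml.etree.ElementTree as ET

# Error body from the trace above (Message shortened to its first line).
ERROR_BODY = """<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Error>
  <Code>InvalidHeaderValue</Code>
  <Message>The value for one of the HTTP headers is not in the correct format.</Message>
  <HeaderName>x-ms-version</HeaderName>
  <HeaderValue>2022-11-02</HeaderValue>
</Error>"""


def describe_error(body: str, server: str) -> str:
    """Summarize a storage error response, flagging rejected API versions."""
    # Parse from bytes so the encoding declaration in the XML prolog is honored.
    root = ET.fromstring(body.encode("utf-8"))
    code = root.findtext("Code")
    header = root.findtext("HeaderName")
    value = root.findtext("HeaderValue")
    if code == "InvalidHeaderValue" and header == "x-ms-version":
        return f"{server} rejected API version {value}; the emulator is likely older than the client"
    return f"{server} returned {code}"
```

Here `server` is the value of the `Server` response header, for example `describe_error(ERROR_BODY, "Azurite-Blob/3.16.0")`.
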