Azure / Azurite

A lightweight server clone of Azure Storage that simulates most of the commands supported by it with minimal dependencies

Access issues using Azurite with Bulk Insert in SQL Server in same Docker Container #1474

Open plooploops opened 2 years ago

plooploops commented 2 years ago

Which service (blob, file, queue, table) does this issue concern?

Blob

Which version of the Azurite was used?

3.17.1

Where do you get Azurite? (npm, DockerHub, NuGet, Visual Studio Code Extension)

npm

What's the Node.js version?

node: v14.19.1 npm: 6.14.16

What problem was encountered?

SQL Bulk Insert isn't working with Azurite: https://docs.microsoft.com/en-us/answers/questions/242043/sql-server-bulk-insert-does-not-work-with-azure-bl.html

Steps to reproduce the issue?

  1. Set up a custom Docker image using mcr.microsoft.com/mssql/server:2019-latest as the base image, and install Azurite through npm (npm install -g azurite):
FROM mcr.microsoft.com/mssql/server:2019-latest
USER root

# Install node / npm
RUN apt-get -y update  && \
    apt-get install -y curl && \
    curl -sL https://deb.nodesource.com/setup_14.x | bash - && \
    apt-get install -y nodejs && \
    apt-get install -y dos2unix

# Install azurite
RUN npm install -g azurite

# Create app directory
RUN mkdir -p /usr/src/app
RUN mkdir -p /usr/src/app/data
WORKDIR /usr/src/app

# Copy over entrypoint script and ensure it has proper line endings
COPY entrypoint.sh /usr/src/app/
RUN dos2unix *

# Grant permissions
RUN chmod 777 /usr/src/app/entrypoint.sh && \
    chmod 777 -R /usr/src/app

# Expose port for SQL
EXPOSE 1433

# Add Azurite Ports
# Blob Storage Port
EXPOSE 10000
# Queue Storage Port
EXPOSE 10001
# Table Storage Port
EXPOSE 10002

USER mssql
ENTRYPOINT /bin/bash ./entrypoint.sh

The contents of entrypoint.sh are:

/opt/mssql/bin/sqlservr & azurite --cert /workspace/cert.pem --key /workspace/key.pem -l /usr/src/app/data --blobHost 0.0.0.0 --queueHost 0.0.0.0 --tableHost 0.0.0.0 --silent --debug debug.log
  2. Generate a self-signed certificate using openssl:
mkdir -p azurite
cd azurite
openssl req -newkey rsa:2048 -x509 -nodes -keyout key.pem -new -out cert.pem -sha256 -days 365 -addext "subjectAltName=IP:127.0.0.1" -subj "/C=CO/ST=ST/L=LO/O=OR/OU=OU/CN=CN"
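
A quick way to sanity-check the generated certificate before mounting it into the container is to confirm that the Subject Alternative Name really covers 127.0.0.1 (a minimal sketch against the cert.pem produced above):

# Inspect the SAN; the output should include "IP Address:127.0.0.1"
openssl x509 -in cert.pem -noout -text | grep -A1 "Subject Alternative Name"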
  3. Run the container:
docker run -e "ACCEPT_EULA=Y" -e "SA_PASSWORD=My-sql-password." -p 1433:1433 -p 10000:10000 -p 10001:10001 -p 10002:10002 -v $(pwd)/azurite:/workspace --name sql -h sql -d "name-of-custom-sql-with-azurite-container"
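
Once the container is running, it is worth confirming that the Azurite HTTPS listener is reachable before involving SQL Server at all. A minimal sketch, where the container name sql comes from the docker run command above; any HTTP-level response, even an authentication error, shows the listener and TLS handshake are working, while a connection or certificate error points at networking or TLS problems:

# From the host, against the published port, trusting the self-signed certificate
curl -v --cacert ./azurite/cert.pem "https://127.0.0.1:10000/devstoreaccount1?comp=list"
# From inside the container (curl was installed in the Dockerfile above)
docker exec sql curl -v --cacert /workspace/cert.pem "https://127.0.0.1:10000/devstoreaccount1?comp=list"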
  4. Use Azure Storage Explorer to connect to Azurite

  5. Create a blob container (e.g. named containername) in your Azurite Storage Account devstoreaccount1

  6. Upload a CSV file under devstoreaccount1/containername/path/some.csv

  7. Get a SAS token for your Azurite Blob Container using Azure Storage Explorer (an Azure CLI alternative is sketched below)
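
If Storage Explorer is not available, roughly the same SAS can be generated with the Azure CLI pointed at the emulator. A sketch assuming the default devstoreaccount1 account key from the Azurite README and the containername container from step 5; permissions and expiry are illustrative, and the SAS is computed locally from the key, so no call to Azurite is made:

# Generate a read/list SAS for the Azurite container via the emulator connection string
az storage container generate-sas \
  --name containername \
  --permissions rl \
  --expiry 2030-01-01T00:00:00Z \
  --connection-string "DefaultEndpointsProtocol=https;AccountName=devstoreaccount1;AccountKey=<defaultAzuriteAccountKey>;BlobEndpoint=https://127.0.0.1:10000/devstoreaccount1;" \
  --output tsv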

  8. Connect to your SQL container and run the following:

    • Case 1: HTTP location
Error message: Msg 12704 Bad or inaccessible location specified in external data source "DataSourceAzurite"
CREATE MASTER KEY ENCRYPTION BY PASSWORD='<EnterStrongPasswordHere>';

CREATE DATABASE SCOPED CREDENTIAL AzuriteSASCredential
WITH IDENTITY = 'SHARED ACCESS SIGNATURE',
SECRET = '<someSASfromStep7>';

-- Create External Data Source
CREATE EXTERNAL DATA SOURCE DataSourceAzurite
WITH ( TYPE = BLOB_STORAGE,
    LOCATION = 'http://127.0.0.1:10000/devstoreaccount1/containername',
    CREDENTIAL = AzuriteSASCredential);

-- Get Error message: Msg 12704 Bad or inaccessible location specified in external data source "DataSourceAzurite"
-- some table will have matching columns for some.csv
BULK INSERT some_table
FROM 'path/some.csv'
WITH (DATA_SOURCE = 'DataSourceAzurite',
    FIRSTROW = 2,
    FIELDTERMINATOR = '\t',
    ROWTERMINATOR = '0x0a',
    TABLOCK);

-- Get Error message: Msg 12704 Bad or inaccessible location specified in external data source "DataSourceAzurite"
SELECT * FROM OPENROWSET(
   BULK  'containername/path/some.csv',
   DATA_SOURCE = 'DataSourceAzurite',
   SINGLE_CLOB) AS DataFile;
    • Case 2: HTTPS location
Error message: Msg 12703 Referenced external data source "DataSourceAzurite" not found.
CREATE MASTER KEY ENCRYPTION BY PASSWORD='<EnterStrongPasswordHere>';

CREATE DATABASE SCOPED CREDENTIAL AzuriteSASCredential
WITH IDENTITY = 'SHARED ACCESS SIGNATURE',
SECRET = '<someSASfromStep7>';

-- Create External Data Source
CREATE EXTERNAL DATA SOURCE DataSourceAzurite
WITH ( TYPE = BLOB_STORAGE,
    LOCATION = 'https://127.0.0.1:10000/devstoreaccount1/containername',
    CREDENTIAL = AzuriteSASCredential);

-- Get Error message: Msg 12703 Referenced external data source "DataSourceAzurite" not found.
-- some table will have matching columns for some.csv
BULK INSERT some_table
FROM 'path/some.csv'
WITH (DATA_SOURCE = 'DataSourceAzurite',
    FIRSTROW = 2,
    FIELDTERMINATOR = '\t',
    ROWTERMINATOR = '0x0a',
    TABLOCK);

-- Get Error message: Msg 12703 Referenced external data source "DataSourceAzurite" not found.
SELECT * FROM OPENROWSET(
   BULK  'containername/path/some.csv',
   DATA_SOURCE = 'DataSourceAzurite',
   SINGLE_CLOB) AS DataFile;
    • Case 3: account key credential instead of a SAS
Error message: Msg 4861 Cannot bulk load because the file "path/some.csv" could not be opened. Operating system error code 86 (The specified network password is not correct.).
-- Create a credential with the account key instead
CREATE DATABASE SCOPED CREDENTIAL AzuriteCred
WITH IDENTITY = 'devstoreaccount1',
    SECRET = 'defaultStorageAccountKey'; -- https://github.com/Azure/Azurite#default-storage-account

CREATE EXTERNAL DATA SOURCE DSAzuriteWithCred
WITH ( TYPE = BLOB_STORAGE,
    LOCATION = 'https://127.0.0.1:10000/devstoreaccount1/containername',
    CREDENTIAL = AzuriteCred);

-- Msg 4861
-- Cannot bulk load because the file "path/some.csv" could not be opened. Operating system error code 86 (The specified network password is not correct.).
BULK INSERT some_table
FROM 'path/some.csv'
WITH (DATA_SOURCE = 'DSAzuriteWithCred',
    FIRSTROW = 2,
    FIELDTERMINATOR = '\t',
    ROWTERMINATOR = '0x0a',
    TABLOCK);

-- Msg 4861
-- Cannot bulk load because the file "path/some.csv" could not be opened. Operating system error code 86 (The specified network password is not correct.).
SELECT * FROM OPENROWSET(
   BULK  'path/some.csv',
   DATA_SOURCE = 'DSAzuriteWithCred',
   SINGLE_CLOB) AS DataFile;
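
Before concluding that SQL Server is at fault, the SAS and endpoint can be validated in isolation with plain curl: if fetching the blob directly with the same token succeeds, the Azurite side is fine and the failure is specific to the BLOB_STORAGE external data source. A sketch, where <someSASfromStep7> is the SAS from step 7 without a leading '?':

# Fetch the uploaded CSV directly with the same SAS the external data source uses
curl --cacert ./azurite/cert.pem "https://127.0.0.1:10000/devstoreaccount1/containername/path/some.csv?<someSASfromStep7>"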

If possible, please provide the debug log using the -d parameter, replacing <pathtodebuglog> with an appropriate path for your OS, or review the instructions for docker containers:

-d "<pathtodebuglog>"

Please be sure to remove any PII or sensitive information before sharing!
The debug log will log raw request headers and bodies, so that we can replay these against Azurite using REST and create tests to validate resolution.
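
Since the entrypoint above starts Azurite with --debug debug.log from the /usr/src/app working directory, the log can be copied out of the running container for sharing (a small sketch; the path is an assumption based on the Dockerfile and entrypoint.sh shown earlier):

# Copy the Azurite debug log out of the SQL container; scrub SAS tokens and other secrets before posting
docker cp sql:/usr/src/app/debug.log ./azurite-debug.log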

info: AccountDataStore:init() Fallback to default emulator account devstoreaccount1.
info: QueueGCManager:markSweepLoop() Start new mark and sweep.
info: QueueGCManger:markSweep() Get all extents.     
info: QueueGCManager:marksweep() Get 0 extents.      
info: QueueGCManager:markSweep() Get referred extents, then remove from allExtents.
info: QueueGCManager:markSweep() Got referred extents, unreferenced extents count is 0.
info: QueueGCManager:markSweepLoop() Mark and sweep finished, take 0ms.
info: QueueGCManager:markSweepLoop() Sleep for 60000 
info: AccountDataStore:init() Refresh accounts from environment variable AZURITE_ACCOUNTS with value undefined
info: AccountDataStore:init() Fallback to default emulator account devstoreaccount1.

Have you found a mitigation/solution?

Workaround: load the CSV from a local file path inside the SQL container instead of going through the Azurite blob endpoint (see the sketch below).
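
A rough sketch of that workaround: copy the CSV into the SQL container and point BULK INSERT at a local path, which bypasses the external data source entirely. The container name sql, the SA password, and the /usr/src/app/data directory come from the steps above; the sqlcmd path and <yourDatabase> are assumptions that may differ by image version:

# Copy the CSV into the container and bulk load it from the local filesystem
docker cp ./some.csv sql:/usr/src/app/data/some.csv
docker exec sql /opt/mssql-tools/bin/sqlcmd -S localhost -U sa -P 'My-sql-password.' -d <yourDatabase> -Q \
  "BULK INSERT some_table FROM '/usr/src/app/data/some.csv' WITH (FIRSTROW = 2, FIELDTERMINATOR = '\t', ROWTERMINATOR = '0x0a', TABLOCK);"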

edwin-huber commented 2 years ago

Hi, I missed where / how you created the self-signed certificate and used it with your Docker image.

  1. Did you use the PFX or PEM?
  2. Did you use a local file and copy it to the Docker container, or build the cert inside the container?
  3. If you expose Azurite, are you able to connect to the instance running inside the SQL container?
  4. If you can share the Dockerfile you used for the HTTPS instance, that would be a useful starting point for a repro.

Thanks!

plooploops commented 2 years ago

Updated the flow above (step 2, the openssl instructions) to cover setting up the self-signed certificate, along with notes on running the Docker container and the entrypoint.sh script.

  1. Using PEM
  2. Volume mount the folder with the PEM files
  3. Yes, you can use Azure Storage Explorer and import the self-signed certificate
  4. The Dockerfile used is included in the notes above, but here it is again for reference:
FROM mcr.microsoft.com/mssql/server:2019-latest
USER root

# Install node / npm
RUN apt-get -y update  && \
    apt-get install -y curl && \
    curl -sL https://deb.nodesource.com/setup_14.x | bash - && \
    apt-get install -y nodejs && \
    apt-get install -y dos2unix

# Install azurite
RUN npm install -g azurite

# Create app directory
RUN mkdir -p /usr/src/app
RUN mkdir -p /usr/src/app/data
WORKDIR /usr/src/app

# Copy over entrypoint script and ensure it has proper line endings
COPY entrypoint.sh /usr/src/app/
RUN dos2unix *

# Grant permissions
RUN chmod 777 /usr/src/app/entrypoint.sh && \
    chmod 777 -R /usr/src/app

# Expose port for SQL
EXPOSE 1433

# Add Azurite Ports
# Blob Storage Port
EXPOSE 10000
# Queue Storage Port
EXPOSE 10001
# Table Storage Port
EXPOSE 10002

USER mssql
ENTRYPOINT /bin/bash ./entrypoint.sh

And entrypoint.sh is:

/opt/mssql/bin/sqlservr & azurite --cert /workspace/cert.pem --key /workspace/key.pem -l /usr/src/app/data --blobHost 0.0.0.0 --queueHost 0.0.0.0 --tableHost 0.0.0.0 --silent --debug debug.log
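
One more thing that may be worth ruling out for the HTTPS case: SQL Server itself has to trust the self-signed certificate, and passing the PEM to Azurite alone does not make the SQL engine trust it. As a diagnostic experiment only (not a confirmed fix for Msg 12703), the certificate can be added to the container's system CA store, assuming the Ubuntu-based mcr.microsoft.com/mssql/server image and the /workspace mount from the docker run command above:

# Add the self-signed cert to the system trust store inside the running container, then restart it
# (install ca-certificates first if update-ca-certificates is missing)
docker exec -u root sql bash -c 'cp /workspace/cert.pem /usr/local/share/ca-certificates/azurite.crt && update-ca-certificates'
docker restart sql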
ShaneCastle-CL commented 5 months ago

Are there any updates on this issue? I'm hitting the same error following the same procedure.

Msg 12704, Level 16, State 1, Line 26 Bad or inaccessible location specified in external data source "AzuriteDataSource"