Azure / kusto-copy

Tool enabling Azure Data Explorer / Kusto cluster replication
MIT License

Example of the lake connection string #11

Open tojuasp opened 1 year ago

tojuasp commented 1 year ago

Hi!

We are attempting to migrate data from multiple ADX clusters to new environments. This tool came up in a Google search and looked worth trying.

However, the lake connection string is currently causing some issues. It does not seem to be any kind of standard storage connection string... I tried looking at the code as well, but I don't want to spend too much time digging into an issue when I can just ask :)

Could you give an example of the lake connection string that should work?

AleMiguelMicrosoft commented 1 year ago

These instructions will connect to the source and destination clusters using the specified service principal credentials for authentication.

To use a service principal for authentication, follow these step-by-step instructions to set up the connection string and resources using the Azure portal:

Create a service principal:
a. Go to the Azure portal (https://portal.azure.com/).
b. Search for "Azure Active Directory" and click on it.
c. In the left menu, click on "App registrations" and then click on "+ New registration."
d. Provide a name for your application, and leave the other settings as default. Click "Register."
e. After the registration is complete, you will see your new application's details. Note down the "Application (client) ID" and "Directory (tenant) ID."

Create a secret for the service principal:
a. In your newly created application, click on "Certificates & secrets" in the left menu.
b. Click on "+ New client secret."
c. Provide a description, choose an expiration period, and click "Add."
d. Note down the "Value" of the newly created secret. This is your "applicationKey," and you won't be able to see it again.
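
For those who prefer the command line, the portal steps above can be approximated with the Azure CLI. This is only a sketch: it assumes the Azure CLI is installed and that `az login` has already been run, and the `AZ` variable and the function name are hypothetical helpers (setting `AZ=echo` prints the command instead of running it).

```shell
# AZ is a hypothetical hook: set AZ=echo to dry-run (print) the command.
AZ="${AZ:-az}"

# create_service_principal NAME
# Creates an app registration, its service principal, and a client secret.
# The JSON printed by the Azure CLI contains:
#   appId    -> "Application (client) ID"
#   password -> the client secret "Value" (shown only once)
#   tenant   -> "Directory (tenant) ID"
create_service_principal() {
    "$AZ" ad sp create-for-rbac --name "$1" --output json
}

# Example (needs a prior `az login`); the name is a placeholder:
# create_service_principal kusto-copy-source
```

As in the portal, the secret is only displayed at creation time, so store it somewhere safe right away.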

Grant your service principal access to the Kusto cluster:
a. In the Azure portal, navigate to your Kusto cluster.
b. Click on "Access control (IAM)" in the left menu.
c. Click on "+ Add" and then "Add role assignment."
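d. Select a role, such as "Contributor" or "Reader," and search for your service principal's name in the "Select" field. Choose your service principal and click "Save." These IAM roles apply to the cluster as an Azure resource; reading or writing data may also require database-level permissions granted inside Kusto.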
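
The same role assignment can be sketched with the Azure CLI. The function names and the `AZ` dry-run hook are hypothetical, and the subscription ID, resource group, and cluster name are values you would supply yourself. Which role kusto-copy actually needs is not stated in this thread, so "Reader" below is just the example role from the steps above.

```shell
# AZ is a hypothetical hook: set AZ=echo to dry-run (print) the command.
AZ="${AZ:-az}"

# kusto_cluster_scope SUBSCRIPTION_ID RESOURCE_GROUP CLUSTER_NAME
# Prints the Azure resource ID of a Kusto cluster, used as the role scope.
kusto_cluster_scope() {
    printf '/subscriptions/%s/resourceGroups/%s/providers/Microsoft.Kusto/clusters/%s\n' \
        "$1" "$2" "$3"
}

# assign_cluster_role APP_CLIENT_ID ROLE SCOPE
# Assigns ROLE to the service principal identified by APP_CLIENT_ID at SCOPE.
assign_cluster_role() {
    "$AZ" role assignment create --assignee "$1" --role "$2" --scope "$3"
}

# Example (placeholder values, needs a prior `az login`):
# scope=$(kusto_cluster_scope "<subscription-id>" "<resource-group>" "<cluster-name>")
# assign_cluster_role "<application-client-id>" "Reader" "$scope"
```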

To pass the AppClientId, AppKey, and Authority Id on the command line, replace the source and destination cluster URLs with Kusto-style connection strings. Here's an example:

kusto-copy -l https://{YOURSTORAGEACCT}.blob.core.windows.net/{YOURBLOBSTORAGECONTAINER}/checkpoints -s "Data Source=https://{YOURSOURCEADX}.{YOURSOURCEADXREGION}.kusto.windows.net/;Database={YOURSOURCEADXDB};Fed=True;AppClientId={sourceAppClientId};AppKey={sourceAppKey};Authority Id={sourceAuthorityId}" -d "Data Source=https://{YOURDESTADX}.{YOURDESTADXREGION}.kusto.windows.net/;Database={YOURDESTADXDB};Fed=True;AppClientId={destinationAppClientId};AppKey={destinationAppKey};Authority Id={destinationAuthorityId}" --db {YOURDESTADXDB} --tables-include {LIST OF TABLES TO COPY}

Replace {sourceAppClientId}, {sourceAppKey}, {sourceAuthorityId}, {destinationAppClientId}, {destinationAppKey}, and {destinationAuthorityId} with the respective values for your source and destination clusters.
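
Because the connection strings are long and easy to mistype, it may help to assemble them from variables. The sketch below follows the key order of the example command above; the function name, cluster URI, database name, and GUIDs are made-up placeholders, and the secret is read from a hypothetical `SOURCE_APP_KEY` environment variable rather than written inline.

```shell
# kusto_connection_string CLUSTER_URI DATABASE APP_CLIENT_ID APP_KEY AUTHORITY_ID
# Prints a Kusto-style connection string in the format shown in the example command.
kusto_connection_string() {
    printf 'Data Source=%s;Database=%s;Fed=True;AppClientId=%s;AppKey=%s;Authority Id=%s\n' \
        "$1" "$2" "$3" "$4" "$5"
}

# Hypothetical values for illustration only.
SOURCE_CS=$(kusto_connection_string \
    "https://mysource.westeurope.kusto.windows.net/" \
    "SourceDb" \
    "00000000-0000-0000-0000-000000000001" \
    "${SOURCE_APP_KEY:-placeholder-secret}" \
    "00000000-0000-0000-0000-0000000000aa")

printf '%s\n' "$SOURCE_CS"
```

The result can then be passed quoted, for example `-s "$SOURCE_CS"`, with a second string built the same way for `-d`.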

To get the values for {sourceAppClientId}, {sourceAppKey}, {sourceAuthorityId}, {destinationAppClientId}, {destinationAppKey}, and {destinationAuthorityId}, you need to create and configure two Azure Active Directory (AAD) applications, one for the source cluster and one for the destination cluster.

Follow these steps to create an AAD application and get the required values:

  1. Sign in to the Azure portal.
  2. Navigate to "Azure Active Directory" from the left-hand menu.
  3. Click on "App registrations" and then "New registration".
  4. Provide a name for the application, and then click "Register".
  5. Once the application is created, note down the "Application (client) ID" and "Directory (tenant) ID" from the application's "Overview" page. The "Application (client) ID" corresponds to the {AppClientId} value, while the "Directory (tenant) ID" corresponds to the {AuthorityId} value.
  6. Click on "Certificates & secrets" in the left-hand menu, then click "New client secret" in the "Client secrets" section. Provide a description and an expiration time for the client secret, and then click "Add".
  7. After the client secret is created, note down its "Value" as it will not be visible again. This "Value" corresponds to the {AppKey} value.

Repeat the above steps for both the source and destination clusters to get their respective values.

Now, replace {sourceAppClientId}, {sourceAppKey}, {sourceAuthorityId}, {destinationAppClientId}, {destinationAppKey}, and {destinationAuthorityId} in your Kusto Copy command line with the values you've obtained from the Azure portal.
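
A quick way to catch a forgotten placeholder before running the tool is to scan the command line for any remaining `{...}` token. This is a small sketch with hypothetical function names; it assumes none of your real values contain curly braces and that your `grep` supports `-o` (GNU, BSD, and BusyBox grep do).

```shell
# has_placeholders STRING
# Succeeds (exit status 0) if STRING still contains a {PLACEHOLDER} token.
has_placeholders() {
    printf '%s\n' "$1" | grep -q '{[^{}]*}'
}

# first_placeholder STRING
# Prints the first remaining {PLACEHOLDER} token, if any.
first_placeholder() {
    printf '%s\n' "$1" | grep -o '{[^{}]*}' | head -n 1
}

# Example: a connection string with one value still unfilled.
cs='Fed=True;AppClientId=my-app-id;AppKey={sourceAppKey};Authority Id=my-tenant-id'
if has_placeholders "$cs"; then
    printf 'Still unfilled: %s\n' "$(first_placeholder "$cs")"
fi
```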

Keep in mind that you should also grant the necessary permissions to the AAD applications on the source and destination Kusto clusters and the storage account, as explained in the previous instructions.

When assigning the role on the storage account, the "Add role assignment" pane asks what to assign access to. Since you want to grant access to your application using its Application (Client) ID, choose the first option: "User, group, or service principal". This option allows you to assign the "Storage Blob Data Contributor" role to the service principal associated with your application.

Here's a recap of the steps:

  1. In the "Add role assignment" pane, select the "Storage Blob Data Contributor" role.
  2. Choose "User, group, or service principal" as the assignment type.
  3. In the "Select" field, search for your application by its Application (Client) ID or name.
  4. When you find the correct application, click on it to select it.
  5. Click on the "Save" button to assign the selected role to your application's service principal.

Once you've completed these steps, your application should have the necessary permissions to access the Azure Blob Storage using its Application (Client) ID and secret.
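
To tie this back to the original question: in the example command, the value passed to `-l` is a plain blob URL, not a storage connection string with account keys. The sketch below builds that URL and grants the storage role with the Azure CLI. The function names and the `AZ` dry-run hook are hypothetical, and the `checkpoints` folder is simply copied from the example command; whether the tool requires that folder name is not stated in this thread.

```shell
# AZ is a hypothetical hook: set AZ=echo to dry-run (print) the command.
AZ="${AZ:-az}"

# lake_url STORAGE_ACCOUNT CONTAINER [FOLDER]
# Prints the blob URL in the shape used for -l in the example command.
lake_url() {
    printf 'https://%s.blob.core.windows.net/%s/%s\n' "$1" "$2" "${3:-checkpoints}"
}

# storage_account_scope SUBSCRIPTION_ID RESOURCE_GROUP STORAGE_ACCOUNT
# Prints the Azure resource ID of the storage account, used as the role scope.
storage_account_scope() {
    printf '/subscriptions/%s/resourceGroups/%s/providers/Microsoft.Storage/storageAccounts/%s\n' \
        "$1" "$2" "$3"
}

# grant_blob_contributor APP_CLIENT_ID SCOPE
# Assigns "Storage Blob Data Contributor" to the service principal at SCOPE.
grant_blob_contributor() {
    "$AZ" role assignment create --assignee "$1" \
        --role "Storage Blob Data Contributor" --scope "$2"
}

# Example with placeholder names:
lake_url mystorageacct mycontainer
```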