github / gh-gei

Migration CLI for GitHub to GitHub migrations
MIT License

Add support for AAD auth to Azure Storage #673

Open dylan-smith opened 2 years ago

dylan-smith commented 2 years ago

When migrating from GHES -> GHEC we use Azure Storage as an intermediate storage location for the migration archives. Today we authenticate to Azure Storage by requiring the user to provide an Azure Storage connection string, which includes a shared key.

Some customers would prefer that we authenticate with an AAD identity instead of a shared key, to align with their security policies. We should be able to add support to the CLI to accept an AAD service principal client ID and secret and use those to authenticate to the storage account.
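For illustration, here is a minimal sketch of what service principal auth involves at the protocol level, assuming the standard OAuth 2.0 client-credentials flow against Azure AD. It is written in Python with only the standard library and is not the CLI's implementation; the function names and the placeholder tenant, client ID and secret are hypothetical. The request is built but never sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Scope that asks Azure AD for a token usable against Azure Storage.
STORAGE_SCOPE = "https://storage.azure.com/.default"


def build_token_request(tenant_id: str, client_id: str, client_secret: str) -> Request:
    """Build (but do not send) a client-credentials token request."""
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    body = urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        "scope": STORAGE_SCOPE,
    }).encode("ascii")
    return Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


def blob_auth_headers(access_token: str) -> dict:
    """Headers for a Blob Storage REST call authorized with a bearer token
    instead of a shared key. The API version shown is one valid choice."""
    return {
        "Authorization": f"Bearer {access_token}",
        "x-ms-version": "2021-08-06",
    }


# Hypothetical placeholder values, for illustration only.
request = build_token_request("my-tenant-id", "my-client-id", "my-client-secret")
print(request.get_method(), request.full_url)
```

The service principal would also need a data-plane role on the storage account (for example Storage Blob Data Contributor) for the resulting token to be accepted.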

timrogers commented 2 years ago

@CSMonkee Thanks for the follow-up on this! We don't have immediate plans to add support for AAD authentication in the CLI. We recognise the value, but we don't have capacity right now. Is that what the customer is looking for still? I'm not sure I understand the diagram.

If it isn't going to be possible for the customer to use a shared key, as is supported today, then there is a possible workaround: the customer generates the migration archives manually with ghe-migrator, uploads them to blob storage as they wish, and then uses the CLI to actually start the migration. If that might work, I'd be happy to share more details.

danielmeppiel commented 1 year ago

@timrogers, does that mean that using a User Delegation SAS would be technically feasible? In other words:

  1. Generate the migration archives manually with ghe-migrator
  2. Upload them to Azure blob storage of your choice, in the preferred infrastructure and network.
  3. Create pre-signed URLs to the archive using Azure's User Delegation SAS.
  4. Use the GEI CLI to actually start the migration, importing the archives into GitHub.com from the pre-signed URLs above, which we would pass to the GEI CLI. This would entail using the options: --git-archive-url GIT_SOURCE_ARCHIVE_URL --metadata-archive-url METADATA_ARCHIVE_URL
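
As a rough sketch of step 4, the snippet below assembles the argument list in Python. Only the two archive URL options come from this thread; the `migrate-repo` subcommand name, the `extra_args` parameter (for the source/target org and repo options, which are omitted here) and the example URLs are assumptions for illustration.

```python
import shlex


def build_import_args(git_archive_url, metadata_archive_url, extra_args=()):
    """Assemble a GEI CLI invocation that imports pre-uploaded archives.

    The subcommand name is an assumption; the two archive URL options
    are the ones named in the steps above.
    """
    for url in (git_archive_url, metadata_archive_url):
        if not url.startswith("https://"):
            raise ValueError(f"expected an https:// URL, got: {url!r}")
    return [
        "gh", "gei", "migrate-repo",
        *extra_args,  # hypothetical: source/target org and repo options
        "--git-archive-url", git_archive_url,
        "--metadata-archive-url", metadata_archive_url,
    ]


# Hypothetical pre-signed URLs produced in step 3.
args = build_import_args(
    "https://example.blob.core.windows.net/archives/git.tar.gz?sp=r&sig=abc",
    "https://example.blob.core.windows.net/archives/metadata.tar.gz?sp=r&sig=def",
)
# shlex.join quotes the URLs so the '&' and '?' characters survive a shell.
print(shlex.join(args))
```

Passing the list to `subprocess.run(args)` rather than a shell string avoids quoting problems with the `&` characters in SAS query strings.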

It would be great to have your opinion on the above, as this would unblock the migration path.

I also want to add more color to the motivation behind this feature request: When placing highly restricted sensitive code in storage accessible from the Internet, the security constraints imposed by large enterprises are typically tight - firewalls/IP allowlists are often a must, and using a shared key as an authentication method is deemed risky.

Thank you for clarifying!

timrogers commented 1 year ago

@danielmeppiel Yes - that should work, as long as the end result of step 3 is a URL which GitHub can access.
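
A quick local sanity check on the URL from step 3 might look like the sketch below, assuming standard Azure SAS query parameters (`sp` for permissions, `se` for expiry, `sig` for the signature, and `skoid` on a User Delegation SAS). The function names are hypothetical. It cannot tell whether GitHub can actually reach the storage account through a firewall or IP allowlist; it only catches obvious problems such as a missing read permission or an expired token.

```python
from datetime import datetime, timezone
from urllib.parse import parse_qs, urlsplit


def check_sas_url(url, now=None):
    """Return a list of problems found; an empty list means none were found."""
    now = now or datetime.now(timezone.utc)
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    problems = []

    if parts.scheme != "https":
        problems.append("URL is not HTTPS")
    if "sig" not in query:
        problems.append("no SAS signature (sig)")
    if "r" not in query.get("sp", [""])[0]:
        problems.append("no read permission (sp)")

    expiry = query.get("se", [None])[0]
    if expiry is None:
        problems.append("no expiry (se)")
    else:
        try:
            parsed = datetime.fromisoformat(expiry.replace("Z", "+00:00"))
            if parsed.tzinfo is None:  # date-only values are treated as UTC
                parsed = parsed.replace(tzinfo=timezone.utc)
            if parsed <= now:
                problems.append("SAS has expired")
        except ValueError:
            problems.append("unparseable expiry (se)")
    return problems


def is_user_delegation_sas(url):
    """A User Delegation SAS carries the signing identity's object ID (skoid)."""
    return "skoid" in parse_qs(urlsplit(url).query)
```

Since the import runs asynchronously on GitHub's side, the expiry should leave enough headroom for the archive to be downloaded after the migration is queued.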