Azure / bicep

Bicep is a declarative language for describing and deploying Azure resources

Registry - Potential Git-based user experience #3556

Closed majastrz closed 3 years ago

majastrz commented 3 years ago

Goals

This is intended to provide an approximation of the Bicep registry experience IF we choose Git as the mechanism for sharing Bicep modules. The existence of this issue DOES NOT indicate that Git was selected as the implementation of the Bicep Registry. (See #2128 for details about other candidates.)

Q: Do we support other version control systems? (No?)

Gallery Experience

Git on its own does not provide any gallery-like experience. However, hosted Git repos typically appear in search results and markdown files are rendered as HTML when you browse through repos.

Reference module from an artifact

To reference a module from a Git repo, the user opens a new or existing Bicep file in VS Code and types one or more declarations like the following:

GitHub

// single module per repo
module mod 'github:ExampleOrg/example-repo@v0.4' = {
  ...
}

// one module in the repo out of many
module mod2 'github:ExampleOrg/example-repo@v0.4/module-name' = {
  ...
}

The GitHub module reference has the following components:

  1. github - scheme indicating that the remainder of the string is a reference to a module in a GitHub repo.
  2. ExampleOrg - a GitHub organization name
  3. example-repo - the repo name within the organization
  4. v0.4 - valid Git ref pointing to a tag, branch, or a commit ID. This is conceptually similar to a module version.
  5. module-name - the name of the module in the repo at the specified Git ref.
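To make the component breakdown concrete, here is a minimal Python sketch of how such a reference string could be split. The names `GitHubModuleRef` and `parse_github_ref` are hypothetical and not part of Bicep. The sketch also assumes the Git ref contains no `/`, which would not hold for branch names like `feature/x`; the proposal leaves that ambiguity open.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class GitHubModuleRef:
    org: str
    repo: str
    ref: str                 # tag, branch, or commit ID
    module: Optional[str]    # None when the repo holds a single module


def parse_github_ref(reference: str) -> GitHubModuleRef:
    """Split 'github:<org>/<repo>@<ref>[/<module>]' into its components."""
    scheme, sep, rest = reference.partition(":")
    if sep != ":" or scheme != "github":
        raise ValueError(f"not a github module reference: {reference!r}")

    repo_path, sep, ref_and_module = rest.partition("@")
    if sep != "@" or not ref_and_module:
        raise ValueError(f"missing Git ref in {reference!r}")

    org, _, repo = repo_path.partition("/")
    if not org or not repo or "/" in repo:
        raise ValueError(f"expected <org>/<repo> in {reference!r}")

    # Assumption: the ref has no '/', so the first '/' after '@'
    # marks the start of the module name.
    ref, _, module = ref_and_module.partition("/")
    if not ref:
        raise ValueError(f"empty Git ref in {reference!r}")
    return GitHubModuleRef(org, repo, ref, module or None)
```

For the two declarations above, `parse_github_ref` yields `module=None` for the single-module form and `module="module-name"` for the multi-module form.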

Q: Should the "version" component go at the end of the string instead? That is possible, at the cost of reduced completion quality.

Q: Do Bicep files need to be repo URI-agnostic? Is there a need to deploy unmodified Bicep modules that point to mirror repos in environments with restricted networking?

Q: Do we need to integrate with the GitHub accounts extension in VS Code? GitHub exposes APIs that can be used to power completions. However, anonymous calls are aggressively throttled. This can be mitigated by making authenticated calls, which are required for private orgs and private repos regardless.

ADO

// single module per repo
module mod 'ado:exampleorg/ExampleProject/ExampleRepo@v0.4' = {
  ...
}

// one module in the repo out of many
module mod2 'ado:exampleorg/ExampleProject/ExampleRepo@v0.4/module-name' = {
  ...
}

The ADO module reference has the following components:

  1. ado - scheme indicating that the remainder of the string is a reference to a module in an Azure DevOps repo.
  2. exampleorg - an ADO organization name (also known as account name)
  3. ExampleProject - the name of the project within the organization
  4. ExampleRepo - the repo name within the project
  5. v0.4 - valid Git ref pointing to a tag, branch, or a commit ID. This is conceptually similar to a module version.
  6. module-name - the name of the module in the repo at the specified Git ref.
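The restore step would eventually need a clone URL for such a reference. The issue does not say how that mapping would work, so the Python sketch below is an assumption: it uses the `https://dev.azure.com/<org>/<project>/_git/<repo>` form used by Azure DevOps Services and ignores legacy `visualstudio.com` URLs and on-premises servers. The function name `ado_clone_url` is hypothetical.

```python
from urllib.parse import quote


def ado_clone_url(reference: str) -> str:
    """Map 'ado:<org>/<project>/<repo>@<ref>[/<module>]' to an HTTPS clone URL."""
    scheme, _, rest = reference.partition(":")
    if scheme != "ado":
        raise ValueError(f"not an ado module reference: {reference!r}")

    repo_path, sep, _ref_and_module = rest.partition("@")
    parts = repo_path.split("/")
    if sep != "@" or len(parts) != 3 or not all(parts):
        raise ValueError(f"expected <org>/<project>/<repo>@<ref> in {reference!r}")

    # Percent-encode each segment; project names may contain spaces.
    org, project, repo = (quote(part, safe="") for part in parts)
    return f"https://dev.azure.com/{org}/{project}/_git/{repo}"
```

The Git ref and module name are deliberately dropped here, since they only matter after the clone.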

Other Git providers or SSH

For scenarios involving other Git hosting providers, we can fall back to specifying a direct clone URI. This is also useful with SSH authentication to any repo including GitHub or ADO hosted repos.

// single module per repo
module mod 'git:<clone url>@v0.4' = {
  ...
}

// one module in the repo out of many
module mod2 'git:<clone url>@v0.4/module-name' = {
  ...
}
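One wrinkle with the git scheme is that SSH clone URLs contain an `@` of their own (for example `git@host:org/repo.git`), so the ref separator has to be the last `@` in the string. A minimal Python sketch of that split follows. The function name `split_git_reference` and the `example.com` hosts are hypothetical, and the sketch assumes neither the ref nor the module name contains `@`.

```python
from typing import Optional, Tuple


def split_git_reference(reference: str) -> Tuple[str, str, Optional[str]]:
    """Split 'git:<clone url>@<ref>[/<module>]' into (clone_url, ref, module)."""
    prefix = "git:"
    if not reference.startswith(prefix):
        raise ValueError(f"not a git module reference: {reference!r}")

    # Use the LAST '@' so that 'git@host:...' style URLs stay intact.
    clone_url, sep, ref_and_module = reference[len(prefix):].rpartition("@")
    if sep != "@" or not clone_url or not ref_and_module:
        raise ValueError(f"expected <clone url>@<ref> in {reference!r}")

    # Assumption: the ref has no '/', so the first '/' starts the module name.
    ref, _, module = ref_and_module.partition("/")
    if not ref:
        raise ValueError(f"empty Git ref in {reference!r}")
    return clone_url, ref, module or None
```

A reference that omits the ref but has credentials in the URL (such as `https://user@host/...`) would be misparsed by this approach, which is one argument for making the ref mandatory.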

Example clone URLs:

VS Code experience

We should be able to provide reliable completions for module paths once the repo has been cloned to the local file system. Completions for tags and branch names should also be fairly reliable. For the git scheme, the remaining components are URIs and therefore cannot be autocompleted. For the github and ado schemes, we may be able to use the corresponding APIs to enumerate each component of the reference, but doing so adds complexity.

If the current module has any external module references, the language server queues up the module restore operation. If the repo is not present in the local file system, it will be cloned.
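The "clone only if missing" behavior could look roughly like the Python sketch below. Everything in it is an assumption made for illustration: the cache key (a hash of clone URL and ref), the function names, and the injected `clone` callable, which stands in for whatever actually runs `git clone`.

```python
import hashlib
from pathlib import Path
from typing import Callable, Iterable, List, Tuple

RepoRef = Tuple[str, str]  # (clone URL, Git ref)


def cache_dir_for(cache_root: Path, repo: RepoRef) -> Path:
    """Hypothetical cache layout: one directory per (clone URL, ref) pair."""
    clone_url, git_ref = repo
    digest = hashlib.sha256(f"{clone_url}@{git_ref}".encode("utf-8")).hexdigest()
    return cache_root / digest[:16]


def restore_missing(
    repos: Iterable[RepoRef],
    cache_root: Path,
    clone: Callable[[str, str, Path], None],
) -> List[RepoRef]:
    """Clone every referenced repo that is not cached yet; return those cloned."""
    restored = []
    for repo in dict.fromkeys(repos):  # de-duplicate while keeping order
        target = cache_dir_for(cache_root, repo)
        if target.exists():
            continue  # already restored, nothing to do
        clone(repo[0], repo[1], target)
        restored.append(repo)
    return restored
```

Passing the clone operation in as a parameter keeps the sketch independent of the Git CLI and makes the skip-if-present logic easy to exercise on its own.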

The background restore operation does not happen instantly. Until the repo is cloned to the local cache, accurate type information is not available:

If/when restore fails:

Versioning

Git makes no immutability guarantees whatsoever:

The enforcement of any versioning rules in any repo is the responsibility of the repo maintainers.

Q: Do we allow references to main or master? (main or master are like any other ref, so we should.)
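Because tags can be moved and branches advance, the only ref form that names fixed content is a commit ID. If tooling ever wanted to warn about mutable refs, a first approximation could be the Python check below. It is a heuristic sketch with a hypothetical function name: it only recognizes full 40-character SHA-1 IDs, so abbreviated IDs and SHA-256 repos are treated as mutable, and a branch that happens to be named like a commit ID would fool it.

```python
import re

_FULL_SHA1 = re.compile(r"[0-9a-fA-F]{40}")


def looks_like_commit_id(ref: str) -> bool:
    """Return True if ref has the shape of a full SHA-1 commit ID."""
    return _FULL_SHA1.fullmatch(ref) is not None
```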

CLI

bicep build

Since Bicep modules can contain references to modules that exist in other Git repositories, the module contents may not exist on the local system. If the Bicep file contains references to external modules, the bicep build command pulls the referenced artifacts before type checking and code generation and stores them on the local file system. (In the compiler pipeline, the pull step occurs after parsing but before type checking is done.)
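To illustrate why the pull step sits between parsing and type checking, here is a rough Python sketch that finds the external references in a Bicep file. It is a stand-in only: a real compiler would walk the syntax tree produced by the parser, whereas this uses a regular expression, and the names `EXTERNAL_SCHEMES` and `external_module_paths` are hypothetical.

```python
import re
from typing import List

# Schemes proposed in this issue; anything else is treated as a local path.
EXTERNAL_SCHEMES = ("github:", "ado:", "git:")

# Matches declarations of the form:  module <name> '<path>' =
_MODULE_DECL = re.compile(r"^\s*module\s+\w+\s+'([^']+)'\s*=", re.MULTILINE)


def external_module_paths(bicep_source: str) -> List[str]:
    """Return module paths that must be pulled before type checking."""
    paths = _MODULE_DECL.findall(bicep_source)
    return [path for path in paths if path.startswith(EXTERNAL_SCHEMES)]
```

Only once these paths have been restored to the local file system can the type checker see the parameters and outputs of the referenced modules.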

By default, the module cache is located in the .bicep directory under the current working directory. If the module being built via bicep build is also in a Git repo, the .bicep directory must be added to .gitignore.
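Adding the cache directory to .gitignore is easy to forget, so tooling could do it automatically. The Python sketch below shows one way; the function name `ensure_ignored` is hypothetical, and the check is a plain line comparison, so equivalent patterns such as `.bicep` or `/.bicep/` are not recognized as already covering the directory.

```python
from pathlib import Path


def ensure_ignored(repo_root: Path, entry: str = ".bicep/") -> bool:
    """Append entry to .gitignore if absent. Returns True if the file changed."""
    gitignore = repo_root / ".gitignore"
    existing = gitignore.read_text(encoding="utf-8") if gitignore.exists() else ""

    if entry in (line.strip() for line in existing.splitlines()):
        return False  # already ignored

    # Keep the new entry on its own line even if the file lacks a final newline.
    separator = "" if existing == "" or existing.endswith("\n") else "\n"
    gitignore.write_text(existing + separator + entry + "\n", encoding="utf-8")
    return True
```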

Q: Should we support a central path here similar to NuGet? The trade-off is more file I/O concurrency with the benefit of more disk space reuse.

bicep restore

In certain use cases (Docker and some CI systems), the restore and build operations need to be separated. This can be accomplished as follows:

  1. bicep restore
  2. bicep build --no-restore main.bicep

Git CLI dependency

The module restore process uses the git clone command to clone the Git repos. The git CLI is not included with Bicep and must be installed on the local machine and present in the PATH.

If module restore from a Git repo is attempted on a machine that does not have the Git CLI installed, the restore operation will fail and the error messages will point out the missing dependency. The language server will remain fully functional in this state.
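A minimal Python sketch of that dependency check and the clone itself might look as follows. The exception type, function names, and error wording are hypothetical. Cloning first and checking out the ref afterwards is a choice made here because `git clone --branch` accepts tags and branches but not commit IDs.

```python
import shutil
import subprocess
from pathlib import Path


class GitNotFoundError(RuntimeError):
    """Raised when the Git CLI cannot be located on PATH."""


def find_git() -> str:
    """Return the path to the git executable or raise a descriptive error."""
    git = shutil.which("git")
    if git is None:
        raise GitNotFoundError(
            "Module restore requires the Git CLI, but 'git' was not found "
            "on PATH. Install Git and try again."
        )
    return git


def clone_at_ref(clone_url: str, git_ref: str, target: Path) -> None:
    """Clone the repo into target, then check out a tag, branch, or commit ID."""
    git = find_git()
    subprocess.run([git, "clone", "--quiet", clone_url, str(target)], check=True)
    subprocess.run(
        [git, "-C", str(target), "checkout", "--quiet", git_ref], check=True
    )
```

Since the failure surfaces as an ordinary exception, a caller such as the language server can report it as a diagnostic and keep running.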

Q: Do we need an env var to point to the Git CLI location? Is there an existing one we could use?

Private repos

All of the major Git hosting providers have private repo offerings including standalone on-premises implementations for isolated/restricted networking scenarios. Git itself provides various options to help with credential management and auth to repositories.

Local repos

A local Git repo can be created in the local file system via the git init command. However, without pushing to a remote, the local repo will not be resilient to local storage failures. Given the ease with which private or public repos can be created on GitHub, there isn't a lot of value in supporting local repos. (Can revisit if user feedback indicates otherwise.)

If need be, we can support the following style of references for local repos:

module mod 'git:file://<local path>@v0.4/module-name' = {
  ...
}

Repo directory structure

One module per repo

📦root
 ┣ 📜main.bicep
 ┣ 📜main.json
 ┣ 📜metadata.json
 ┗ 📜README.md

Multiple modules per repo

📦multi
 ┣ 📂module1
 ┃ ┣ 📜main.bicep
 ┃ ┣ 📜main.json
 ┃ ┣ 📜metadata.json
 ┃ ┗ 📜README.md
 ┗ 📂module2
   ┣ 📂local-modules
   ┃ ┗ 📜somemodule.bicep
   ┣ 📜main.bicep
   ┣ 📜main.json
   ┣ 📜metadata.json
   ┗ 📜README.md

Deeper folder hierarchies will be allowed as well.
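Given these layouts, resolving a reference to a file on disk could be sketched in Python as below. It assumes, based on the trees above, that the entry point is always named main.bicep; the function name `module_entry_file` is hypothetical. The containment check is an addition of this sketch so that a module name like `../other` cannot point outside the cloned repo.

```python
from pathlib import Path
from typing import Optional


def module_entry_file(repo_root: Path, module: Optional[str]) -> Path:
    """Return the main.bicep path for a module inside a cloned repo.

    module is None for the one-module-per-repo layout; otherwise it is a
    path relative to the repo root (deeper hierarchies such as 'a/b' work).
    """
    root = repo_root.resolve()
    module_dir = (root / module).resolve() if module else root

    # Reject paths that escape the repo, e.g. '../other' or absolute paths.
    if module_dir != root and root not in module_dir.parents:
        raise ValueError(f"module path escapes the repo: {module!r}")
    return module_dir / "main.bicep"
```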

For a more in-depth discussion on file types that make up a module, see #3266.

Q: Git will not work well for sharing binaries like custom Bicep rules and analyzers. Do we need a separate mechanism for packaging those? (It may still remain viable for sharing types if they use a non-binary format.)

Q: Do we need a mechanism for ensuring the integrity of all the files? This could avoid inconsistencies between all the files in a module and also help defend against supply chain attacks.
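One possible shape for such an integrity mechanism is a single digest over every file in the module, which could then be compared against a recorded value. The Python sketch below is only an illustration of that idea, not something the issue decides on. The function name and the exact hashing scheme (SHA-256 over relative path, length, and content of each file in sorted order) are assumptions.

```python
import hashlib
from pathlib import Path


def module_digest(module_dir: Path) -> str:
    """Compute a deterministic SHA-256 digest over all files in a module."""
    entries = []
    for path in module_dir.rglob("*"):
        rel = path.relative_to(module_dir)
        if path.is_file() and ".git" not in rel.parts:  # skip Git metadata
            entries.append((rel.as_posix(), path))

    hasher = hashlib.sha256()
    for rel_name, path in sorted(entries):
        data = path.read_bytes()
        # Hash the name and length too, so renames and file boundary
        # shifts change the digest.
        hasher.update(rel_name.encode("utf-8") + b"\0")
        hasher.update(len(data).to_bytes(8, "big"))
        hasher.update(data)
    return hasher.hexdigest()
```

Any edit, rename, addition, or removal inside the module directory changes the digest, which covers the "inconsistencies between files" case. Defending against supply chain attacks would additionally require the expected digest to come from a trusted source.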

alex-frankel commented 3 years ago

Thanks @majastrz -- is there a reason we don't have a section on "Creating packages"? We have that in the NuGet and OCI equivalent issues.

majastrz commented 3 years ago

Mainly because there's no package to create. The repo is the "package". However, we will need to discuss what the module repo maintainer's workflow looks like.

majastrz commented 3 years ago

Some additional notes from various discussions:

majastrz commented 3 years ago

Closing since we decided not to use Git for now (see #2128 for more details). That said, we are not closing the door to this approach permanently and may revisit in the future to make it easier to support quick proof-of-concept types of use cases.