Install chaincode tgz packages via network URLs

hyperledger / fabric

Install chaincode tgz packages via network URLs #3409

Closed jkneubuh closed 2 years ago

jkneubuh commented 2 years ago

When the peer CLI installs a chaincode package, it must reference a .tgz archive from the local file system.

This can be an onerous burden for Fabric admins, as the CC package is often crafted by hand or via scripted incantations of tar / gzip / etc. to prepare the code.tar.gz, metadata.json, and package.tgz. While the peer CLI provides some support for preparing the CC package, this feature is generally unused, as the package typically requires some level of customization relevant to the target deployment. (See #3368 for an example of this in the context of preparing a CCaaS package archive with peer lifecycle chaincode install ...) In practice, Fabric admins and developers typically "hand roll" the CC package archive.

Improve this scheme by teaching the peer CLI to read chaincode packages from network URLs. This will allow network / dev / CI pipelines / etc. to construct chaincode packages, publish them to a server, and install the CC package from a network URL.

Update the peer CLI such that it has the ability to read chaincode packages from remote URLs.

A very good library supporting this feature is hashicorp/go-getter.

Out of the box, go-getter supports the following protocols. Additional protocols can be added at runtime by implementing the Getter interface.

Local files

Git

Mercurial

HTTP

Amazon S3

Google GCP

Teaching the peer CLI to transfer CC packages via go-getter enables a natural release, build, and promotion strategy. An example of a published chaincode package, including Semantic Release tags, git, and a GitHub pipeline to construct a .tgz can be found at the conga-nft-contract release archive.

Using go-getter URLs, the contract can easily be installed to a network directly via a git:// or https:// URL, e.g.:

peer lifecycle chaincode install git@github.com:hyperledgendary/conga-nft-contract.git/releases/tag/v0.1.0/conga-nft-contract-v0.1.0.tgz

jkneubuh commented 2 years ago

cc: @SamYuan1990 @jt-nti @denyeart @mbwhite @lehors

SamYuan1990 commented 2 years ago

When and where does the tar file download happen? If I install a chaincode tar.gz on peer1 and peer2, how many times is it downloaded?

If I am correct, the download happens where we execute the peer CLI. Is there any design to avoid downloading the same tar file twice?

jkneubuh commented 2 years ago

The package transfer will occur within the peer CLI during "chaincode install" phase:

user runs "peer cc install" URL
peer CLI opens a byte[] stream with go-getter
go-getter retrieves URL contents of tar.gz, spooling to temp
peer CLI consumes bytes from go-getter stream (spooled from temp) and relays to the peer node

Good question about caching. I am not sure if go-getter respects cache-control headers (it may already support this). That said, the chaincode packages are relatively small. Sam, are you anticipating environments or cases where the network download of the chaincode package is an issue? If so, note that the peer CLI will still be able to read from a local file:// URL, and the package can be pre-staged with curl / git / etc. ...

SamYuan1990 commented 2 years ago

Consider this case: Alice is installing a chaincode from github....(url) at the latest version while Bob merges a PR. The chaincode uploaded to peer A can then differ from the chaincode uploaded to peer B, because one download happened before the merge and the other after it.

jkneubuh commented 2 years ago

I took a look at hashicorp go-getter for a mechanism to resolve the chaincode package archives from "some" network. Go-getter is a bit of a "kitchen sink" in this regard - using the library is going to cause chaos for resolving / scanning / approving / linking in the dependencies... in this case it's probably overkill.

A compromise here is to use the vanilla http libs available in Golang. If the input argument parses correctly as an http/https URL - stage it locally to the file system before reading through the FilesystemIO.ReadFile() routine.

jkneubuh commented 2 years ago

Consider this case: Alice is installing a chaincode from github....(url) at the latest version while Bob merges a PR. The chaincode uploaded to peer A can then differ from the chaincode uploaded to peer B, because one download happened before the merge and the other after it.

@SamYuan1990 there are some good discussions on how to "pin" a chaincode as a service package to a particular "build" of the chaincode going in fabric-builder-k8s.

The proposed solution to support the Alice / Bob scenario is to publish a chaincode package referencing an image by Container Registry Digest:

To make sure the Pod always uses the same version of a container image, you can specify the image's digest; replace <image-name>:<tag> with <image-name>@<digest> (for example, image@sha256:45b23dee08af5e43a7fea6c4cf9c25ccf269ee113168c19722f87876677c5cb2).

When using image tags, if the image registry were to change the code that the tag on that image represents, you might end up with a mix of Pods running the old and new code. An image digest uniquely identifies a specific version of the image, so Kubernetes runs the same code every time it starts a container with that image name and digest specified. Specifying an image by digest fixes the code that you run so that a change at the registry cannot lead to that mix of versions.

When a chaincode-as-a-service package references an image using the <container-registry>/<image-path> @ <image-digest> syntax, the reference is immutable and maps 1 : 1 to the chaincode build.
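As an illustration, a pipeline could reject image references that are not pinned by digest before a package is built. The sketch below is an assumption-laden example, not Fabric or Kubernetes code: the function name isPinnedByDigest is hypothetical, and the pattern only accepts sha256 digests written as 64 lowercase hex characters.

```go
package main

import (
	"fmt"
	"regexp"
)

// digestRef matches references of the form <name>@sha256:<64 hex chars>.
// Assumption: only sha256 digests are accepted in this sketch.
var digestRef = regexp.MustCompile(`^[^@\s]+@sha256:[a-f0-9]{64}$`)

// isPinnedByDigest (hypothetical helper) reports whether ref names an image
// by digest rather than by a mutable tag.
func isPinnedByDigest(ref string) bool {
	return digestRef.MatchString(ref)
}

func main() {
	refs := []string{
		"image@sha256:45b23dee08af5e43a7fea6c4cf9c25ccf269ee113168c19722f87876677c5cb2",
		"image_reg/image_path:image_tag",
		"image:latest",
	}
	for _, ref := range refs {
		fmt.Printf("%-5t %s\n", isPinnedByDigest(ref), ref)
	}
}
```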

It is NOT a good practice to have PR / Git-Ops / Dev-Ops / CI-Ops / etc. deploy from the latest code line.

SamYuan1990 commented 2 years ago

I agree that deploying from latest is not a good practice. If we are going to use the image hash from OCI, or the chaincode hash, for verification, then I am fine. That means we are going to verify the image hash / chaincode hash, right?

Even if we don't use latest, a common case is image_reg/image_path:image_tag instead of an image digest, and that is the case I am worried about.
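The kind of hash check being asked about can be sketched as follows: compute the SHA-256 of the package bytes and compare it with a value pinned ahead of time, so that a package downloaded after a merge no longer matches. The function names packageHash and verifyPackage are hypothetical, and the byte slices are placeholders for real package contents.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// packageHash returns the hex-encoded SHA-256 of the package bytes.
func packageHash(pkg []byte) string {
	sum := sha256.Sum256(pkg)
	return hex.EncodeToString(sum[:])
}

// verifyPackage (hypothetical helper) fails when the package bytes do not
// match the hash that was pinned when the package was published.
func verifyPackage(pkg []byte, expectedHex string) error {
	if got := packageHash(pkg); got != expectedHex {
		return fmt.Errorf("package hash mismatch: got %s, want %s", got, expectedHex)
	}
	return nil
}

func main() {
	// Placeholder contents for the Alice / Bob scenario.
	before := []byte("package built before the merge")
	after := []byte("package built after the merge")

	pinned := packageHash(before)
	fmt.Println("before merge:", verifyPackage(before, pinned))
	fmt.Println("after merge: ", verifyPackage(after, pinned))
}
```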

SamYuan1990 commented 2 years ago

I suppose that covers what we are going to discuss about deploying an image with a specific tag, as you mentioned.

To make sure the Pod always uses the same version of a container image, you can specify the image's digest; replace <image-name>:<tag> with <image-name>@<digest> (for example, image@sha256:45b23dee08af5e43a7fea6c4cf9c25ccf269ee113168c19722f87876677c5cb2). When using image tags, if the image registry were to change the code that the tag on that image represents, you might end up with a mix of Pods running the old and new code. An image digest uniquely identifies a specific version of the image, so Kubernetes runs the same code every time it starts a container with that image name and digest specified. Specifying an image by digest fixes the code that you run so that a change at the registry cannot lead to that mix of versions.

To avoid the tag issue, suppose we ask users to deploy images with a specific digest. What are we going to do in practice? Are we going to handle the image tag case automatically, or do we require users to use a digest?

jkneubuh commented 2 years ago

I set up a peer build with the above approach (read the CC package from http / https URLs) using the stock golang routines (no go-getter), tried it in practice with the new k8s / cc package builder, and honestly I do NOT like the new feature request.

When the chaincode package declares the image@digest, it now correctly and uniquely identifies an immutable revision of the code that will be executed on the blockchain. Adding the network URL in the equation is convenient, but opens up an odd circumstance where the administrator is not really sure... what is actually being submitted to the chain. For instance, this chaincode installation is "really convenient" -- but what does it actually do?

peer lifecycle chaincode install https://rroll.to/hjXVej

Note that the URL above will dynamically "rick roll" a response with a 50% likelihood, or redirect to yahoo.com. What does it mean to install an immutable chaincode pointer from a dynamic service?

It means the approach is a "no go getter."

Overall, the BIG improvement in the CC process is to publish the chaincode package, constrained to immutable image@digests, and launch with external builders. Saving an admin a "one liner" to download a package is not a huge time saver.

I'm going to close this for now -- we can pull it off the shelf if this ever makes sense in the context of cloud functions, CCaaS, etc.

jkneubuh commented 2 years ago

@SamYuan1990 we had a lot of "really lively debate" about how to associate CC images with CC packages. The outcome / conclusions were:

CC packages MUST be declared using the image @ DIGEST syntax.
For application / chaincode development, a CI / CD pipeline (GitHub Actions, Tekton, Argo, Azure Pipelines, JenkinsX, etc. etc.) can be used to infer the Docker Image Digest and construct the CC package. See conga-nft-contract for an example of this in practice as a GitHub action.
Additional customization (full flexibility) will be implemented by development of "external builders"

SamYuan1990 commented 2 years ago

CC packages MUST be declared using the image @ DIGEST syntax.

LGTM as a decision.