dragonflyoss / Dragonfly

This repository has been archived and moved to the new repository https://github.com/dragonflyoss/Dragonfly2.
https://d7y.io
Apache License 2.0

Proposal: Integrate trusted cloud-native registry Harbor with Dragonfly to provide a joint image management and distribution solution to support containerized environments #108

Open steven-zou opened 6 years ago

steven-zou commented 6 years ago

STATUS: [INPROGRESS]

Integrate trusted cloud-native registry Harbor with Dragonfly to provide a joint image management and distribution solution to support containerized environments.

Background:

Harbor: Project Harbor is an open source trusted cloud-native registry project that stores, signs, and scans content. Harbor extends the open source Docker Distribution by adding the functionalities usually required by users such as security, identity, and management. Having a registry closer to the build and run environment can improve the image transfer efficiency. Harbor supports replication of images between registries and also offers advanced security features such as user management, access control, and activity auditing. For more details, please refer to README.

Dragonfly: Dragonfly is an intelligent P2P-based file distribution system. It aims to resolve the low efficiency, low success rate, and wasted network bandwidth of file transfers, especially in large-scale file distribution scenarios such as application distribution, cache distribution, log distribution, and image distribution. For more details, please refer to README.

Motivations:

With the emergence and development of Kubernetes, it is becoming possible to run and operate large-scale containerized applications and services in enterprise environments. Meanwhile, a big challenge remains that cannot be ignored: how to securely and effectively manage the many container images produced in enterprise organizations, and how to distribute them to large-scale runtimes with less time and effort when starting applications or services on demand. To address this challenge, we should build a joint solution from the open source trusted cloud-native registry Harbor and the open source intelligent P2P-based file distribution system Dragonfly.

These two open source projects have clearly complementary advantages. The joint solution would expand the scenarios covered by image lifecycle management and improve security, reliability, and efficiency.

Idea:

Basic Workflow:

harbor dragonfly

Architecture:

An architecture design based on the above draft idea: dragonfly h

The components with a light blue background are the new ones that need to be implemented.

Followups:

lowzj commented 6 years ago

Really looking forward to the integration of Dragonfly and Harbor.

In order to complete the task of publishing images to the SuperNode, Dragonfly's internal workflow is:

  1. Gets the image publishing task from the Image Distribution Driver.
  2. Gets the URLs of all layers of the image from the container image registry (Harbor Registry).
  3. Sends tasks to all SuperNodes of Dragonfly to pre-download the image's layers, and periodically records the task status so it can be queried.
  4. Then, if anyone wants to pull that image, Dragonfly helps them download it layer by layer via the P2P network established by the SuperNodes.

So there are some features that need to be completed in Dragonfly:

And some things that need to be confirmed:

lowzj commented 6 years ago

I drew a new architecture design diagram that adds the new Dragonfly components that need to be implemented. image

steven-zou commented 6 years ago

I think the following diagram is also a good reference for image distribution: dfget-combine-container

lowzj commented 6 years ago

Todo List of Dragonfly

lowzj commented 6 years ago

08.15: the specification of the API between Dragonfly and Harbor

document: https://github.com/alibaba/Dragonfly/blob/master/docs/en/preheat.md

steven-zou commented 6 years ago

@lowzj

Regarding the /api/preheat API, could we make it compatible with the registry API? Harbor is a registry, which means a client can use the standard Docker client or the registry API to get the image content.

It seems the designed API needs a new Harbor API to do that. If I'm mistaken, please correct me.

https://docs.docker.com/registry/spec/api/#pulling-an-image

lowzj commented 6 years ago

@steven-zou

There is no need to create a new Harbor API. First, Dragonfly assembles the manifest URL according to the registry API spec and the url param of /api/preheat, and fetches the manifest to get the URLs of all image layers. Then Dragonfly downloads all the layers from Harbor.
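The two steps described above can be sketched as follows. This is only an illustration, not Dragonfly's actual code: the function names and the host `harbor.example.com` are hypothetical, and the manifest is assumed to be in the Docker image manifest v2, schema 2 layout, where each entry of `layers` carries a `digest`. The URL shapes follow the registry HTTP API v2 (`/v2/<name>/manifests/<reference>` and `/v2/<name>/blobs/<digest>`).

```python
from typing import Any, Dict, List


def manifest_url(registry: str, repository: str, reference: str) -> str:
    """Build the registry API v2 URL of an image manifest."""
    return f"https://{registry}/v2/{repository}/manifests/{reference}"


def layer_urls(registry: str, repository: str, manifest: Dict[str, Any]) -> List[str]:
    """Derive one blob URL per layer from an already fetched manifest.

    Assumes a schema 2 manifest; other manifest formats are not handled.
    """
    return [
        f"https://{registry}/v2/{repository}/blobs/{layer['digest']}"
        for layer in manifest.get("layers", [])
    ]


# Hypothetical manifest, trimmed to the fields used here.
example_manifest = {
    "schemaVersion": 2,
    "layers": [{"digest": "sha256:aaa"}, {"digest": "sha256:bbb"}],
}
urls = layer_urls("harbor.example.com", "library/redis", example_manifest)
```

Fetching the manifest itself is left out on purpose, since it depends on how authentication is passed along.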

steven-zou commented 6 years ago

@lowzj

So, I think the following example should be ok. One question, why is the header an array? Why not use a map?

{
  "type": "image",
  "url": "https://<harbor_hostname>/v2/library/redis/manifests/latest",
  "header": ["Authorization: Bearer <TOKEN>"]
}

lowzj commented 6 years ago

@steven-zou

> I think the following example should be ok.

I think the url could be image url: <harbor_host>/<image_name>:<image_tag>. And the internal steps of Dragonfly could be:

But I'm not sure whether the header is enough for authentication in these steps.

> why is the header an array? Why not use a map?

There may be multiple message-header fields with the same field-name in HTTP headers. If a map is used, these header fields have to be combined into one, with the field-values separated by commas, like this: "field-name: field-value1, field-value2,...".

Using a map may be more convenient than using an array in practice, and multiple fields with the same field-name are not recommended anyway. I will change the type of header to map.
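A minimal sketch of the array-to-map conversion described above, assuming each array entry is a `"name: value"` string as in the earlier example. The function name is hypothetical, and header names are compared exactly as written here, whereas HTTP treats them as case-insensitive.

```python
from typing import Dict, List


def headers_to_map(header_lines: List[str]) -> Dict[str, str]:
    """Fold "name: value" strings into a map, joining repeated names with commas."""
    merged: Dict[str, str] = {}
    for line in header_lines:
        # Split on the first colon only, so values may contain colons.
        name, _, value = line.partition(":")
        key = name.strip()
        value = value.strip()
        if key in merged:
            merged[key] = merged[key] + ", " + value
        else:
            merged[key] = value
    return merged


combined = headers_to_map([
    "Accept: application/json",
    "Accept: text/plain",
    "Authorization: Bearer <TOKEN>",
])
```

Comma-joining is not safe for every header (`Set-Cookie` is the usual exception), which is one reason repeated field-names are best avoided.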

steven-zou commented 6 years ago

@lowzj

Ok, got it. But please be aware that the <image_name> in harbor has a prefix of a project name like library/redis. You need to take care of that.
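To illustrate the point about the project prefix, here is a small sketch that splits an image URL of the form `<harbor_host>/<image_name>:<image_tag>` so that the project name stays part of the repository. The function name and example host are hypothetical, falling back to `latest` when no tag is given is an assumption borrowed from the usual Docker convention, and digest references (`@sha256:...`) are not handled.

```python
from typing import Tuple


def parse_image_url(image_url: str) -> Tuple[str, str, str]:
    """Split "<host>/<project>/<name>:<tag>" into (host, repository, tag)."""
    # Everything before the first slash is the host, possibly with a port.
    host, _, remainder = image_url.partition("/")
    # The tag follows the last colon of the remainder, if there is one.
    repository, separator, tag = remainder.rpartition(":")
    if not separator:
        # No tag given: assume "latest" (an assumption, see the note above).
        repository, tag = remainder, "latest"
    return host, repository, tag


host, repository, tag = parse_image_url("harbor.example.com:8443/library/redis:5.0")
```

The repository comes back as `library/redis`, so the project prefix ends up in the `/v2/<repository>/manifests/<tag>` path unchanged.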

gitzl commented 5 years ago

Does the Dragonfly integration with Harbor support HTTPS, and does it work now? If we want to modify the source code to support HTTPS, what work would be needed and what should we pay attention to? Thank you very much.

perriea commented 5 years ago

An update on this issue?

allencloud commented 5 years ago

> An update on this issue?

It works with Dragonfly, since Dragonfly already supports the preheat API. @perriea

We have worked out a demo of the Harbor and Dragonfly integration, but I am not sure whether the Harbor side has a plan to release the work. @steven-zou

datavisoryushuzhang commented 5 years ago

Does the preheat API support private Harbor images? (which need docker login)

allencloud commented 5 years ago

> Does the preheat API support private Harbor images? (which need docker login)

Yes, it can. You can add the login credentials in the headers. @datavisoryushuzhang
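As a rough sketch of what such a request body could look like, based only on the example earlier in this thread: the `type`, `url`, and `header` fields come from that example, the map form of `header` follows the change proposed above, and the function name, host, and token value are hypothetical. The preheat document linked earlier is the authority on the real field names.

```python
import json


def build_preheat_body(manifest_url: str, token: str) -> str:
    """Serialize a preheat request body that carries registry credentials."""
    body = {
        "type": "image",
        "url": manifest_url,
        # Credentials travel as an ordinary HTTP header (map form assumed).
        "header": {"Authorization": f"Bearer {token}"},
    }
    return json.dumps(body, indent=2)


payload = build_preheat_body(
    "https://harbor.example.com/v2/library/redis/manifests/latest",
    "<TOKEN>",
)
```

The resulting string would be sent as the body of a request to /api/preheat. How the token is obtained from Harbor is outside this sketch.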

jessiezhou0424 commented 5 years ago

When will the Dragonfly + Harbor demo be released? Can't wait to use it!