rkt / rkt

[Project ended] rkt is a pod-native container engine for Linux. It is composable, secure, and built on standards.
Apache License 2.0

fetch: proxy and prefix mirrors to keep fetch from hitting public internet #894

Open philips opened 9 years ago

philips commented 9 years ago

People may not want to have rkt making outbound internet requests to fetch images. So, along with the option of a local on-disk image store as described in #695 we should consider configuring fetch strategies.

My initial naive idea is that we let people specify a transparent HTTP proxy, a prefix mirror, or both.

Proxy

This is straightforward: you would tell rkt fetch, through configuration or command-line flags, that all HTTP requests for fetching and discovery should go through a proxy instead.

$ cat /etc/rkt/fetch.d/10-local-proxy
{
    "rktKind": "fetch",
    "type": "proxy",
    "proxy": "aci.gateway.corp",
    "cert": "blah blah"
}
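
A minimal sketch of what honoring such a proxy entry could look like in Go, using only the standard net/http package. The helper name newProxyClient, and the http:// scheme and port on the proxy address, are assumptions made for illustration; the "cert" field is not handled here.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
)

// newProxyClient returns an HTTP client that sends every request through
// proxyAddr. The name and signature are hypothetical, not part of rkt.
func newProxyClient(proxyAddr string) (*http.Client, error) {
	u, err := url.Parse(proxyAddr)
	if err != nil {
		return nil, err
	}
	return &http.Client{
		Transport: &http.Transport{Proxy: http.ProxyURL(u)},
	}, nil
}

func main() {
	// Assumed address: the issue only gives the host "aci.gateway.corp".
	client, err := newProxyClient("http://aci.gateway.corp:8080")
	if err != nil {
		fmt.Println("bad proxy address:", err)
		return
	}
	// Ask the transport which proxy it would pick; no request is sent.
	req, _ := http.NewRequest("GET", "https://coreos.com/etcd", nil)
	proxy, _ := client.Transport.(*http.Transport).Proxy(req)
	fmt.Println(proxy)
}
```

Both discovery and download requests would then share the same client, so neither reaches the public internet directly.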

Prefix

Another common pattern, borrowed from rpm/apt, is to keep copies on a local mirror. To enable this sort of discovery we can have a simple prefix. For example, coreos.com/etcd might be mirrored to https://mirror.corp/acis/coreos.com/etcd.

$ cat /etc/rkt/fetch.d/10-local-proxy
{
    "rktKind": "fetch",
    "type": "prefix",
    "url": "https://mirror.corp/acis/"
}
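
The prefix rule amounts to joining the mirror URL and the image name. A short Go sketch of that mapping follows; mirrorURL is a hypothetical helper, not an rkt function.

```go
package main

import (
	"fmt"
	"net/url"
	"path"
)

// mirrorURL appends imageName to the mirror prefix, tolerating a trailing
// slash on the prefix. Hypothetical helper for illustration only.
func mirrorURL(prefix, imageName string) (string, error) {
	u, err := url.Parse(prefix)
	if err != nil {
		return "", err
	}
	u.Path = path.Join(u.Path, imageName)
	return u.String(), nil
}

func main() {
	got, err := mirrorURL("https://mirror.corp/acis/", "coreos.com/etcd")
	fmt.Println(got, err)
}
```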

/cc @xiang90

xiang90 commented 9 years ago

/cc @genedna

genedna commented 9 years ago

/cc @victorwangyang

victorwangyang commented 9 years ago

/cc @xiang90 @genedna

There are a couple of key points we have taken into account; any comments are welcome.

Configuration File

We’d like to combine all configuration files into a single one such as:

{
   "storage":{
      "name":"local-volume",
      "path":"/mnt/nfs/rkt/store"
   },
   "prefixes":[
      {
         "name":"mirror-one",
         "kind":"fetch",
         "type":"prefix",
         "url":"https://mirror.corp/acis/"
      }
   ],
   "proxy":{
      "kind":"fetch",
      "type":"https",
      "proxy":"aci.gateway.corp:8080",
      "cert":"blah blah"
   }
}

Storage for image

All files related to the specific container would be saved in the same local directory on disk.

Signature:  example-1.0.0-linux-amd64.aci.asc
ACI:        example-1.0.0-linux-amd64.aci
Keys:       pubkeys.gpg

The local directory hierarchy would be organized as follows:

/mnt/nfs/rkt/store/{os}/{arch}/example-1.0.0-linux-amd64.aci.asc
/mnt/nfs/rkt/store/{os}/{arch}/example-1.0.0-linux-amd64.aci
/mnt/nfs/rkt/store/{os}/{arch}/pubkeys.gpg

Search sequence

Here is a draft of how the configuration file would be used to get the ACI:

  1. Read the storage path from the configuration file (/etc/rkt/fetch.d/10-local-proxy), then look for the required files in the local directory /mnt/nfs/rkt/store/{os}/{arch}/.
  2. If that fails, read the prefix URL and download the files from the local mirror at https://mirror.corp/acis/.
  3. If that fails as well, read the proxy (for example aci.gateway.corp:8080) from the configuration file and download the files through it.

If anything goes wrong during this process, an error or warning is raised.
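
The fallback order described above can be sketched in Go as a loop over sources that stops at the first success. The source type, stubSource, and fetchInOrder are hypothetical names; real sources would read the local directory or perform HTTP downloads instead of returning canned values.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// source is a hypothetical abstraction over one place an ACI can come from.
type source struct {
	name  string
	fetch func(image string) (string, error)
}

// stubSource returns a source that succeeds with loc, or fails when loc is empty.
func stubSource(name, loc string) source {
	return source{name: name, fetch: func(image string) (string, error) {
		if loc == "" {
			return "", errors.New("not found")
		}
		return loc, nil
	}}
}

// fetchInOrder tries each source in the given order and returns the first hit.
// If every source fails, the error lists each failure.
func fetchInOrder(image string, sources []source) (string, error) {
	var failures []string
	for _, s := range sources {
		loc, err := s.fetch(image)
		if err == nil {
			return loc, nil
		}
		failures = append(failures, s.name+": "+err.Error())
	}
	return "", fmt.Errorf("fetching %s failed: %s", image, strings.Join(failures, "; "))
}

func main() {
	sources := []source{
		stubSource("local store", ""), // simulated miss
		stubSource("prefix mirror", "https://mirror.corp/acis/example-1.0.0-linux-amd64.aci"),
		stubSource("proxy", "fetched through aci.gateway.corp:8080"),
	}
	loc, err := fetchInOrder("example", sources)
	fmt.Println(loc, err)
}
```

Keeping the order in a slice makes the priority explicit and easy to reconfigure.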

Proxy

We are not sure about the meaning of proxy in the configuration file.

{
    "rktKind": "fetch",
    "type": "proxy",
    "proxy": "aci.gateway.corp",
    "cert": "blah blah"
}

Does it mean that we should handle a couple different kinds of proxy such as http, https, etc?

xiang90 commented 9 years ago

@genedna @victorwangyang

As @philips pointed out in the last comment of https://github.com/coreos/rkt/issues/695, we are still exploring TUF (http://theupdateframework.com, https://github.com/flynn/go-tuf) for serving index of local repo.

So this issue cares more about the proxy and prefix stuff.

Search sequence

Since you have a list of proxies and prefixes, you also need to define the order of proxies to be used. Another thing I want to mention is: the current highest search priority is the CAS. We have an index from url to imageID. See https://github.com/coreos/rkt/blob/master/rkt/images.go#L283
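
To illustrate why the index takes priority, here is a small Go sketch of a URL-to-image-ID lookup that only falls back to a fetch on a miss. It is not the code at the link above; imageIndex, resolve, and the placeholder ID are assumptions for illustration.

```go
package main

import "fmt"

// imageIndex is a hypothetical stand-in for the store's URL-to-image-ID index.
type imageIndex map[string]string

// resolve returns the image ID for url. The second result reports whether the
// ID came from the index; fetch is called only on a miss.
func (idx imageIndex) resolve(url string, fetch func(string) (string, error)) (string, bool, error) {
	if id, ok := idx[url]; ok {
		return id, true, nil
	}
	id, err := fetch(url)
	if err != nil {
		return "", false, err
	}
	idx[url] = id
	return id, false, nil
}

func main() {
	idx := imageIndex{}
	fetch := func(url string) (string, error) {
		return "sha512-0123abcd", nil // placeholder ID, not a real digest
	}
	// The first call fetches and records the ID; the second is served from the index.
	for i := 0; i < 2; i++ {
		id, cached, err := idx.resolve("https://mirror.corp/acis/coreos.com/etcd", fetch)
		fmt.Println(id, cached, err)
	}
}
```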

Proxy

  1. If you put the proxy and prefix configurations into one JSON file under different keys, then you do not need the type field. It can be inferred from the key.
  2. rktKind is the rkt sub-command that will try to use the proxy/prefix configuration. @philips correct?

Does it mean that we should handle a couple different kinds of proxy such as http, https, etc?

I am not quite sure about your question. But at least in your (and @philips's) proposal, there is no way to distinguish an HTTP proxy from an HTTPS proxy.
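
One way to draw that distinction, assuming the proxy value is written as a full URL rather than a bare host (an assumption, not something the proposals above specify), is to read the kind from the URL scheme. Go's net/http transport picks the proxy type the same way. proxyKind is a hypothetical helper.

```go
package main

import (
	"fmt"
	"net/url"
)

// proxyKind reports whether raw names an HTTP or an HTTPS proxy, based on its
// URL scheme. Any other scheme is rejected. Hypothetical helper.
func proxyKind(raw string) (string, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", err
	}
	switch u.Scheme {
	case "http", "https":
		return u.Scheme, nil
	default:
		return "", fmt.Errorf("unsupported proxy scheme %q in %q", u.Scheme, raw)
	}
}

func main() {
	proxies := []string{
		"http://aci.gateway.corp:8080",
		"https://aci.gateway.corp:8080",
		"ss://aci.gateway.corp:8080", // scheme outside the accepted set
	}
	for _, p := range proxies {
		kind, err := proxyKind(p)
		fmt.Println(kind, err)
	}
}
```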

xiang90 commented 9 years ago

@genedna @victorwangyang

Jon pointed me to a more accurate explanation of rktKind and the desired configuration format. Please check https://github.com/coreos/rkt/blob/master/Documentation/configuration.md. Basically, we also need to add a rktVersion field.

Here is the code part: https://github.com/coreos/rkt/blob/master/rkt/config/config.go#L220-L223

victorwangyang commented 9 years ago

@genedna @xiang90 @x57575

Search Sequence

If there is a list of proxies and prefixes in the configuration file, we would use them in the order in which they are loaded, meaning the first proxy or prefix read is the first one tried. We do not clearly understand the CAS you mentioned above. Could you give us more details? What is the relationship between the CAS and this issue?

Configuration Modification

Following your suggestions, we simply remove the type field and add rktKind and rktVersion. Does this work?

[
   {
      "rktKind":"storage",
      "rktVersion":"v1",
      "name":"local-volume",
      "path":"/mnt/nfs/rkt/store"
   },
   {
      "rktKind":"prefix",
      "rktVersion":"v1",
      "prefixes":[
         {
            "name":"mirror-one",
            "kind":"fetch",
            "url":"https://mirror.corp/acis/"
         }
      ]
   },
   {
      "rktKind":"proxy",
      "rktVersion":"v1",
      "kind":"fetch",
      "proxy":"https://aci.gateway.corp:8080",
      "cert":"blah blah"
   }
]
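
A rough Go sketch of how a loader might read such a file, assuming the stanzas are stored as a JSON array and each one carries rktKind and rktVersion. The header type and kindsOf are hypothetical and are not rkt's actual config parser.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// header holds the two fields every stanza is expected to carry.
type header struct {
	Kind    string `json:"rktKind"`
	Version string `json:"rktVersion"`
}

// kindsOf decodes a JSON array of stanzas and returns "kind/version" for each,
// rejecting any stanza that lacks either field. Hypothetical helper.
func kindsOf(data []byte) ([]string, error) {
	var stanzas []json.RawMessage
	if err := json.Unmarshal(data, &stanzas); err != nil {
		return nil, err
	}
	var out []string
	for i, raw := range stanzas {
		var h header
		if err := json.Unmarshal(raw, &h); err != nil {
			return nil, err
		}
		if h.Kind == "" || h.Version == "" {
			return nil, fmt.Errorf("stanza %d is missing rktKind or rktVersion", i)
		}
		out = append(out, h.Kind+"/"+h.Version)
	}
	return out, nil
}

func main() {
	cfg := []byte(`[
	  {"rktKind": "storage", "rktVersion": "v1", "path": "/mnt/nfs/rkt/store"},
	  {"rktKind": "proxy", "rktVersion": "v1", "proxy": "https://aci.gateway.corp:8080"}
	]`)
	kinds, err := kindsOf(cfg)
	fmt.Println(kinds, err)
}
```

Decoding only the header first lets the loader dispatch each stanza to a kind-specific and version-specific parser afterwards.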

Proxy Type

Do we need to support proxy types other than http and https, such as shadowsocks? For example:

   {
      "rktKind":"proxy",
      "rktVersion":"v1",
      "kind":"fetch",
      "proxy":"ss://aci.gateway.corp:8080",
      "cert":"blah blah"
   }

thereallukl commented 8 years ago

Has anyone implemented this already? The issue has been open for the last 6 months, but I didn't see any info about proxy support anywhere in the docs. I am running rkt behind a corporate proxy, and it's really annoying to specify the proxy for every single container I want to run.

Thanks

jonboulle commented 8 years ago

The configuration described in this ticket has not been implemented. However, you should be able to just set the standard HTTP proxy environment variable ($HTTP_PROXY), and rkt will use that for its HTTP operations.
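
For reference, Go's standard library exposes this environment-based behavior through http.ProxyFromEnvironment, which reads HTTP_PROXY, HTTPS_PROXY, and NO_PROXY (or their lowercase forms). The sketch below only shows how that function picks a proxy; whether rkt calls it in exactly this way is an assumption, and the proxy address is hypothetical.

```go
package main

import (
	"fmt"
	"net/http"
	"os"
)

// proxyFromEnv reports which proxy Go's environment-based lookup would choose
// for rawURL. An empty string means a direct connection.
func proxyFromEnv(rawURL string) (string, error) {
	req, err := http.NewRequest("GET", rawURL, nil)
	if err != nil {
		return "", err
	}
	u, err := http.ProxyFromEnvironment(req)
	if err != nil {
		return "", err
	}
	if u == nil {
		return "", nil
	}
	return u.String(), nil
}

func main() {
	// Normally exported in the shell before running the program. Go reads the
	// environment once, on first use, so set the variables before any lookup.
	os.Setenv("HTTP_PROXY", "http://aci.gateway.corp:8080")
	os.Setenv("HTTPS_PROXY", "http://aci.gateway.corp:8080")

	proxy, err := proxyFromEnv("https://coreos.com/etcd")
	fmt.Println(proxy, err)
}
```

Note that https:// requests consult HTTPS_PROXY rather than HTTP_PROXY, so both usually need to be set.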

brianredbeard commented 8 years ago

I will respectfully point out that setting $HTTP_PROXY (which in reality should be $http_proxy, despite much of the CoreOS documentation. Thanks, CERN.) leaves out the common mirroring case of being able to rsync content and publish it. Allowing an administrator to map a name locally (not for the purposes of pulling, merely to say mysql.com/mysql-server -> int.example.com/mysql/mysql-server) lets an admin keep a consistent copy of files (which may also be cached nicely via the aforementioned proxy server) and ensures that the names still retain relevant meaning.
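
The name mapping described here could be sketched in Go as a longest-prefix rewrite table. rewriteName and the rule format are hypothetical; nothing like this exists in rkt as far as this thread shows.

```go
package main

import (
	"fmt"
	"strings"
)

// rewriteName maps a public image name to an internal one. The longest rule
// that matches the whole name or a leading path segment wins; names without a
// matching rule are returned unchanged. Hypothetical helper.
func rewriteName(name string, rules map[string]string) string {
	best := ""
	for from := range rules {
		matches := name == from || strings.HasPrefix(name, from+"/")
		if matches && len(from) > len(best) {
			best = from
		}
	}
	if best == "" {
		return name
	}
	return rules[best] + strings.TrimPrefix(name, best)
}

func main() {
	rules := map[string]string{
		"mysql.com/mysql-server": "int.example.com/mysql/mysql-server",
	}
	fmt.Println(rewriteName("mysql.com/mysql-server", rules))
	fmt.Println(rewriteName("coreos.com/etcd", rules)) // no rule, unchanged
}
```

The image keeps its public name in manifests and on the command line, while the bytes come from the internal copy.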

scruplelesswizard commented 7 years ago

This tends to be a pretty big pain point for most of our larger corporate clients, and hampers rkt/rktnetes adoption. Any chance we can get this added to an upcoming version?

GreatSUN commented 7 years ago

Also from our side this is very critical, as:

  1. we (and other big companies) reach the internet only through proxies, with no direct connection
  2. we also have registry proxies
  3. it's hard to train people to use local repos instead of internet repos
  4. a lot of problems come up when using proxy configurations in Kubernetes clusters
  5. internal customers are already going crazy because of this and want to stick with Docker, no matter what that means (too much overhead for the developers)

So please, implement this asap if somehow possible.

Thanks, Stefan

philips commented 7 years ago

@chaosaffe @GreatSUN rkt should support HTTP_PROXY; see https://gist.github.com/philips/4a0486ca563413fe2a0d60daa1daf667

We do need to plumb this through the kubelet integration though.

lucab commented 7 years ago

https://github.com/coreos/rkt/pull/3303 also fixed some issues in docker-fetching over a proxy. This is going to land in 0.18.0.

TerraTech commented 7 years ago

@philips Will this give rkt fetch the same type of behavior that docker provides with: docker daemon --registry-mirror=http://mirror.example.com:5000

We run an internal registry mirror, mostly because rktnetes is a bit aggressive with its downloads (especially those tagged 'latest'), so we cache as much as we can in an effort to be good netcitizens.

I tried to rkt fetch from our internal v2-mirror (docker://registry:2) service, but it doesn't work right:

# rkt fetch --insecure-options=image coreos.com/etcd,version=v2.3.1
image: searching for app image coreos.com/etcd
fetch: discovery failed

# rkt fetch --insecure-options=image docker://busybox:latest
image: remote fetching from URL "docker://busybox:latest"
fetch: Get https://registry-1.docker.io/v2: Moved Permanently

Is there currently a way to use an internal docker registry mirror? https://docs.docker.com/v1.6/articles/registry_mirror/

xiemeilong commented 7 years ago

quay.io is throttled by the GFW in China, so fetch is very slow. We need to be able to set a proxy.