docker-archive / docker-registry

This is **DEPRECATED**! Please go to https://github.com/docker/distribution

docker client times out pulling large fs layer from S3-backed registry #956

Closed: chaiken-verticloud closed this issue 9 years ago

chaiken-verticloud commented 9 years ago

At Altiscale, we have been running a standalone docker registry backed by S3. The registry works fine with STORAGE_REDIRECT=true. We're tightening up registry security and bandwidth usage by setting STORAGE_REDIRECT=false, but our docker clients now time out when fetching large fs layers.

Here's the scenario:

[docker_client_host]$ docker pull reghost.altiscale.com/container_name
Pulling repository reghost.altiscale.com/container_name
921a3438ca3d: Pulling image (latest) from reghost.altiscale.com/container_name
921a3438ca3d: Pulling image (latest) from reghost.altiscale.com/container_name, endpoint: https://reghost.altiscale.com/v1/ 
921a3438ca3d: Pulling metadata 
921a3438ca3d: Pulling fs layer 
921a3438ca3d: Error pulling image (latest) from reghost.altiscale.com/container_name, Get https://reghost.altiscale.com/v1/images/921a3438ca3d58320b54f1117a992b5ce2c2f7305ddb354250364a19fb3d37e6/layer: read tcp ip_address_of_reghost:443: i/o timeout 

The timeout happens while the registry's S3 driver is prefetching the layer through ParallelKey. Here are the relevant excerpts from boto.py:

    # stream_read(): large objects get wrapped in ParallelKey
    if not bytes_range and key.size > 1024 * 1024:
        # Use the parallel key only if the key size is > 1MB
        # And if bytes_range is not enabled (since ParallelKey is already
        # using bytes range)
        key = ParallelKey(key)

    # ParallelKey spools the whole object into a local temporary file
    self._tmpfile = tempfile.NamedTemporaryFile(mode='rb')

    # ParallelKey._fetch_part(): each download thread fetches one byte range
    # into that temporary file
    with open(fname, 'wb') as f:
        f.seek(min_cur)
        brange = 'bytes={0}-{1}'.format(min_cur, max_cur)
        boto_key.get_contents_to_file(f, headers={'Range': brange})
        boto_key.close()
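As far as I can tell, the stall happens because a ParallelKey-style read only returns bytes that one of the download threads has already finished writing into that temporary file, so nothing reaches the client until the first segment completes. Here is a rough illustration of that pattern; this is a simplified sketch, and the class and method names below are mine, not the actual boto.py code:

    # Illustrative sketch only, NOT the actual boto.py implementation: the
    # client-facing read blocks until a worker thread has completed bytes past
    # the current cursor, which is where the docker client times out.
    import threading

    class BlockingParallelRead(object):
        def __init__(self, tmpfile, total_size):
            self._tmpfile = tmpfile
            self._total_size = total_size
            self._cursor = 0
            self._max_completed_byte = 0
            self._cond = threading.Condition()

        def refresh_max_completed_byte(self, new_max):
            # Called by a download thread when it finishes its byte range.
            with self._cond:
                self._max_completed_byte = max(self._max_completed_byte, new_max)
                self._cond.notify_all()

        def read(self, size):
            with self._cond:
                # With large per-thread segments, this wait easily exceeds the
                # docker client's read timeout.
                while (self._cursor >= self._max_completed_byte
                       and self._cursor < self._total_size):
                    self._cond.wait()
            if self._cursor >= self._total_size:
                return b''
            self._tmpfile.seek(self._cursor)
            buf = self._tmpfile.read(min(size, self._max_completed_byte - self._cursor))
            self._cursor += len(buf)
            return buf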

I have confirmed the same behavior using curl directly on the docker registry host. The curl command below demonstrates it, except that curl does not time out, so the download eventually completes. When running the command, curl does not download the first byte of the file until the first of the 5 docker registry threads calls ParallelKey._refresh_max_completed_byte; once that first thread completes, curl starts downloading the layer.

[reghost.altiscale.com]$ curl -v -O http://localhost:5000/v1/images/921a3438ca3d58320b54f1117a992b5ce2c2f7305ddb354250364a19fb3d37e6/layer

This curl command uses the Range header to disable ParallelKey. This header makes the if not bytes_range conditional in the code above false, so that ParallelKey does not run. In this case, curl starts downloading bytes immediately.

[reghost.altiscale.com]$  curl -v -O -H 'Range: bytes=0-' http://localhost:5000/v1/images/921a3438ca3d58320b54f1117a992b5ce2c2f7305ddb354250364a19fb3d37e6/layer
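For reference, another way to measure the delay is to time how long the registry takes to send the first byte with and without a Range header. This is only a test sketch using the Python requests library; it assumes the registry is listening on localhost:5000, as in the curl commands above:

    # Time-to-first-byte check against the local registry (sketch; the layer ID
    # is the one from this issue and localhost:5000 matches the curl examples).
    import time
    import requests

    LAYER_URL = ('http://localhost:5000/v1/images/'
                 '921a3438ca3d58320b54f1117a992b5ce2c2f7305ddb354250364a19fb3d37e6/layer')

    def time_to_first_byte(headers=None):
        start = time.time()
        resp = requests.get(LAYER_URL, headers=headers, stream=True)
        next(resp.iter_content(chunk_size=8192))  # blocks until data arrives
        resp.close()
        return time.time() - start

    # Default request goes through ParallelKey: the first byte is delayed until
    # the first worker thread finishes its segment of the S3 object.
    print('no Range header: %.1fs' % time_to_first_byte())

    # Any Range header makes bytes_range truthy and skips ParallelKey, so bytes
    # start flowing immediately.
    print('Range: bytes=0- : %.1fs' % time_to_first_byte({'Range': 'bytes=0-'}))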

Here are some potential solutions for the problem that I would be happy to contribute as code:

  1. Modify boto.py:ParallelKey._fetch_part so that it limits the size of any individual transfer and therefore calls ParallelKey._refresh_max_completed_byte sooner. This change would be pretty easy, but each thread would still work on its own segment of the file, so the client's streaming transfer would be rate-limited by single-thread transfer performance.
  2. Modify boto.py:ParallelKey so that each thread picks up a new range of bytes as close to the beginning of the object as possible, without overlapping the ranges being transferred by other threads. This change would be more substantial (and harder to get right), but all of the threads would contribute to the rate at which the client can transfer.
  3. Modify the docker-registry configuration code to allow the operator to disable ParallelKey entirely, or to disable it when an S3 object exceeds a configured size (see the sketch after this list).
  4. Remove the ParallelKey code from boto.py.
  5. Provide the ability to configure the docker client with a longer timeout. This approach would eliminate the need to modify docker-registry code, but it would not improve client performance, since the client still could not stream the transfer in parallel with the docker registry's transfer from S3.
  6. Provide the ability for the docker client to disable ParallelKey by passing the Range HTTP header to the docker registry. This approach would work with the current docker registry code, but seems like it's too much of a hack because it depends on the internal implementation of boto.py:Base.stream_read.
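To make option 3 concrete, here is a rough sketch of gating the ParallelKey decision behind operator configuration. The setting names are made up for illustration and do not exist in docker-registry today:

    # Sketch for option 3 (illustrative setting names, not existing
    # docker-registry configuration): only use ParallelKey when the operator
    # has enabled it and the object is not too large to prefetch.
    def maybe_parallelize(key, bytes_range, use_parallel_key=False,
                          parallel_key_max_size=None):
        if bytes_range or not use_parallel_key:
            return key
        if key.size <= 1024 * 1024:
            # Same threshold as the existing code: small objects stream directly.
            return key
        if parallel_key_max_size is not None and key.size > parallel_key_max_size:
            # Very large layers stream directly so the client gets bytes right away.
            return key
        return ParallelKey(key)

stream_read would then call this helper in place of the bare size check, with the two settings read from the registry configuration.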

Any thoughts on the best way to proceed?

...or maybe I missed an obvious fix? Please let me know if there is a straightforward way to get a standalone registry working with STORAGE_REDIRECT=false and large fs layers stored as S3 objects!

dmp42 commented 9 years ago

Ok, that sounds bad enough.

Given that development effort is now focused on https://github.com/docker/distribution (the new golang implementation), little work is going to be put into the python registry here.

My suggestion is to either remove the parallel key code entirely or, better, make it configurable (and disabled by default).

If you are willing to PR this, that would be awesome...

chaiken-verticloud commented 9 years ago

Thanks for the quick response, Olivier! I also noticed issues with tmp files not being cleaned up, and saw that others had already reported similar problems (like issue #954). It seems like the best thing to do at this point is to simplify the code by removing the ParallelKey class, rather than tracking down the corner cases that lead to orphaned tmpfiles. I'll take care of removing the code and submit a pull request.
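Roughly, with ParallelKey gone, the read path just streams the boto key to the client in fixed-size chunks. Here is a sketch of the idea (not the exact diff I will submit; the function name and buffer size are illustrative):

    # Sketch of the read path once ParallelKey is removed: stream the boto key
    # straight to the client in fixed-size chunks, so data flows as soon as S3
    # starts returning it.
    def stream_read_chunks(key, buffer_size=128 * 1024):
        while True:
            buf = key.read(buffer_size)  # boto Key objects support file-like read()
            if not buf:
                break
            yield buf
        key.close()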

chaiken-verticloud commented 9 years ago

Here's a draft of the code, which passes flake8 and seems to work on my test instance: https://github.com/docker/docker-registry/compare/master...Altiscale:dc_OPS-5907

I'll do a bit more testing, and then submit the pull request.

dmp42 commented 9 years ago

Fantastic. Give me a ping as soon as you have a PR, I'll make sure it's reviewed quickly.

chaiken-verticloud commented 9 years ago

I'm moving my performance test results into this document...

https://docs.google.com/spreadsheets/d/1VbdtRL8w8eAf0sKse1qk7vLmd-J10tIiyeUqCTjSfL4/edit?usp=sharing

tl;dr: The pull request fixes the problem and doesn't otherwise change performance. As expected, STORAGE_REDIRECT=true provides better bandwidth, and therefore better latency, for multiple simultaneous clients. For a single client, performance mostly depends on the networking between the client and the registry rather than on the code running on the registry.

dmp42 commented 9 years ago

Thanks a lot for your work on this!