harbor 1.2.0rc1 cannot push images - HTTP 500 internal error

choman commented 7 years ago

If you are reporting a problem, please make sure the following information is provided: 1) Version of docker engine and docker-compose.

rhel 7.2
Docker version 17.06.1-ce, build 874a737
docker-compose version 1.9.0dev, build f5ad3e7
harbor 1.2.0rc1

2) Config files of harbor, you can get them by packaging "harbor.cfg" and files in the same directory, including subdirectories. Located at: https://gist.github.com/8f6be32e626e6e59c4a430eb98f492f3

3) Log files, you can get them by packaging /var/log/harbor/. Located at: https://gist.github.com/828d362973e1b791c0c8f8512b08fbb9

If I do the following, everything works:

docker login reg.local.net
docker pull hello-world
docker tag hello-world reg.local.net/test2/hw
docker push reg.local.net/test2/hw

If I do the following, I receive an HTTP 500:

docker login reg.local.net
docker pull ubuntu
docker tag ubuntu reg.local.net/test2/ubuntu
docker push reg.local.net/test2/ubuntu

The push refers to a repository [reg.local.net/l3/ubuntu]
a09947e71dc0: Layer already exists
9c42c2077cde: Layer already exists
625c7a2a783b: Layer already exists
25e0901a71b8: Layer already exists
8aa4fcad5eeb: Pushing 120.1MB/120.1MB    <-- Retries a few times
received unexpected HTTP status: 500 Internal Server Error

reasonerjt commented 7 years ago

Please provide log files under /var/log/harbor/

choman commented 7 years ago

They should be in the gist above.

choman commented 7 years ago

Log files

adminserver.log.txt jobservice.log.txt mysql.log.txt proxy.log.txt registry.log.txt ui.log.txt

reasonerjt commented 7 years ago

See this in proxy.log:

Aug 22 17:50:53 172.18.0.1 proxy[3632]: 2017/08/22 22:50:53 [crit] 5#0: *80 open() "/etc//nginx/client_body_temp/0000000001" failed (13: Permission denied), client: 192.168.1.215, server: , request: "PATCH /v2/lm/bash/blobs/uploads/4f0291be-dd3c-4ce3-aee4-88d5897d29ae?_state=tl8v61b0mUnffnkskmxshL5wG5N0ujtI5--nLC1n1pR7Ik5hbWUiOiJsbS9iYXNoIiwiVVVJRCI6IjRmMDI5MWJlLWRkM2MtNGNlMy1hZWU0LTg4ZDU4OTdkMjlhZSIsIk9mZnNldCI6MCwiU3RhcnRlZEF0IjoiMjAxNy0wOC0yMlQyMjo1MDo1Mi45NjUyOTE2MjhaIn0%3D HTTP/1.1", host: "reg2.local.net"
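For anyone checking their own logs for the same failure, here is a minimal sketch of a filter for entries like the one above. The function name find_temp_perm_errors is made up for illustration, and the log path in the usage comment is an assumption based on the /var/log/harbor/ directory and proxy.log file mentioned in this thread.

```shell
#!/bin/sh
# Hypothetical helper: print nginx "[crit] open() ... Permission denied"
# entries that refer to one of the *_temp directories.
# Usage (path is an assumption): find_temp_perm_errors /var/log/harbor/proxy.log
find_temp_perm_errors() {
    log_file="$1"
    # Matches e.g. open() "/etc//nginx/client_body_temp/0000000001" failed (13: Permission denied)
    grep -E 'open\(\) "[^"]*_temp/[^"]*" failed \(13: Permission denied\)' "$log_file"
}
```

As far as I know, nginx only spools a request body to client_body_temp when it is larger than its in-memory buffer (client_body_buffer_size), which would likely explain why the tiny hello-world layers go through while larger layers fail.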

reasonerjt commented 7 years ago

Looks like a dup of #3062? @yixingjia

choman commented 7 years ago

I'm not an nginx guru by any stretch, but running off what @reasonerjt found, I googled https://www.google.com/search?client=ubuntu&channel=fs&q=+nginx++upload+retries++client_body_temp+failed&ie=utf-8&oe=utf-8

and saw these two references.

The first one points out that client_body_temp must be owned by the process. The second points to a fix that works, but creates a /tmp/nginx path for client_body_temp. My concern is with it going to /tmp: would there be enough space for large images?
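One rough way to answer the space question is to compare the free space on the filesystem holding the temp path with the largest layer being pushed. This is only a sketch: has_room_for_layer is a hypothetical name, the size has to be given in whole megabytes, and since /tmp/nginx would live inside the nginx container, the check only makes sense when run there (for example through docker exec).

```shell
#!/bin/sh
# Hypothetical helper: succeed if DIR has at least LAYER_MB megabytes free.
# Usage: has_room_for_layer /tmp 121   (121 ~ the 120.1MB layer in the push output)
has_room_for_layer() {
    dir="$1"
    layer_mb="$2"
    # POSIX df: -P gives one line per filesystem, -k reports 1024-byte blocks;
    # the 4th column of the second line is the available space.
    avail_kb=$(df -Pk "$dir" | awk 'NR==2 {print $4}')
    [ "$avail_kb" -ge $((layer_mb * 1024)) ]
}
```

This does not account for several pushes running at the same time, each of which may need its own temp file.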

Thoughts on this fix?

BTW, I applied this fix in the nginx container, above the first reference to client_body. Then I tested by hooking the fix into the template and reinstalling harbor:

common/templates/nginx/nginx.https.conf

NOTE: if I have a client pull from my repo with this fix, it retries on pulling... looking at this tomorrow, unless someone beats me to it

NOTE: adding "proxy_temp_path /tmp/nginx;" after client_body_temp_path seems to resolve this last issue

Testing more in the morning

ywk253100 commented 7 years ago

@choman It seems to be a permission issue. Could you give me the output of the following commands?

ps -ef | grep nginx
ls -l harbor/common/config/nginx/

choman commented 7 years ago

@ywk253100 I agree, it does look like a permission issue with the /etc/nginx directories and their ownership.

According to the two references I found above, nginx needs different ownership for different processes. That's my short view of it anyhow.

1) on both 1.2.0rc1 and 1.2.0rc2, if I do not put the following into the template nginx.https.conf

- client_body_temp_path /tmp/nginx;
- proxy_temp_path /tmp/nginx;

Then I can only push hello-world, and I fail on alpine, centos and ubuntu images

2) If I put those entries into the template; then I have success pushing and pulling hello-world, alpine, centos and ubuntu images.

NOTE: The pulls were done from both a remote docker machine and the harbor server.

My two cents: those entries close this issue. My concern is their location in the https.conf, as I am not an nginx guru.

    ssl_session_cache shared:SSL:10m;

    client_body_temp_path /tmp/nginx;
    proxy_temp_path /tmp/nginx;

    # disable any limits to avoid HTTP 413 for large image uploads
    client_max_body_size 0;

choman commented 7 years ago

Although, I am installing from a vbox share. Let me try to install outside of that.


choman commented 7 years ago

So the modes of /etc/nginx/{client_body_temp,proxy_temp} are 770 root,689 when installed from a vbox share.

When installed from the local filesystem they are 700 nobody,root.

It would be nice to be able to install from the vbox share. I'm not sure what bearing that should really have, but it does. So my fix above is really only needed when installing from the vbox share. A better fix would be to set the modes/ownership when importing/configuring the image. Either way, I'd like to see a proper fix, because it really shouldn't matter where I install harbor from.
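To compare the two install locations quickly, something like the sketch below reports the owner and mode of the two temp directories. check_temp_dirs is a hypothetical name, the expected owner (nobody, per the local-filesystem install above) is passed in rather than assumed, and the -c option requires GNU coreutils stat.

```shell
#!/bin/sh
# Hypothetical helper: report owner and mode of the nginx temp directories
# under BASE and fail if any is not owned by WANT_OWNER.
# Usage: check_temp_dirs harbor/common/config/nginx nobody
check_temp_dirs() {
    base="$1"
    want_owner="$2"
    status=0
    for name in client_body_temp proxy_temp; do
        dir="$base/$name"
        if [ ! -d "$dir" ]; then
            echo "$dir: missing"
            continue
        fi
        owner=$(stat -c '%U' "$dir")   # owner user name (GNU stat)
        mode=$(stat -c '%a' "$dir")    # octal permission bits, e.g. 700
        if [ "$owner" = "$want_owner" ]; then
            echo "$dir: ok ($owner, $mode)"
        else
            echo "$dir: owner is $owner, expected $want_owner (mode $mode)"
            status=1
        fi
    done
    return "$status"
}
```

A missing directory is only reported, not treated as a failure, since nginx may not have created it yet.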

Should I open an enhancement request? Either way this can certainly be closed.

Thanks all

luguang commented 7 years ago

Could this be resolved or deferred before RC? We have very limited time before that.

carlsone commented 7 years ago

I have the same error: Aug 23 22:27:43 172.18.0.1 proxy[513]: 2017/08/23 22:27:43 [crit] 6#0: *110 open() "/etc//nginx/client_body_temp/0000000012" failed (13: Permission denied) ...

Is there a workaround? Can I chmod something to avoid the error?

# ls -l harbor/common/config/nginx/
total 32
drwxr-x--- 2 root   root 4096 Aug 23 22:19 cert
drwx------ 2 nobody root 4096 Aug 23 22:16 client_body_temp
drwxr-x--- 2 root   root 4096 Aug 23 22:15 conf.d
drwx------ 2 nobody root 4096 Aug 23 22:16 fastcgi_temp
-rw-r----- 1 root   root 3162 Aug 23 22:19 nginx.conf
drwx------ 2 nobody root 4096 Aug 23 22:16 proxy_temp
drwx------ 2 nobody root 4096 Aug 23 22:16 scgi_temp
drwx------ 2 nobody root 4096 Aug 23 22:16 uwsgi_temp

# ps -ef | grep nginx
root      4496  4474  0 22:19 ?        00:00:00 nginx: master process nginx -g daemon off;
nobody    4587  4496  0 22:19 ?        00:00:00 nginx: worker process
nobody    4588  4496  0 22:19 ?        00:00:00 nginx: worker process
nobody    4589  4496  0 22:19 ?        00:00:00 nginx: worker process
nobody    4590  4496  0 22:19 ?        00:00:00 nginx: worker process
nobody    4591  4496  0 22:19 ?        00:00:00 nginx: worker process
nobody    4592  4496  0 22:19 ?        00:00:00 nginx: worker process
nobody    4593  4496  0 22:19 ?        00:00:00 nginx: worker process
nobody    4594  4496  0 22:19 ?        00:00:00 nginx: worker process
nobody    4595  4496  0 22:19 ?        00:00:00 nginx: worker process
nobody    4596  4496  0 22:19 ?        00:00:00 nginx: worker process
nobody    4597  4496  0 22:19 ?        00:00:00 nginx: worker process
nobody    4598  4496  0 22:19 ?        00:00:00 nginx: worker process
nobody    4599  4496  0 22:19 ?        00:00:00 nginx: worker process
nobody    4600  4496  0 22:19 ?        00:00:00 nginx: worker process
nobody    4601  4496  0 22:19 ?        00:00:00 nginx: worker process
nobody    4602  4496  0 22:19 ?        00:00:00 nginx: worker process

yixingjia commented 7 years ago

@carlsone can you check the folder owner inside the proxy_temp directory?

A workaround is simple:

1> First delete the proxy_temp and client_body_temp folders.
2> Log in to the nginx container with "docker exec -it nginx bash".
3> Issue "nginx -s reload" to restart nginx.

That will solve the issue.

We are still investigating the root cause and will provide a fix for this later.
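The workaround steps can be sketched as a small script. It is a dry run by default and only prints the commands; set RUN=1 to actually execute them. The names run and reset_nginx_temp are made up for illustration, the default directory is assumed from the ls output earlier in this thread, and the reload is issued with a single non-interactive docker exec instead of an interactive bash session.

```shell
#!/bin/sh
# Print the command unless RUN=1 is set, in which case execute it.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "+ $*"
    fi
}

# Hypothetical helper wrapping the workaround.
# Assumptions: the temp folders live under harbor/common/config/nginx on the
# host, and the container is named "nginx" as in the command quoted above.
reset_nginx_temp() {
    conf_dir="${1:-harbor/common/config/nginx}"
    run rm -rf "$conf_dir/proxy_temp" "$conf_dir/client_body_temp"
    run docker exec nginx nginx -s reload
}

# Usage: reset_nginx_temp            (dry run, prints the two commands)
#        RUN=1 reset_nginx_temp      (really deletes the folders and reloads)
```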

yixingjia commented 7 years ago

@luguang we have several proposals. Worst case, we just disable the buffer, and then all the permission issues will be gone.

yixingjia commented 7 years ago

@carlsone @choman

Can you please do the following test and let us know the result? We are trying to reproduce this on rhel, centos and ubuntu but have failed.

1> rm the harbor directory.
2> untar the harbor-offline-installer-v1.2.0-rc1.tgz offline binary.
3> apply the change in https://github.com/vmware/harbor/pull/3112
4> then install harbor and retry.

Thanks.

carlsone commented 7 years ago

I was able to workaround the 'push' error by applying the workaround @choman linked to: https://stackoverflow.com/questions/21494979/file-upload-not-working-with-rails-4-in-development-using-pow-and-nginx

After running install.sh from harbor-offline-installer-v1.2.0-rc1.tgz, edit this file: nginx.conf

Add this line (inside the http block): client_body_temp_path /tmp/nginx/;

Then restart the nginx container:

docker ps -a
docker stop <id for nginx>
docker start <id for nginx>

Now push works.

yixingjia commented 7 years ago

Should be fixed by the PR https://github.com/vmware/harbor/pull/3112

goharbor / harbor

harbor 1.2.0rc1 cannot push images - HTTP 500 internal error #3086