openshift / origin

Conformance test suite for OpenShift
http://www.openshift.org
Apache License 2.0

git clone errors on public repositories #9241

Closed (frankvolkel closed this issue 6 years ago)

frankvolkel commented 8 years ago

Giving OpenShift Origin v3 a try with a simple BYO playbook setup that sticks closely to the Advanced Installation instructions. However, my builds are having difficulty cloning from any Git repository, even public GitHub ones.

HTTPS:

F0609 05:01:49.810021 1 builder.go:204] Error: build error: fatal: unable to access 'https://github.com/blongden/phpinfo.git/': Unable to communicate securely with peer: requested domain name does not match the server's certificate.

SSH:

I0609 05:10:41.228300 1 source.go:197] Downloading "git@github.com:blongden/phpinfo.git" ... F0609 05:10:41.867643 1 builder.go:204] Error: build error: Host key verification failed. fatal: Could not read from remote repository. Please make sure you have the correct access rights and the repository exists.

I have no problems cloning the repo manually on my master.

Any wisdom will be greatly appreciated.

frankvolkel commented 8 years ago

Got this with a private Git repo on Bitbucket after following the instructions at https://blog.openshift.com/using-ssh-key-for-s2i-builds/:

I0614 07:11:06.130144       1 sti.go:140] Preparing to build test/cakephp-example-13:637883ba
I0614 07:11:06.134062       1 source.go:197] Downloading "git@bitbucket.org:vsee/evisit.git" ...
I0614 07:11:06.874320       1 cleanup.go:23] Removing temporary directory /tmp/s2i-build065205725
I0614 07:11:06.874365       1 fs.go:156] Removing directory '/tmp/s2i-build065205725'
F0614 07:11:06.874822       1 builder.go:204] Error: build error: Warning: Permanently added 'bitbucket.org,198.90.20.95' (RSA) to the list of known hosts.
Permission denied, please try again.
Permission denied, please try again.
Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password).
fatal: Could not read from remote repository.

Please make sure you have the correct access rights
and the repository exists.

Tested the key and it works. How do I debug source.go and builder.go?
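
One way to get more detail out of source.go and builder.go is to raise the build log level so the builder pod logs each git and s2i step. A minimal sketch, assuming a BuildConfig named cakephp-example (the name is illustrative; older oc clients use `oc env` instead of `oc set env`):

# Raise builder verbosity on the build strategy:
oc set env bc/cakephp-example BUILD_LOGLEVEL=5
# Kick off a new build and stream the verbose output:
oc start-build cakephp-example --follow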

inlandsee commented 8 years ago

I'm seeing this too, also with a private repo.

fedora 24
origin 1.1.6
docker 1.10.3

Notes:

I0615 18:43:15.617846       1 builder.go:46] $BUILD env var is {"kind":"Build","apiVersion":"v1","metadata":{"name":"mywork-home-1","namespace":"mywork-home","selfLink":"/oapi/v1/namespaces/mywork-home/builds/mywork-home-1","uid":"041f9fbf-3329-11e6-989c-60a44cae5be2","resourceVersion":"682","creationTimestamp":"2016-06-15T18:43:13Z","labels":{"app":"mywork-home","buildconfig":"mywork-home","openshift.io/build-config.name":"mywork-home","role":"mywork-home"},"annotations":{"openshift.io/build.number":"1"}},"spec":{"serviceAccount":"builder","source":{"type":"Git","git":{"uri":"git@bitbucket.org:myrepo/instant.home.git"},"sourceSecret":{"name":"instant-home-rsa"},"secrets":null},"strategy":{"type":"Source","sourceStrategy":{"from":{"kind":"DockerImage","name":"mywork/s2i-nginx:latest"},"env":[{"name":"BUILD_LOGLEVEL","value":"5"}]}},"output":{"to":{"kind":"DockerImage","name":"172.30.33.213:5000/mywork-home/app:1.0.0"},"pushSecret":{"name":"builder-dockercfg-e8f3x"}},"resources":{},"postCommit":{}},"status":{"phase":"New","outputDockerImageReference":"172.30.33.213:5000/mywork-home/app:1.0.0","config":{"kind":"BuildConfig","namespace":"mywork-home","name":"mywork-home"}}}

I0615 18:43:15.619490       1 builder.go:57] Master version "v1.1.6", Builder version "v1.1.6"
I0615 18:43:15.626877       1 scmauths.go:27] Finding auth for "ssh-privatekey"
I0615 18:43:15.626899       1 scmauths.go:30] Found SCMAuth "ssh-privatekey" to handle "ssh-privatekey"
I0615 18:43:15.626909       1 scmauths.go:45] Setting up SCMAuth "ssh-privatekey"
I0615 18:43:15.627396       1 builder.go:145] Running build with cgroup limits: api.CGroupLimits{MemoryLimitBytes:92233720368547, CPUShares:2, CPUPeriod:100000, CPUQuota:-1, MemorySwap:92233720368547}
I0615 18:43:15.629154       1 sti.go:199] With force pull false, setting policies to if-not-present
I0615 18:43:15.629166       1 sti.go:205] The value of ALLOWED_UIDS is [1-]
I0615 18:43:15.629174       1 sti.go:213] The value of DROP_CAPS is [KILL,MKNOD,SETGID,SETUID,SYS_CHROOT]
I0615 18:43:15.629182       1 cfg.go:45] Locating docker auth for image mywork/s2i-nginx:latest and type PULL_DOCKERCFG_PATH
I0615 18:43:15.629224       1 cfg.go:111] Using Docker authentication configuration in '/root/.docker/config.json'
I0615 18:43:15.629333       1 cfg.go:57] Problem accessing /root/.docker/config.json: stat /root/.docker/config.json: no such file or directory
I0615 18:43:15.629343       1 cfg.go:45] Locating docker auth for image 172.30.33.213:5000/mywork-home/app:1.0.0 and type PUSH_DOCKERCFG_PATH
I0615 18:43:15.629373       1 cfg.go:111] Using Docker authentication configuration in '/var/run/secrets/openshift.io/push/.dockercfg'
I0615 18:43:15.629526       1 cfg.go:83] Using serviceaccount user for Docker authentication for image 172.30.33.213:5000/mywork-home/app:1.0.0
I0615 18:43:15.631226       1 docker.go:355] Using locally available image "mywork/s2i-nginx:latest"
I0615 18:43:15.632299       1 sti.go:232] Creating a new S2I builder with build config: "Builder Name:\t\t\tnginx builder 1.8\nBuilder Image:\t\t\tmywork/s2i-nginx:latest\nSource:\t\t\t\tfile:///tmp/s2i-build823052582/upload/src\nOutput Image Tag:\t\tmywork-home/mywork-home-1:d8c66b7f\nEnvironment:\t\t\tOPENSHIFT_BUILD_NAME=mywork-home-1,OPENSHIFT_BUILD_NAMESPACE=mywork-home,OPENSHIFT_BUILD_SOURCE=git@bitbucket.org:myrepo/instant.home.git,BUILD_LOGLEVEL=5\nIncremental Build:\t\tdisabled\nRemove Old Build:\t\tdisabled\nBuilder Pull Policy:\t\tif-not-present\nPrevious Image Pull Policy:\talways\nQuiet:\t\t\t\tdisabled\nLayered Build:\t\t\tdisabled\nWorkdir:\t\t\t/tmp/s2i-build823052582\nDocker NetworkMode:\t\tcontainer:a7b935cc3aafe26c8d81aa2499ba6c6f45b1abdbfd9d510374c6ec5413f3ae4d\nDocker Endpoint:\t\tunix:///var/run/docker.sock\n"
I0615 18:43:15.633405       1 docker.go:355] Using locally available image "mywork/s2i-nginx:latest"
I0615 18:43:15.636117       1 docker.go:355] Using locally available image "mywork/s2i-nginx:latest"
I0615 18:43:15.636129       1 docker.go:475] Image contains io.openshift.s2i.scripts-url set to 'image:///usr/libexec/s2i'
I0615 18:43:15.636155       1 sti.go:238] Starting S2I build from mywork-home/mywork-home-1 BuildConfig ...
I0615 18:43:15.636163       1 sti.go:140] Preparing to build mywork-home/mywork-home-1:d8c66b7f
I0615 18:43:15.638259       1 source.go:197] Downloading "git@bitbucket.org:myrepo/instant.home.git" ...
I0615 18:43:15.638322       1 source.go:109] git ls-remote git@bitbucket.org:myrepo/instant.home.git --heads
I0615 18:43:15.638356       1 repository.go:298] Executing git ls-remote git@bitbucket.org:myrepo/instant.home.git --heads
I0615 18:43:15.682997       1 repository.go:318] Exec error: exit status 128
I0615 18:43:15.683055       1 repository.go:328] Err: ssh: Could not resolve hostname bitbucket.org: Name or service not known
fatal: Could not read from remote repository.

Please make sure you have the correct access rights
and the repository exists.
I0615 18:43:15.683081       1 source.go:132] ssh: Could not resolve hostname bitbucket.org: Name or service not known
fatal: Could not read from remote repository.

Please make sure you have the correct access rights
and the repository exists.
I0615 18:43:15.683097       1 cleanup.go:23] Removing temporary directory /tmp/s2i-build823052582
I0615 18:43:15.683104       1 fs.go:156] Removing directory '/tmp/s2i-build823052582'
F0615 18:43:15.683917       1 builder.go:204] Error: build error: ssh: Could not resolve hostname bitbucket.org: Name or service not known
fatal: Could not read from remote repository.

Please make sure you have the correct access rights
and the repository exists.

frankvolkel commented 8 years ago

Removing all search domain entries from /etc/sysconfig/network-scripts and /etc/resolv.conf fixed the issue.

I discovered this by testing on a local setup that uses DHCP addressing instead of the manually assigned IP on the host from my original question. That setup worked fine, which narrowed the problem down to networking differences between the two environments.
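
For anyone hitting the same thing, a minimal sketch of that change, assuming the entry comes from resolv.conf and an ifcfg file for eth0 (paths and the restart command may differ on your distribution):

# Drop the search line from resolv.conf but keep a valid nameserver:
sed -i '/^search /d' /etc/resolv.conf
# If NetworkManager regenerates it from an interface file, drop DOMAIN= there too:
sed -i '/^DOMAIN=/d' /etc/sysconfig/network-scripts/ifcfg-eth0
systemctl restart network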

inlandsee commented 8 years ago

@frankvolkel Eliminating search entries from resolv.conf strikes me as problematic, but I was curious whether it would change the behavior of my s2i build. It didn't.

On Fedora, my /etc/resolv.conf is a symlink to NetworkManager's /var/run/NetworkManager/resolv.conf, and NetworkManager's [main] section includes the entry 'dns=dnsmasq', which permits dnsmasq to resolve a local wildcard .dev domain. In support of that, NetworkManager dutifully auto-generates the following resolv.conf:

# Generated by NetworkManager
search local.gateway iswork.dev
nameserver 127.0.0.1

For testing purposes I followed your recommendation by setting NetworkManager's [main] to 'dns=none' and removing all content from resolv.conf. After restarting the NetworkManager and dnsmasq services, resolv.conf was indeed empty. Regardless, the s2i build behaved exactly as described in my previous post, with exit status 128 after the container attempted a 'git ls-remote'.
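
For completeness, that test roughly amounts to the following, using a NetworkManager drop-in rather than editing the main config (the drop-in path is an assumption; setting dns=none under [main] in NetworkManager.conf directly works the same way):

# Stop NetworkManager from managing resolv.conf, restart the services, empty the file:
printf '[main]\ndns=none\n' > /etc/NetworkManager/conf.d/90-dns-none.conf
systemctl restart NetworkManager dnsmasq
: > /etc/resolv.conf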

I'm beginning to wonder if my git clone failure might be unrelated to the other reports on this ticket and instead has something to do with my docker 1.10.3 installation. It is circumstantial evidence, but this s2i build succeeds on a Fedora 23 system running docker 1.9.1.

frankvolkel commented 8 years ago

@inlandsee resolv.conf should not have all settings removed; it needs a valid nameserver entry, otherwise the pod wouldn't be able to look up Bitbucket's domain for the checkout.

Try nameserver 8.8.8.8?


inlandsee commented 8 years ago

@frankvolkel - Thanks, but I don't think my issue has anything to do with DNS. Because the s2i build adds public DNS entries to the container's /etc/resolv.conf, the domain resolves properly.

iswork/s2i-nginx    latest  60b0ef28fdde        2 days ago          414.5 MB
docker run --rm -it 60b0ef28fdde bash -il 
bash-4.2$ cat /etc/resolv.conf 
nameserver 8.8.8.8
nameserver 8.8.4.4

I suspect my problem is the result of a permission error: the s2i build fails to pass the associated secret when interacting with the private repository. Since there are no SSH keys stored in the container, I expect the following manual invocation of 'git ls-remote' inside the container to fail, but that should not be the case when a valid 'sourceSecret' is passed to s2i prior to the build:

bash-4.2$ git ls-remote git@bitbucket.org:myrepo/instant.home.git --heads
Permission denied (publickey).
fatal: Could not read from remote repository.

I'm attempting now to get a better understanding of exactly what is going on inside the build.
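
A couple of quick checks along those lines (the secret name is the one from the build spec above; the key path is illustrative):

# Confirm the build config actually references the source secret:
oc get bc mywork-home -o jsonpath='{.spec.source.sourceSecret.name}'
# The secret must contain an 'ssh-privatekey' entry:
oc get secret instant-home-rsa -o yaml
# Test the same key against Bitbucket from outside the build:
ssh -i ~/.ssh/instant_home_rsa -T git@bitbucket.org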

inlandsee commented 8 years ago

Also, for what it's worth, the same s2i build/template that fails under Fedora 24 with docker 1.10.3 builds successfully on another system running Fedora 23 with docker 1.9.1. I could roll back my development system, but I hate to do that as everything else appears to be working as expected.

inlandsee commented 8 years ago

For anyone who might stumble across my comments in this thread, I skirted my s2i build failures by moving to origin 1.3. The same templates that failed with origin 1.1.6 build and run properly under 1.3 alpha.

fedora 24, origin 1.3.0-alpha, docker 1.10.3

Apologies to @frankvolkel for tagging along on an issue I now believe was unrelated to his original post.

knobunc commented 8 years ago

@frankvolkel it sounds like it is resolving to the wrong IP address if changing the search path fixed it. Is that correct? If so, can you find out what IP it resolves to when it fails, and what it resolves to when it works, and get me both resolv.confs?

madalinignisca commented 7 years ago

I have a fresh Origin 1.4 installation (CentOS 7, installed with the Ansible 1.4 branch) and get the same issue.

Cloning "git@github.com:madalinignisca/my-ghost-blog.git" ...
error: build error: Host key verification failed.
fatal: Could not read from remote repository.

Please make sure you have the correct access rights
and the repository exists.

It's identical when trying an HTTPS repo URL.

The builds fail the same way for the demos (ruby-ex, node-ex).

frankvolkel commented 7 years ago

Hi Madalin,

Do you have a search entry in /etc/resolv.conf? Pods can't seem to resolve hostnames properly if there is a search domain.

Cheers

Frank


madalinignisca commented 7 years ago

Hi Frank,

On the host?

I manually changed it to:

# Generated by NetworkManager
# search localdomain
nameserver 8.8.8.8
nameserver 8.8.4.4

and the problem still persists.

frankvolkel commented 7 years ago

Did you restart the host?

In my case I had to restart it for the change to take effect. Also note that NetworkManager is free to rewrite the file, so the entry may reappear after restarts; one way around that is to change the file attributes so it can't be overwritten, which is what I did in my case.
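
If it helps, the attribute change is just the following (undo it later with chattr -i):

# Make resolv.conf immutable so NetworkManager can no longer rewrite it:
chattr +i /etc/resolv.conf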


madalinignisca commented 7 years ago

Sorry for the late response. It works now for HTTPS. I still need to figure out how to get key-based cloning over SSH, but that's a topic for me to work through.

jam01 commented 7 years ago

Did you get SSH cloning resolved? I'm getting:

Cloning "git@bitbucket.org:....../......-camel-api.git" ...
error: build error: Warning: Permanently added 'bitbucket.org,104.192.143.3' (RSA) to the list of known hosts.
Permission denied (publickey).
fatal: Could not read from remote repository.

Please make sure you have the correct access rights
and the repository exists.

madalinignisca commented 7 years ago

I did. The documentation was a bit confusing; I might not be at the level of seniors working in data centers, but eventually, after retrying a few times, I got the idea of how it works. You need to set your source secret on the build.
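
For anyone else landing here, a minimal sketch of that setup; the secret and BuildConfig names are illustrative:

# Create a secret from an SSH key that can read the repository
# (the build looks for the 'ssh-privatekey' entry):
oc create secret generic repo-ssh-key --from-file=ssh-privatekey=$HOME/.ssh/id_rsa
# Point the build at it as its source secret, then rebuild:
oc patch bc/my-ghost-blog -p '{"spec":{"source":{"sourceSecret":{"name":"repo-ssh-key"}}}}'
oc start-build my-ghost-blog --follow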

shadycuz commented 6 years ago

I have the same issue; what is the workaround? I don't think I can remove the search entries from my resolv.conf.

shadycuz commented 6 years ago

Just some more details: I'm launching my OpenShift cluster using the AWS Quick Start. When I don't specify DHCP options, my /etc/resolv.conf looks like this:

[ec2-user@ip-10-64-10-71 ~]$ cat /etc/resolv.conf
# Generated by NetworkManager
search ec2.internal
nameserver 10.64.0.2

And the cluster is fine.

When I do turn on DHCP options to point to my personal DNS server, my resolv.conf looks like this:

[root@ip-10-64-10-74 ~]# cat /etc/resolv.conf
# Generated by NetworkManager
search Dev.myexample.org dev.myexample.org
nameserver 10.68.10.10
nameserver 10.68.11.10

The actual hosts have working DNS, since I can SSH in and clone a Git repo, but it appears the containers do not.

[screenshot attached]

The only difference between these clusters is that I have enabled DHCP options on one of them.
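
One way to confirm it's the containers rather than the hosts that lack DNS, assuming getent is available in the image and substituting a pod that is actually running:

# On the host:
getent hosts github.com
# Inside a running pod:
oc get pods
oc rsh <pod-name> cat /etc/resolv.conf
oc rsh <pod-name> getent hosts github.com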

shadycuz commented 6 years ago

@danmcp @knobunc ^ Bump =)

openshift-bot commented 6 years ago

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close. Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

openshift-bot commented 6 years ago

Stale issues rot after 30d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle rotten. Rotten issues close after an additional 30d of inactivity. Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle rotten /remove-lifecycle stale

openshift-bot commented 6 years ago

Rotten issues close after 30d of inactivity.

Reopen the issue by commenting /reopen. Mark the issue as fresh by commenting /remove-lifecycle rotten. Exclude this issue from closing again by commenting /lifecycle frozen.

/close

manu5212002 commented 4 years ago

We are getting the same issue. What was the resolution?