fubarhouse / ansible-role-golang

Installs the Go programming language and packages on Mac & Linux (Ubuntu, CentOS)
MIT License

Failure on TASK [fubarhouse.golang : Go-Lang | Moving to installation directory] #66

Open markdorison opened 7 years ago

markdorison commented 7 years ago

This role has worked for me in the past but I am now encountering the following error on the "Moving to installation directory" task. I redacted the information about the box it is running on.

fatal: [FQDN_REDACTED -> IP_REDACTED]: FAILED! => {"changed": false, "cmd": "/usr/bin/rsync --delay-updates -F --compress --delete-after --archive --rsh 'ssh -S none -o StrictHostKeyChecking=no' --rsync-path=\"sudo rsync\" --out-format='<<CHANGED>>%i %n%L' \"/tmp/go/\" \"IP_REDACTED:/root/go\"", "failed": true, "msg": "Warning: Permanently added 'IP_REDACTED' (ECDSA) to the list of known hosts.\r\nPermission denied (publickey).\r\nrsync: connection unexpectedly closed (0 bytes received so far) [sender]\nrsync error: unexplained error (code 255) at io.c(226) [sender=3.1.0]\n", "rc": 255}

fubarhouse commented 7 years ago

@markdorison first of all thanks for submitting a ticket!

Second, as you're getting the following: Permission denied (publickey), could you please check the ssh-agent to ensure keys have been added?

You can use the following

echo $SSH_AGENT_PID
eval $(ssh-agent)
ssh-add
ssh-add -l

This will rule out any obvious errors. I intend to go back through this role for a new release, much like what I've just completed for my curl role.

ctorgalson commented 7 years ago

@fubarhouse: Thanks!

(I work with @markdorison and it was me who ran into this issue). The machine the Ansible playbook was running from does have a passphrase-protected ssh-key to the remote machine, but we know it was ssh-added because the playbook was able to connect to the remote machine at all (and successfully complete all tasks up to the golang role).

But reading through the code, it looks like the fact that ansible_ssh_user is undefined is what forces the role to use synchronize instead of shell. This playbook is usually run without -u, and I don't think we'd found the fubarhouse_user variable.

I'll run the playbook again at a quiet time with a value for user or fubarhouse_user to see if that's the issue & report back.
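
For reference, a minimal sketch of what setting the variable at the playbook level could look like. The `hosts` pattern and the value `deploy` are placeholders, not taken from our setup. Since both of the role's set_fact tasks are guarded by `fubarhouse_user is not defined`, a value supplied this way should take precedence over the automatic detection, though that is untested here.

```yaml
# Sketch only: "all" and "deploy" are hypothetical placeholders.
- hosts: all
  vars:
    # Setting this up front means the role's own
    # "Define user variable ..." tasks should be skipped.
    fubarhouse_user: deploy
  roles:
    - fubarhouse.golang
```

The same value could presumably be passed on the command line with `-e fubarhouse_user=deploy` instead of editing the playbook.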

fubarhouse commented 7 years ago

@ctorgalson that actually describes something I can diagnose much better, so I'll look into this for you and respond.

Under no circumstance should fubarhouse_user not be assigned a value, but at least now I know there is a case where it may not be.

Are you able to provide any given reason the following two tasks would be skipped?

- name: "Go-Lang | Define user variable for ssh use"
  set_fact:
    fubarhouse_user: "{{ ansible_ssh_user }}"
  when: ansible_ssh_user is defined and fubarhouse_user is not defined

- name: "Go-Lang | Define user variable for non-ssh use"
  set_fact:
    fubarhouse_user: "{{ ansible_user_id }}"
  when: ansible_ssh_user is not defined and fubarhouse_user is not defined

ctorgalson commented 7 years ago

@fubarhouse Thanks. I think I can explain what's failing.

Background

Analysis

  1. Since ansible_ssh_user is undefined, the role sets fubarhouse_user to the value of ansible_user_id.

  2. Since ansible_ssh_user is undefined, the role will attempt to use the synchronize task.

  3. The synchronize task uses the value of fubarhouse_user for become_user. This is the direct cause of the error we see (but possibly not the actual problem):

    "cmd": "/usr/bin/rsync --delay-updates -F --compress --delete-after --archive --rsh 'ssh  -S none -o StrictHostKeyChecking=no' --rsync-path=\"sudo rsync\" --out-format='<<CHANGED>>%i %n%L' \"/tmp/go/\" \"xxx.xxx.xxx.xxx:/root/go\"",

    Even though ctorgalson is in the sudoers list, that user should become root in order to move files into root's home directory (i.e. it should run rsync with sudo rsync ... and not sudo -u ctorgalson rsync ...); this suggests that the actual problem is something else...

  4. The rsync command shown in the error above attempts to copy files to xxx.xxx.xxx.xxx:/root/go. This shows that the {{ GOROOT }} fact is set to /root/go even though the code appears to try to set it to the fubarhouse_user's home directory.

I think (4) is the core issue, though I'm not sure what an appropriate solution might be--installing in individual users' home directories is not a viable solution for us :)
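
One way to confirm points (1) and (4) would be a throwaway debug task run after the role has set its facts. This is only a sketch: the task name is made up, and it assumes GOROOT and fubarhouse_user are visible as ordinary variables at the point where the task runs.

```yaml
# Sketch only: a temporary task to print the values the role acts on.
# default() keeps the task from failing when a variable is undefined.
- name: "Debug | Show user and GOROOT values"
  debug:
    msg:
      - "ansible_ssh_user: {{ ansible_ssh_user | default('(undefined)') }}"
      - "ansible_user_id: {{ ansible_user_id | default('(undefined)') }}"
      - "fubarhouse_user: {{ fubarhouse_user | default('(undefined)') }}"
      - "GOROOT: {{ GOROOT | default('(undefined)') }}"
```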


PS: According to Ansible's documentation, the synchronize module "...is run and originates on the local host where Ansible is being run". That suggests the generated rsync command above might always fail: the get_url task downloads to the remote host, but the source of the Ansible-generated rsync command is the local /tmp/go and not e.g. xxx.xxx.xxx.xxx:/tmp/go.
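
If that reading is right, one possible workaround is to delegate the synchronize task to the remote host itself, which the same documentation describes as a way to sync two paths on one remote machine. The sketch below is not the role's actual code. It reuses /tmp/go/ and the GOROOT variable from the output above, and the task name is made up.

```yaml
# Sketch only: run rsync on the remote host so that both src and dest
# are paths on that machine, rather than src being on the control host.
- name: "Sketch | Move Go into the installation directory on the remote"
  synchronize:
    src: /tmp/go/
    dest: "{{ GOROOT }}"
    archive: yes
    delete: yes
  delegate_to: "{{ inventory_hostname }}"
```

With delegate_to pointing at the inventory host, no second SSH hop from the control machine is needed for the copy, which would also sidestep the publickey failure in the original error.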

fubarhouse commented 7 years ago

@ctorgalson I've been doing a bit of work on similar things, and I've rolled some more changes into the dev branch and kicked off some tests. Here's a summary.

You can test this out on the dev-2.5.x branch, but I'll get a release out in the next day for you. I would be appreciative if you could tell me if the above changes solve your problem!

Link to tests:

fubarhouse commented 7 years ago

2.5.0 is officially released, available via the galaxy.

As previously stated, I'd like to know if the changes have resolved your problems.

markdorison commented 7 years ago

@fubarhouse I updated the role to 2.5.0. When attempting a run it fails, but in a different place:

TASK [fubarhouse.golang : Go-Lang | Run get commands] *******************************************************************************************************************************************************
failed: [jenkins.chromatic.is] (item={u'url': u'github.com/StackExchange/dnscontrol', u'name': u'dnscontrol'}) => {"changed": false, "cmd": "/root/go/bin/go get -u github.com/StackExchange/dnscontrol", "delta": "0:00:02.402064", "end": "2017-09-19 19:18:47.305134", "failed": true, "item": {"name": "dnscontrol", "url": "github.com/StackExchange/dnscontrol"}, "rc": 2, "start": "2017-09-19 19:18:44.903070", "stderr": "# runtime\n/root/go/src/runtime/mstkbar.go:151:10: debug.gcstackbarrieroff undefined (type struct { allocfreetrace int32; cgocheck int32; efence int32; gccheckmark int32; gcpacertrace int32; gcshrinkstackoff int32; gcrescanstacks int32; gcstoptheworld int32; gctrace int32; invalidptr int32; sbrk int32; scavenge int32; scheddetail int32; schedtrace int32 } has no field or method gcstackbarrieroff)\n/root/go/src/runtime/mstkbar.go:162:24: division by zero\n/root/go/src/runtime/mstkbar.go:162:43: invalid expression unsafe.Sizeof(composite literal)\n/root/go/src/runtime/mstkbar.go:162:44: undefined: stkbar\n/root/go/src/runtime/mstkbar.go:212:4: gp.stkbar undefined (type *g has no field or method stkbar)\n/root/go/src/runtime/mstkbar.go:213:15: gp.stkbar undefined (type *g has no field or method stkbar)\n/root/go/src/runtime/mstkbar.go:216:23: undefined: stackBarrierPC\n/root/go/src/runtime/mstkbar.go:226:28: gp.stkbarPos undefined (type *g has no field or method stkbarPos)\n/root/go/src/runtime/mstkbar.go:227:19: gp.stkbarPos undefined (type *g has no field or method stkbarPos)\n/root/go/src/runtime/mstkbar.go:248:41: undefined: stkbar\n/root/go/src/runtime/mstkbar.go:227:19: too many errors", "stderr_lines": ["# runtime", "/root/go/src/runtime/mstkbar.go:151:10: debug.gcstackbarrieroff undefined (type struct { allocfreetrace int32; cgocheck int32; efence int32; gccheckmark int32; gcpacertrace int32; gcshrinkstackoff int32; gcrescanstacks int32; gcstoptheworld int32; gctrace int32; invalidptr int32; sbrk int32; scavenge int32; scheddetail int32; schedtrace int32 } has no field or method gcstackbarrieroff)", "/root/go/src/runtime/mstkbar.go:162:24: division by zero", "/root/go/src/runtime/mstkbar.go:162:43: invalid expression unsafe.Sizeof(composite literal)", "/root/go/src/runtime/mstkbar.go:162:44: undefined: stkbar", "/root/go/src/runtime/mstkbar.go:212:4: gp.stkbar undefined (type *g has no field or method stkbar)", "/root/go/src/runtime/mstkbar.go:213:15: gp.stkbar undefined (type *g has no field or method stkbar)", "/root/go/src/runtime/mstkbar.go:216:23: undefined: stackBarrierPC", "/root/go/src/runtime/mstkbar.go:226:28: gp.stkbarPos undefined (type *g has no field or method stkbarPos)", "/root/go/src/runtime/mstkbar.go:227:19: gp.stkbarPos undefined (type *g has no field or method stkbarPos)", "/root/go/src/runtime/mstkbar.go:248:41: undefined: stkbar", "/root/go/src/runtime/mstkbar.go:227:19: too many errors"], "stdout": "", "stdout_lines": []}

markdorison commented 7 years ago

The cause of this failure seems to be further upstream in the playbook as a bunch of tasks are being skipped and go is not being installed successfully. Investigating further.

fubarhouse commented 7 years ago

@markdorison,

I have just identified the problem, so I'll get a fix under way asap.

Edit: see 0231ee845e02f153b3745b6f3716f9b50306606a

I'm just waiting for some tests (now running) to complete and I'll release it.

Edit:

2.6.1 is released, which includes the above commit.

Changelog will be added tonight, but it's available via the galaxy.

Edit (again):

If the distribution tasks are skipping, the removal of the old Go install will also fail.

It's my recommendation to delete your GOROOT in the event this fails again.
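
If you'd rather do that from a playbook than by hand, something along these lines should work as a one-off pre-task. It's a sketch, not part of the role: the task name is made up, and /root/go is taken from the paths in the error output above, so adjust it if your GOROOT differs.

```yaml
# Sketch only: remove a stale or partial Go installation before the role runs.
# /root/go comes from the error output in this thread; change it to your GOROOT.
- name: "Cleanup | Remove old GOROOT"
  file:
    path: /root/go
    state: absent
  become: true
```

The file module with state: absent removes the directory recursively and reports no change if it is already gone, so it should be safe to leave in for a couple of runs.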