Closed paulpet closed 2 years ago
Looking deeper, the runner seems to lose all network connectivity when tailscale initializes. It cannot even ping 1.1.1.1.
ping-pong:
  runs-on: ubuntu-20.04
  steps:
    - uses: actions/checkout@v2
    - name: Tailscale connection
      uses: tailscale/github-action@main
      with:
        authkey: ${{ secrets.TAILSCALE_AUTHKEY }}
    - name: Ping
      shell: bash
      run: |
        ping -c 5 1.1.1.1
This will fail to even ping 1.1.1.1.
Same failure occurs on Ubuntu 18.04 and Ubuntu 20.04 runner instances. I might try recreating the authkey - although the original had a no-expiry set.
I generated a new auth key but it made no difference. A connection was established to tailscale but then no further network activity occurred. I think I'm running out of ideas.
I don't immediately know of a change made 3 days ago.
Approximately 11 days ago we updated the Tailscale version of the runner to 1.24.2: https://github.com/tailscale/github-action/pull/38, is there a chance the problem started longer ago than 3 days?
Never mind, you were using v1 explicitly, so you didn't automatically get the 1.24.2 tailscale client when it was submitted.
I don't think it's related to the Tailscale version number: if I run tailscale action v1 (rather than main), it connects with v1.14 and still shows the same issues.
Hoping you or someone can confirm it's not just me having these issues, as I'm out of ideas of what else to look into.
Unfortunately so far as I know, it is just you. There have been no reports to support@tailscale.com about this Action recently, and no-one else has reported an issue here.
Are any of your runners still around, present in the admin panel and not deleted yet? I can look up their telemetry. Since they get an IP address, they must have managed to contact the coordination server at least briefly.
github-fv-az241-70 is still connected.
Let me modify an action to keep a runner connected for the next 20m or so.
I might try recreating the authkey - although the original had a no-expiry set.
Ah:
We're expecting to provide a way to renew API keys: https://tailscale.com/kb/1101/api/ and then one can use an API key to create authkeys as needed.
I noticed that earlier too, but our key wasn't due to expire until Jul 6th. I ended up creating a new one to be certain and it made no difference :(.
github-fv-az90-773 and github-fv-az316-221 should remain connected for the next 20 or so minutes.
Also, wouldn't the behavior be different if the auth key had expired? Even then, I would expect to still be able to ping 1.1.1.1.
Does ping work before tailscale? I read that GitHub runners run on Azure and that ping is disabled by design (and that seems to be the case when I try it).
Having an issue myself sshing into local machine through tailscale using github actions (not exactly sure when exactly it broke or if it's related to tailscale, but I regenerated some keys since it last worked)
edit: from docs:
GitHub hosts Linux and Windows runners on Standard_DS2_v2 virtual machines in Microsoft Azure with the GitHub Actions runner application installed. The GitHub-hosted runner application is a fork of the Azure Pipelines Agent. Inbound ICMP packets are blocked for all Azure virtual machines, so ping or traceroute commands might not work.
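Since ICMP may be blocked on these runners, a failed ping does not by itself show that the network is down. A minimal sketch of a TCP-based probe that could stand in for the ping step follows. The function name `tcp_check` is hypothetical, and the sketch assumes bash and the coreutils `timeout` command are available on the runner.

```shell
# Probe a TCP port instead of relying on ICMP echo.
# Usage: tcp_check HOST PORT [TIMEOUT_SECONDS]
# Returns 0 if a TCP connection opens within the timeout, non-zero otherwise.
tcp_check() {
  local host="$1" port="$2" secs="${3:-5}"
  # /dev/tcp/HOST/PORT is a bash redirection feature, so the probe runs in
  # an explicit bash child; host and port are passed in as $0 and $1.
  timeout "$secs" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}
```

In a workflow step this might be called as `tcp_check 1.1.1.1 443 || echo "no outbound TCP to 1.1.1.1:443"`, once before and once after the Tailscale step, to see whether connectivity actually changes.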
You're right, it doesn't. I discovered that yesterday after more troubleshooting. My issue appears to be purely DNS related. The Tailscale client DNS configuration points at one of our servers running dnsmasq. While DNS resolution works for all other clients connected to Tailscale, it suddenly stopped working for the GitHub Actions runners, with no change on our side (that I'm aware of). I wasn't able to replicate the issue on a completely different Tailscale account, so it's likely something I need to figure out how to resolve (or work around) rather than an issue with Tailscale or the tailscale action. :-(
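For a DNS-only failure like the one described above, it can help to check which resolvers the runner is actually using and whether a lookup succeeds. The helpers below are a hedged sketch with hypothetical function names. They assume a resolv.conf-style file and that Tailscale's DNS resolver appears as 100.100.100.100. On hosts using systemd-resolved the file may list only a local stub such as 127.0.0.53, in which case `resolvectl status` is the better place to look.

```shell
# Print the nameserver addresses listed in a resolv.conf-style file.
# Defaults to /etc/resolv.conf when no path is given.
list_nameservers() {
  local file="${1:-/etc/resolv.conf}"
  awk '$1 == "nameserver" { print $2 }' "$file"
}

# Succeed if the file lists Tailscale's resolver address (assumed here to be
# 100.100.100.100) as one of its nameservers.
uses_tailscale_dns() {
  list_nameservers "${1:-/etc/resolv.conf}" | grep -qx '100\.100\.100\.100'
}

# Resolve a name through the system resolver and print the addresses found.
# Prints nothing and returns non-zero if the lookup fails.
resolve_name() {
  getent hosts "$1" | awk '{ print $1 }'
}
```

Running `list_nameservers` and `resolve_name github.com` in a step before and after the Tailscale connection would show whether the resolver configuration changed and whether lookups fail only afterwards.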
Just because it came up recently, another issue which has come up about Magic DNS in container environments is when ip6tables is missing: https://github.com/gitpod-io/gitpod/issues/8049
I didn't catch the previous updates in time, but if you have a github-action runner which recently completed and the ephemeral node hasn't been cleaned up I can look at what it says about errors in setting up DNS.
@paulpet my issue seems to be https://github.com/tailscale/github-action/issues/40 -- disabling manual device auth worked for me as a temporary workaround.
Fantastic @justin-pierce, this worked for me as well. I've spent a ridiculous amount of hours trying to figure this out.
@DentonGentry looks to be a tailscale backend issue which I hope can be resolved soon, so I am going to close this issue
I think I spoke too early, I can't seem to replicate the issue on stand-alone VMs, only on github runners utilizing the tailscale action, so will re-open until things are confirmed.
Thanks for reporting, this was a duplicate of #40, which is now resolved.
Sometime within the last few days (without any changes to the workflow or tailscale config) we've started seeing DNS lookup failures when our Ubuntu 20.04 runner is connected to tailscale. Other clients on tailscale continue to work as expected, and I have confirmed that the runner establishes a connection to tailscale. I was using the v1 version of the tailscale action, but tried the main version with the same results. Any thoughts on how to troubleshoot further?
EDIT: After further examination (see below) it appears all network functionality breaks during a workflow once the tailscale client connects.