actions / runner

The Runner for GitHub Actions :rocket:
https://github.com/features/actions
MIT License

Error messages when removing a self-hosted runner #971

Open wyphan opened 3 years ago

wyphan commented 3 years ago

Describe the bug

When removing a self-hosted runner, I get the following error messages:

ldd: ./bin/libSystem.Security.Cryptography.Native.OpenSsl.so: No such file or directory
ldd: ./bin/libSystem.IO.Compression.Native.so: No such file or directory
# Runner removal
√ Runner removed successfully
√ Removed .credentials
√ Removed .runner
An error occurred: Access denied. System:ServiceIdentity;DDDDDDDD-DDDD-DDDD-DDDD-DDDDDDDDDDDD needs View permissions to perform the action.

The removal seems to succeed though, as refreshing the page removes the self-hosted runner from the list.

To Reproduce

Steps to reproduce the behavior:

  1. Get the removal line from "Settings" tab of the repository, then "Actions" tab, then the three-dot menu next to the self-hosted runner name, and "Remove"
  2. Run the removal line ./config.sh remove --token XXX
  3. See error

Expected behavior

A clear and concise description of what you expected to happen.

Runner Version and Platform

Version of your runner? Sorry I forgot to check, but in both cases they were downloaded from the official download links as given out in "Add runner"

OS of the machine running the runner? Linux x86_64. This has happened twice: with Ubuntu Linux 20.04 LTS and CentOS 8.

TingluoHuang commented 3 years ago

@wyphan did you click the Force remove this runner button before running the command on the runner?

You should either click Force remove this runner or run ./config.sh remove to take the runner out of service, but not both.

wyphan commented 3 years ago

No, I didn't click "Force remove this runner".

TingluoHuang commented 3 years ago

@wyphan do you mind sharing a link to the repository or organization where you have this runner configured? Also the runner's name, if you still remember it?

wyphan commented 3 years ago

The two instances of the error message were for two different repositories:

amenocal commented 3 years ago

@wyphan @TingluoHuang I'm also experiencing the same error message.

TingluoHuang commented 3 years ago

Does anyone have the runner diag log available for me to check?

TingluoHuang commented 3 years ago

I think I know what happened. @wyphan @amenocal did you guys start the runner interactively instead of configuring it as a service?

When an interactive runner auto-upgrades to a newer version, it becomes partially detached from the terminal. STDIN is gone, but STDOUT/STDERR are still hooked to the terminal.

So, after the upgrade, the runner keeps running in the background (as if started with &) and its output still shows up in the terminal.

If you run ./config.sh remove to remove the runner without stopping the running one, you will see the An error occurred: Access denied error.

hross commented 3 years ago

There are two issues to fix with this:

hross commented 3 years ago

Note that this issue is benign and the runner was still removed.

wyphan commented 3 years ago

@TingluoHuang That is correct. When I was still using them, I would usually SSH into the machine, start GNU screen, start the runner interactively, and then detach from the GNU screen session.

Edit: typo

jeremyd2019 commented 2 years ago

I've started getting this error when an ephemeral runner on Windows finishes.

jeremyd2019 commented 2 years ago

2021-12-22 03:16:40Z: Job CLANGARM64 completed with result: Canceled
An error occurred: Access denied. System:ServiceIdentity;DDDDDDDD-DDDD-DDDD-DDDD-DDDDDDDDDDDD needs View permissions to perform the action.

Also of note is that it reports that the job result was Canceled, but the job was not canceled.

[2021-12-22 03:15:55Z INFO JobDispatcher] Successfully renew job request 554, job is valid till 12/22/2021 3:25:55 AM
[2021-12-22 03:16:35Z ERR  GitHubActionsService] GET request to https://pipelines.actions.githubusercontent.com/yCmu0F2oGfbA9DkO6Byr4wKOkszFHnzaBFmAngWq8HAMcu3T9a/_apis/distributedtask/pools/1/messages?sessionId=a2055832-e3bc-4d2c-9226-ea321e364000&lastMessageId=1 failed. HTTP Status: Forbidden, AFD Ref: Ref A: 08A8A04A5F8C4272A9D3805928073430 Ref B: ASHEDGE1213 Ref C: 2021-12-22T03:16:35Z
[2021-12-22 03:16:35Z ERR  MessageListener] Catch exception during get next message.
[2021-12-22 03:16:35Z ERR  MessageListener] GitHub.DistributedTask.WebApi.AccessDeniedException: Access denied. System:ServiceIdentity;DDDDDDDD-DDDD-DDDD-DDDD-DDDDDDDDDDDD needs View permissions to perform the action.
   at GitHub.Services.WebApi.VssHttpClientBase.HandleResponseAsync(HttpResponseMessage response, CancellationToken cancellationToken)
   at GitHub.Services.WebApi.VssHttpClientBase.SendAsync(HttpRequestMessage message, HttpCompletionOption completionOption, Object userState, CancellationToken cancellationToken)
   at GitHub.Services.WebApi.VssHttpClientBase.SendAsync[T](HttpRequestMessage message, Object userState, CancellationToken cancellationToken)
   at GitHub.Services.WebApi.VssHttpClientBase.SendAsync[T](HttpMethod method, IEnumerable`1 additionalHeaders, Guid locationId, Object routeValues, ApiResourceVersion version, HttpContent content, IEnumerable`1 queryParameters, Object userState, CancellationToken cancellationToken)
   at GitHub.Runner.Listener.MessageListener.GetNextMessageAsync(CancellationToken token)
[2021-12-22 03:16:35Z INFO MessageListener] Non-retriable exception: Access denied. System:ServiceIdentity;DDDDDDDD-DDDD-DDDD-DDDD-DDDDDDDDDDDD needs View permissions to perform the action.
[2021-12-22 03:16:35Z INFO JobDispatcher] Shutting down JobDispatcher. Make sure all WorkerDispatcher has finished.
[2021-12-22 03:16:35Z INFO JobDispatcher] Ensure WorkerDispather for job d4d6bd6c-c8bd-59d0-ecca-16324ffb3d87 run to finish, cancel any running job.
[2021-12-22 03:16:35Z INFO JobDispatcher] Send job cancellation message to worker for job d4d6bd6c-c8bd-59d0-ecca-16324ffb3d87.
[2021-12-22 03:16:35Z INFO ProcessChannel] Sending message of length 0, with hash 'e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855'
[2021-12-22 03:16:40Z INFO ProcessInvokerWrapper] Scan all processes to find relationship between all processes.
[2021-12-22 03:16:40Z INFO ProcessInvokerWrapper] Find all child processes of process '6652'.
[2021-12-22 03:16:40Z INFO ProcessInvokerWrapper] Need kill all child processes trees before kill process '6652'.
[2021-12-22 03:16:40Z INFO ProcessInvokerWrapper] Child process '6740' needs be killed first.
[2021-12-22 03:16:40Z INFO ProcessInvokerWrapper] Find all child processes of process '6740'.
[2021-12-22 03:16:40Z INFO ProcessInvokerWrapper] Kill process '6740'.
[2021-12-22 03:16:40Z INFO ProcessInvokerWrapper] Kill process '6652'.
[2021-12-22 03:16:40Z INFO ProcessInvokerWrapper] Finished process 6652 with exit code 100, and elapsed time 02:24:48.8044470.
[2021-12-22 03:16:40Z INFO JobDispatcher] finish job request for job d4d6bd6c-c8bd-59d0-ecca-16324ffb3d87 with result: Canceled
[2021-12-22 03:16:40Z INFO Terminal] WRITE LINE: 2021-12-22 03:16:40Z: Job CLANGARM64 completed with result: Canceled
[2021-12-22 03:16:40Z INFO JobDispatcher] Stop renew job request for job d4d6bd6c-c8bd-59d0-ecca-16324ffb3d87.
[2021-12-22 03:16:40Z INFO JobDispatcher] job renew has been canceled, stop renew job request 554.
[2021-12-22 03:16:40Z INFO JobNotification] Entering JobCompleted Notification
[2021-12-22 03:16:40Z INFO JobNotification] Entering EndMonitor
[2021-12-22 03:16:40Z INFO JobDispatcher] Fire signal for one time used runner.
[2021-12-22 03:16:40Z ERR  GitHubActionsService] DELETE request to https://pipelines.actions.githubusercontent.com/yCmu0F2oGfbA9DkO6Byr4wKOkszFHnzaBFmAngWq8HAMcu3T9a/_apis/distributedtask/pools/1/sessions/a2055832-e3bc-4d2c-9226-ea321e364000 failed. HTTP Status: Forbidden, AFD Ref: Ref A: 9082B93F2F8D401B97858A82DA5ACD1B Ref B: ASHEDGE1213 Ref C: 2021-12-22T03:16:40Z
[2021-12-22 03:16:40Z INFO Runner] Ignore any exception during DeleteSession for an ephemeral runner. GitHub.DistributedTask.WebApi.AccessDeniedException: Access denied. System:ServiceIdentity;DDDDDDDD-DDDD-DDDD-DDDD-DDDDDDDDDDDD needs View permissions to perform the action.
   at GitHub.Services.WebApi.VssHttpClientBase.HandleResponseAsync(HttpResponseMessage response, CancellationToken cancellationToken)
   at GitHub.Services.WebApi.VssHttpClientBase.SendAsync(HttpRequestMessage message, HttpCompletionOption completionOption, Object userState, CancellationToken cancellationToken)
   at GitHub.Services.WebApi.VssHttpClientBase.SendAsync(HttpMethod method, Guid locationId, Object routeValues, ApiResourceVersion version, HttpContent content, IEnumerable`1 queryParameters, Object userState, CancellationToken cancellationToken)
   at GitHub.DistributedTask.WebApi.TaskAgentHttpClientBase.DeleteAgentSessionAsync(Int32 poolId, Guid sessionId, Object userState, CancellationToken cancellationToken)
   at GitHub.Runner.Listener.MessageListener.DeleteSessionAsync()
   at GitHub.Runner.Listener.Runner.RunAsync(RunnerSettings settings, Boolean runOnce)
[2021-12-22 03:16:40Z ERR  Terminal] WRITE ERROR: An error occurred: Access denied. System:ServiceIdentity;DDDDDDDD-DDDD-DDDD-DDDD-DDDDDDDDDDDD needs View permissions to perform the action.
[2021-12-22 03:16:40Z ERR  Listener] GitHub.DistributedTask.WebApi.AccessDeniedException: Access denied. System:ServiceIdentity;DDDDDDDD-DDDD-DDDD-DDDD-DDDDDDDDDDDD needs View permissions to perform the action.
   at GitHub.Services.WebApi.VssHttpClientBase.HandleResponseAsync(HttpResponseMessage response, CancellationToken cancellationToken)
   at GitHub.Services.WebApi.VssHttpClientBase.SendAsync(HttpRequestMessage message, HttpCompletionOption completionOption, Object userState, CancellationToken cancellationToken)
   at GitHub.Services.WebApi.VssHttpClientBase.SendAsync[T](HttpRequestMessage message, Object userState, CancellationToken cancellationToken)
   at GitHub.Services.WebApi.VssHttpClientBase.SendAsync[T](HttpMethod method, IEnumerable`1 additionalHeaders, Guid locationId, Object routeValues, ApiResourceVersion version, HttpContent content, IEnumerable`1 queryParameters, Object userState, CancellationToken cancellationToken)
   at GitHub.Runner.Listener.MessageListener.GetNextMessageAsync(CancellationToken token)
   at GitHub.Runner.Listener.MessageListener.GetNextMessageAsync(CancellationToken token)
   at GitHub.Runner.Listener.Runner.RunAsync(RunnerSettings settings, Boolean runOnce)
   at GitHub.Runner.Listener.Runner.RunAsync(RunnerSettings settings, Boolean runOnce)
   at GitHub.Runner.Listener.Runner.RunAsync(RunnerSettings settings, Boolean runOnce)
   at GitHub.Runner.Listener.Runner.ExecuteCommand(CommandSettings command)
   at GitHub.Runner.Listener.Program.MainAsync(IHostContext context, String[] args)

The ephemeral runner is removed from the org, but the .runner and .credentials are still present on the runner itself, whereas before this started happening those were removed when the ephemeral runner shut down.
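
For anyone who wants to check their own logs for the same pattern, here is a minimal sketch that pulls out just the failed service requests (the GitHubActionsService ERR lines, as in the log above). The function name failed_requests is hypothetical, and the _diag path in the usage comment is an assumption about where your runner writes its logs; adjust it if yours differs.

```shell
#!/bin/sh
# Sketch only: failed_requests is a hypothetical helper name,
# not something shipped with the runner.

# Reads runner log lines on stdin and prints "<timestamp> <method> <status>"
# for each GitHubActionsService request that the log reports as failed.
# All other lines are ignored.
failed_requests() {
  sed -n 's/^\[\(.*Z\) ERR  GitHubActionsService\] \([A-Z]*\) request to .* failed\. HTTP Status: \([A-Za-z]*\).*$/\1 \2 \3/p'
}

# Example usage (the log location is an assumption):
#   cat _diag/Runner_*.log | failed_requests
```

In the log above this would surface the GET on the message queue and the DELETE on the session, both reported as Forbidden, without the stack traces in between.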

kartikv11 commented 2 years ago

Facing the same issue when clicking "Force remove" for a self-hosted runner.

jgutierrezglez commented 2 years ago

Is there any plan to fix this issue? We're hitting several of these errors daily because we rely on ephemeral self-hosted runners.

maartengryp-liantis commented 1 year ago

Any update on this issue? I'm facing it daily with enterprise-level ephemeral self-hosted runners (containerized). It is blocking us from implementing proper runner autoscaling.

Nuru commented 1 year ago

I am seeing runners not picking up jobs, staying idle, then exiting with this:

An error occurred: Access denied. System:ServiceIdentity;DDDDDDDD-DDDD-DDDD-DDDD-DDDDDDDDDDDD needs View permissions to perform the action.
Runner listener exit with retryable error, re-launch runner in 5 seconds.
Restarting runner...

√ Connected to GitHub

Failed to create a session. The runner registration has been deleted from the server, please re-configure.
Runner listener exit with terminated error, stop the service, no retry needed.
Exiting runner...
2023-04-02 04:27:45.294  NOTICE --- Runner init exited. Exiting this process with code 0 so that the container and the pod is GC'ed Kubernetes soon.

iv0rish commented 1 year ago

I am experiencing the same issue with actions-runner-controller on AWS EKS after trying to force remove the runner. All of my runner pods keep getting created and then terminate themselves within 2 minutes. Any updates on this, or a workaround to avoid terminating the runner?

pdeva commented 11 months ago

We are seeing this issue too.

matanbaruch commented 6 months ago

Any update on this?

joaoluiznaufel commented 5 months ago

+1

nickyfoster commented 5 months ago

This is still relevant.


[RUNNER 2024-04-09 19:46:59Z INFO Runner] Deleting Runner Session...
[RUNNER 2024-04-09 19:46:59Z ERR  GitHubActionsService] DELETE request to https://pipelinesghubeus2.actions.githubusercontent.com/7aXbNwB1hnEgXD7F3ryv46BCYhHdYXwKwdh/_apis/distributedtask/pools/1/sessions/1f3eea89-41ab-4dc4-afa1-c6hf438dh3685 failed. HTTP Status: Forbidden
[RUNNER 2024-04-09 19:46:59Z INFO Runner] Ignore any exception during DeleteSession for an ephemeral runner. GitHub.DistributedTask.WebApi.AccessDeniedException: Access denied. System:ServiceIdentity;DDDDDDDD-DDDD-DDDD-DDDD-DDDDDDDDDDDD needs View permissions to perform the action.
[RUNNER 2024-04-09 19:46:59Z INFO Runner]    at GitHub.Services.WebApi.VssHttpClientBase.HandleResponseAsync(HttpResponseMessage response, CancellationToken cancellationToken)
[RUNNER 2024-04-09 19:46:59Z INFO Runner]    at GitHub.Services.WebApi.VssHttpClientBase.SendAsync(HttpRequestMessage message, HttpCompletionOption completionOption, Object userState, CancellationToken cancellationToken)
[RUNNER 2024-04-09 19:46:59Z INFO Runner]    at GitHub.Services.WebApi.VssHttpClientBase.SendAsync(HttpMethod method, Guid locationId, Object routeValues, ApiResourceVersion version, HttpContent content, IEnumerable`1 queryParameters, Object userState, CancellationToken cancellationToken)
[RUNNER 2024-04-09 19:46:59Z INFO Runner]    at GitHub.DistributedTask.WebApi.TaskAgentHttpClientBase.DeleteAgentSessionAsync(Int32 poolId, Guid sessionId, Object userState, CancellationToken cancellationToken)
[RUNNER 2024-04-09 19:46:59Z INFO Runner]    at GitHub.Runner.Listener.MessageListener.DeleteSessionAsync()
[RUNNER 2024-04-09 19:46:59Z INFO Runner]    at GitHub.Runner.Listener.Runner.RunAsync(RunnerSettings settings, Boolean runOnce)
[RUNNER 2024-04-09 19:46:59Z INFO Listener] Runner execution been cancelled.

MmAtBosch commented 5 months ago

Why has this not been fixed for three years now?

Is it possible that you only kill the run.sh process, which does not affect the other two?

When I check the processes, I can see 3 runner processes.

actions-runner$ ps ax
    PID TTY      STAT   TIME COMMAND
      1 pts/0    Ss     0:00 bash
   1671 pts/0    S      0:00 /bin/bash ./run.sh
   1675 pts/0    S      0:00 /bin/bash /azp/actions-runner/run-helper.sh
   1679 pts/0    Sl     0:01 /azp/actions-runner/bin/Runner.Listener run
   1695 pts/0    R+     0:00 ps ax

Try 1 - killing only run.sh before ./config.sh remove

actions-runner$ kill 1671
actions-runner$ ps ax
    PID TTY      STAT   TIME COMMAND
      1 pts/0    Ss     0:00 bash
   1675 pts/0    S      0:00 /bin/bash /azp/actions-runner/run-helper.sh
   1679 pts/0    Sl     0:01 /azp/actions-runner/bin/Runner.Listener run
   1696 pts/0    R+     0:00 ps ax
[1]+  Terminated              ./run.sh
actions-runner$ ./config.sh remove --token AAAAADYADRIFYHCKPHAO7F3GC246I

# Runner removal

√ Runner removed successfully
√ Removed .credentials
√ Removed .runner

actions-runner$ An error occurred: Access denied. System:ServiceIdentity;DDDDDDDD-DDDD-DDDD-DDDD-DDDDDDDDDDDD needs View permissions to perform the action.
Runner listener exit with retryable error, re-launch runner in 5 seconds.

actions-runner$ ps ax
    PID TTY      STAT   TIME COMMAND
      1 pts/0    Ss     0:00 bash
   1752 pts/0    R+     0:00 ps ax
actions-runner$

Try 2 - killing two processes before ./config.sh remove

actions-runner$ ps ax
    PID TTY      STAT   TIME COMMAND
      1 pts/0    Ss     0:00 bash
   1811 pts/0    S      0:00 /bin/bash ./run.sh
   1815 pts/0    S      0:00 /bin/bash /azp/actions-runner/run-helper.sh
   1819 pts/0    Sl     0:01 /azp/actions-runner/bin/Runner.Listener run
   1835 pts/0    R+     0:00 ps ax
actions-runner$ kill 1811 1815
actions-runner$ ps ax
    PID TTY      STAT   TIME COMMAND
      1 pts/0    Ss     0:00 bash
   1819 pts/0    Sl     0:01 /azp/actions-runner/bin/Runner.Listener run
   1838 pts/0    R+     0:00 ps ax
actions-runner$ ./config.sh remove --token AAAAAD52TWEY7ILELITGOGLGC25IM

# Runner removal

√ Runner removed successfully
√ Removed .credentials
√ Removed .runner

actions-runner$ An error occurred: Access denied. System:ServiceIdentity;DDDDDDDD-DDDD-DDDD-DDDD-DDDDDDDDDDDD needs View permissions to perform the action.

Try 3 - killing all three processes before ./config.sh remove - works

actions-runner$ ps ax
    PID TTY      STAT   TIME COMMAND
      1 pts/0    Ss     0:00 bash
   1948 pts/0    S      0:00 /bin/bash ./run.sh
   1952 pts/0    S      0:00 /bin/bash /azp/actions-runner/run-helper.sh
   1956 pts/0    Sl     0:01 /azp/actions-runner/bin/Runner.Listener run
   1972 pts/0    R+     0:00 ps ax
actions-runner$ kill 1948 1952 1956
actions-runner$ ps ax
    PID TTY      STAT   TIME COMMAND
      1 pts/0    Ss     0:00 bash
   1976 pts/0    R+     0:00 ps ax
actions-runner$ ./config.sh remove --token AAAAAD2HSSZ3TDDA3YGSLGTGC25MY

# Runner removal

√ Runner removed successfully
√ Removed .credentials
√ Removed .runner

actions-runner$
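
The third attempt above can be sketched as a small script. This is only an illustration based on the three process names in the ps listing (run.sh, run-helper.sh, Runner.Listener); runner_pids and stop_then_remove are hypothetical helper names, the 5-second wait is an arbitrary assumption, and the pattern would also match unrelated processes whose command line happens to contain one of those names.

```shell
#!/bin/sh
# Sketch only: runner_pids and stop_then_remove are hypothetical helper names,
# not part of the runner distribution.

# Reads `ps ax`-style lines on stdin and prints the PID (first column) of
# every line that mentions one of the three runner processes.
# The escaped dots keep awk's own command line from matching the pattern.
runner_pids() {
  awk '/run\.sh|run-helper\.sh|Runner\.Listener/ { print $1 }'
}

# Stops all runner processes found, waits briefly (arbitrary delay),
# then deregisters the runner.
# $1 is the removal token shown on the repository's runner settings page.
stop_then_remove() {
  pids=$(ps ax | runner_pids)
  if [ -n "$pids" ]; then
    # Unquoted on purpose so each PID becomes a separate argument.
    kill $pids
    sleep 5
  fi
  ./config.sh remove --token "$1"
}
```

Run from the runner directory as stop_then_remove XXX, this mirrors killing all three processes by hand before calling ./config.sh remove.
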

Nuru commented 2 months ago

I'm still seeing this error, in this case when an idle runner is terminated because the Node it is on is being deleted as part of scaling down the Kubernetes cluster.

Logs look approximately like this (ANSI color codes, timestamps, and some other stuff removed):

NOTICE --- Executing actions-runner-controller's SIGTERM handler.
NOTICE --- Note that if this takes more time than terminationGracePeriodSeconds, the runner will be forcefully terminated by Kubernetes, which may result in the in-progress workflow job, if any, to fail.
NOTICE --- Ensuring dockerd is still running.
CONTAINER ID   IMAGE     COMMAND   CREATED   STATUS    PORTS     NAMES
/runner /
NOTICE --- Waiting for the runner to register first.
NOTICE --- Observed that the runner has been registered.
# Runner removal
An error occurred: Access denied. System:ServiceIdentity;DDDDDDDD-DDDD-DDDD-DDDD-DDDDDDDDDDDD needs View permissions to perform the action.
Runner listener exit with retryable error, re-launch runner in 5 seconds.
Does not exist. Skipping Removing runner from the server
√ Removed .credentials
√ Removed .runner
/
NOTICE --- The actions runner process exited.
NOTICE --- Holding on until runner init (pid 9) exits, so that there will hopefully be no zombie processes remaining.
Restarting runner...
An error occurred: Not configured. Run config.(sh/cmd) to configure the runner.
Runner listener exit with terminated error, stop the service, no retry needed.
Exiting runner...
NOTICE --- Graceful stop completed.
NOTICE --- Runner init exited. Exiting this process with code 0 so that the container and the pod is GC'ed Kubernetes soon.