Closed kierans closed 4 months ago
Hi @kierans
I need more information to understand what is happening here.
Things that would help:
@peter-evans Thanks for looking at this.
This issue is in the context of a corporate project. I’ve updated the description with more info about the workflow job. I’ve uploaded logs of the run. I’ve had to redact some of the logs, but I think there’s enough there for you to see what’s happening in the run. The GHA run is not public, so I can’t give you the link.
Everything looks fine, so my guess is that this problem is caused by some issue with running on self-hosted. I think if you tried this on hosted (github.com) it would work. That could be a way you can narrow down the problem.
This was working with v5. Dependabot upgraded the action to v6. I’ve reverted back to v5.
Looking at a comparison between v5 and v6 I’m wondering if the change to the way git fetch is invoked might be the cause? Maybe it’s a git version issue?
This was working with v5. Dependabot upgraded the action to v6. I’ve reverted back to v5.
Can you confirm that it works now after reverting back. If that's the case I can try to narrow this down.
I’m wondering if the change to the way git fetch is invoked might be the cause?
Looking at the log, though, it's already past that point and created the PR branch successfully. The point where it fails is when it's calling the GitHub API to create the PR.
I think it's this change: https://github.com/peter-evans/create-pull-request/compare/main...infer-urls#diff-149194b9a050dce062d6db2902041d8d9b9c28245428d7435ed58a87fc25b3dc
I added a feature to infer the URLs of repositories so that PRs could be made across different GitHub platforms.
Are you using GHES or GHEC? It looks like the repo has a https://github.com/$repo URL, so perhaps not. Can you confirm your situation. The repo is hosted normally on github.com and you are using self-hosted runners?
This is the logic I added, which I suspect may not be working for your use case.
if (githubServerHostname !== 'github.com') {
  options.baseUrl = `https://${githubServerHostname}/api/v3`
} else {
  options.baseUrl = 'https://api.github.com'
}
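For readers following along, here is a minimal standalone sketch of what that branch computes. The function name `inferApiBaseUrl` and the hostname `ghes.example.com` are made up for illustration and are not taken from the action's source.

```typescript
// Hypothetical helper, not the action's actual code: derive the REST API
// base URL from the hostname of the GitHub server.
function inferApiBaseUrl(githubServerHostname: string): string {
  // github.com serves its REST API from a dedicated host.
  if (githubServerHostname === 'github.com') {
    return 'https://api.github.com'
  }
  // Any other hostname is treated as an enterprise server with the API under /api/v3.
  return `https://${githubServerHostname}/api/v3`
}

console.log(inferApiBaseUrl('github.com')) // https://api.github.com
console.log(inferApiBaseUrl('ghes.example.com')) // https://ghes.example.com/api/v3
```

For a repository hosted on github.com this always yields https://api.github.com, whether the runner is hosted or self-hosted.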
Can you confirm that it works now after reverting back. If that's the case I can try to narrow this down.
It is.
Are you using GHES or GHEC? It looks like the repo has a https://github.com/$repo URL, so perhaps not. Can you confirm your situation. The repo is hosted normally on github.com and you are using self-hosted runners?
Correct. The repo URL is https://github.com/$ORG/$REPO, and the organisation is using self-hosted runners.
Correct. The repo URL is https://github.com/$ORG/$REPO, and the organisation is using self-hosted runners.
I think it's unlikely that the "infer-urls" feature caused the issue then. 🤔 Just to confirm, it would be great if you could execute this on your self-hosted runner and let me know what the output is.
- run: echo $GITHUB_API_URL
> Run echo $GITHUB_API_URL
https://api.github.com/
This is what I would expect, so I don't think the "infer-urls" feature is the problem.
Perhaps this is proxy related. From the log it looks like you are behind a proxy for https requests. Please try setting the proxy explicitly like this: https://github.com/peter-evans/create-pull-request?tab=readme-ov-file#proxy-support
It's possible that a library I was using has updated and somehow no longer automatically captures your proxy settings.
@peter-evans I tried setting the https_proxy. Unless I’m doing something wrong, the proxy isn’t the issue.
Run peter-evans/create-pull-request@v6
  with:
    base: main
    ...
  env:
    httpProxy: http://172.17.0.1:3128
    https_proxy: http://172.17.0.1:3128
Create or update the pull request
  Attempting creation of pull request
  Error: fetch failed
The workflow you posted starts with jobs:. Please can you show me the full workflow. Are you setting any environment variables at the top?
Also, please could you confirm that when you test v6 there are no other changes to the workflow, you are literally just changing the version of the action.
I'm leaning towards this being a proxy handling connection issue due to some changes to a library I'm using, but I'll try to figure this out! 😄
Not sure if this will fix it, but it's worth a shot. Please try this version of the action.
uses: peter-evans/create-pull-request@bump-proxy-agent
The workflow you posted starts with jobs:. Please can you show me the full workflow. Are you setting any environment variables at the top?
I am not setting anything extra in the workflow.
on:
  push:
    branches:
      - main
  pull_request:
    branches:
      - '**'
jobs:
  documentation:
    ...
The runner is setting the $http_proxy and $https_proxy env vars in the shell that the run is being run in.
As you can see from the logs the action is being run with https_proxy and httpProxy being set to a value.
I’ve added the following step that creates the httpProxy var from the $https_proxy shell var to pass to the action ie:
- name: Set up job
  run: |
    echo "httpProxy=${https_proxy}" >> $GITHUB_ENV
- uses: peter-evans/create-pull-request@bump-proxy-agent
  ...
  env:
    https_proxy: ${{ env.httpProxy }}
Also, please could you confirm that when you test v6 there are no other changes to the workflow, you are literally just changing the version of the action.
That is correct.
I'm leaning towards this being a proxy handling connection issue due to some changes to a library I'm using, but I'll try to figure this out! 😄
I’ve tried the bump-proxy-agent version and I’m still getting “fetch failed”.
Perhaps it’s worth trying to get more information from the error in the console to continue to aid the debugging efforts.
Yes, I think we need to try and narrow this down and make a minimal example that reproduces the problem. This must be a rare edge case, because no other users have reported this (so far). This action is used a lot, so I would expect most issues to affect many users.
I'll have a think about how we can narrow this down.
Could it be a trust issue? The proxy will be issuing certs signed by the corporate CA. However, if the proxy/http library you’re using to make the call to create the PR doesn’t trust the proxy, that’s why the fetch could be failing. I’ve had this issue with other tools and have had to setup a trust store.
We might want to provide an option in the action to pass a trust store (PEM file) to the relevant lib to test this.
If it helps to track this down, I'm actually testing proxy support for the action in my test suite here: https://github.com/peter-evans/create-pull-request-tests/blob/master/.github/workflows/test-command.yml#L1087-L1134
The test uses this image: https://github.com/peter-evans/forward-proxy
Which is based on: https://github.com/nadoo/glider
It seems to work fine. I don't really know much about proxies and trust stores so I'm not sure if that could be the issue.
Hi @peter-evans any update on this? I started facing this issue when I updated to v6. I was using v5, but as reported in https://github.com/peter-evans/create-pull-request/issues/2790 , the PR creation step was failing.
Hi @SatwaniGovind
No, there is no update because I'm unable to reproduce this, and don't have a good idea of what the problem could be.
Perhaps you could help by explaining your use case in detail.
Create or update the pull request
  Attempting creation of pull request
  Error: fetch failed
Am having the same issue with self-hosted runners. Proxy is set in the runner itself
name: create-pr
on:
  workflow_dispatch:
  push:
    branches:
      - 'main'
permissions:
  contents: write
  pull-requests: write
jobs:
  updates:
    timeout-minutes: 10
    runs-on: [selfhosted-runner]
    steps:
      - name: Checkout
        uses: actions/checkout@v4.1.1
      - name:
        run: echo "something" > README.md
      - name: Create PR
        uses: peter-evans/create-pull-request@v6.0.1
        with:
          token: ${{ secrets.GITHUB_TOKEN }}
          commit-message: "chore(deps): update readme"
          branch: feat/update-readme
          title: "chore(deps): update readme"
          labels: dependencies
          delete-branch: true
So, if a non-self-hosted runner is used, will the wf run successfully? @raphperrin
@SatwaniGovind yes or rollback to v5
To give us more time to figure this out, I've backported the fix for this issue to v5. So feel free to use v5 if it works for you as a workaround for this issue.
It looks like all of you are using self-hosted. So I think it's likely that this is affecting some particular setup for self-hosted instances.
Please could you give me information about the version of the Actions runner you are using. Also, if you could provide me with a Dockerfile for your self-hosted instance, that might be even better.
I'm still thinking that this could be proxy related. So any further details about that setup/configuration on the runner would also be great.
I've just tested my own self-hosted runner here: https://github.com/peter-evans/create-pull-request-tests-self-hosted
It works fine with v6. See run: https://github.com/peter-evans/create-pull-request-tests-self-hosted/actions/runs/8360055456
So this issue is not affecting all self-hosted setups, just some specific configurations it seems. Please share further details so we can narrow this down.
Thanks @peter-evans for your efforts here. I’ll try to share what I can, but I think most people who use self-hosted runners are corporates so it’s hard to share material. Chicken meet egg.
Thanks @peter-evans for the fix in v5.0.3. I just noticed one thing when I switched to v5.0.3: if a PR already exists from the same branch and the wf tries to update the same PR, the wf doesn't fail (like it did in v5), but it doesn't reflect the new changes either.
I've not made any other changes to v5 apart from backporting the fix, so any issue you may have is completely unrelated. Let's keep the discussion here on topic.
I also have this problem in one of my runner environments. I have bisected the fault back to https://github.com/peter-evans/create-pull-request/commit/21d8ea09d56b55e7381af60c0427786e5b3948df. My other working environment runs within a Kubernetes cluster, but interacts with the same GitHub Enterprise server. In the working environment it is not required to use a proxy and my GHES URL is included in noProxy, whereas in my malfunctioning environment a proxy must be used to reach GHES. http_proxy, https_proxy, HTTP_PROXY, HTTPS_PROXY, no_proxy and NO_PROXY are correctly set. Using the node16 based action works, while running the commit introducing the node20 runtime triggers the failure.
Edit 1: I have captured all outgoing network traffic and for the final "create pull request" api call the action appears to not be respecting my proxy configuration but instead attempts to directly connect to my GHES which is not possible in this environment. All other interactions with my GHES before the failing api request are done using my HTTP proxy.
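To make the expected behaviour concrete, here is a simplified sketch of how an HTTP client typically picks a proxy from these environment variables. It is a hypothetical model written for illustration, not the code of the action or of any particular library; the name `proxyForUrl` and the host `ghes.internal.example` are assumptions, and real implementations handle more cases (ports, the `*` wildcard, `all_proxy`).

```typescript
// Simplified, hypothetical model of environment-based proxy selection.
type ProxyEnv = Record<string, string | undefined>

function proxyForUrl(url: string, env: ProxyEnv): string | undefined {
  const {protocol, hostname} = new URL(url)
  const scheme = protocol.replace(':', '') // "https:" -> "https"

  // Hosts listed in no_proxy / NO_PROXY are reached directly.
  const noProxy = (env['no_proxy'] ?? env['NO_PROXY'] ?? '')
    .split(',')
    .map(entry => entry.trim().toLowerCase())
    .filter(entry => entry.length > 0)
  const bypass = noProxy.some(entry => {
    const suffix = entry.replace(/^\*?\./, '') // ".example.com" -> "example.com"
    return hostname === suffix || hostname.endsWith(`.${suffix}`)
  })
  if (bypass) return undefined

  // Otherwise use <scheme>_proxy, checking lower case before upper case.
  return env[`${scheme}_proxy`] ?? env[`${scheme.toUpperCase()}_PROXY`]
}

const exampleEnv: ProxyEnv = {
  https_proxy: 'http://172.17.0.1:3128',
  no_proxy: 'ghes.internal.example'
}
console.log(proxyForUrl('https://api.github.com/repos', exampleEnv)) // http://172.17.0.1:3128
console.log(proxyForUrl('https://ghes.internal.example/api/v3', exampleEnv)) // undefined
```

A lookup like this only helps if the HTTP client actually routes the request through the returned proxy; the traffic capture above suggests that last step is what is missing for the "create pull request" call.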
If it helps to track this down, I'm actually testing proxy support for the action in my test suite here: https://github.com/peter-evans/create-pull-request-tests/blob/master/.github/workflows/test-command.yml#L1087-L1134
The test uses this image: https://github.com/peter-evans/forward-proxy
Which is based on: https://github.com/nadoo/glider
It seems to work fine. I don't really know much about proxies and trust stores so I'm not sure if that could be the issue.
Does this actually test whether all requests are made using the proxy? I think you would need to block all direct connections to github.com for this test to catch the issue I am observing. I can also reproduce this without a GHES but instead with a self-hosted runner and the normal github.com instance.
I also noticed the same issue on self-hosted runners (behind a proxy).
Maybe this case applies here:
Yes, in v8 of @octokit/request we switched to the Fetch API instead of node-fetch.
All usages of NodeJS based http(s).Agent will not work as they are incompatible with the fetch API.
Octokit changed the behaviour of proxies as detailed in https://github.com/octokit/rest.js/issues/43. Switching to undici should work (src/octokit-client.ts):
import {ProxyAgent, fetch as undiciFetch} from 'undici'

const proxyFetch = (proxyUrl: string): typeof undiciFetch =>
  (url, opts) => {
    return undiciFetch(url, {
      ...opts,
      dispatcher: new ProxyAgent({
        uri: proxyUrl,
        keepAliveTimeout: 10,
        keepAliveMaxTimeout: 10,
      })
    })
  }

export const getOctokit = (options: OctokitOptions & {baseUrl: string}) => {
  const baseUrl = options.baseUrl
  const proxyUrl = getProxyForUrl(baseUrl)
  const OctokitWithPlugins = Core.plugin(paginateRest, restEndpointMethods)
  const allOptions: OctokitOptions = {
    ...options,
    baseUrl,
    request: proxyUrl ? {fetch: proxyFetch(proxyUrl)} : undefined
  }
  return new OctokitWithPlugins(allOptions)
}
I tried to set up a proxy with mitmproxy so that I could test this properly, but I did not manage to get it working.
@sdolender @JannikWibkerQC Thank you for pointing out this change in octokit. This is most likely the cause of the problem.
I've made a feature branch to test this. It seems to pass the tests I have for proxy support. https://github.com/peter-evans/create-pull-request/compare/main...proxy-fix
Please test this version of the action and let me know if it solves the issue:
uses: peter-evans/create-pull-request@proxy-fix
The proxy-fix version resolves the issue for our behind-a-proxy runner 🎉
The fix also works for me
Thank you for testing!
The fix is released as v6.0.5 / v6.
I can also reproduce this without a GHES but instead with a self-hosted runner and the normal github.com instance.
@0xbe7a Please can you show me how you are reproducing the issue. I would like to correct my regression tests so that I can catch similar issues in future.
I tried to block direct connections, as you suggested, but it still seems to pass the test for v6.0.4. https://github.com/peter-evans/create-pull-request-tests/commit/66679f5393dbf6a22d4612afcd0f4c7d3867b9f1
I have not attempted to reproduce this with a GitHub hosted runner. Our self-hosted runner only has access to our http_proxy and all other outgoing permissions are blocked.
Looking at the test setup, I am not sure the iptables rule is sufficient here. As I understand it, iptables resolves the domain name only once, at rule-creation time. From the manpage:
Hostnames will be resolved once only, before the rule is submitted to the kernel. Please note that specifying any name to be resolved with a remote query such as DNS is a really bad idea.
As github.com and api.github.com already resolve to different IPs, this is most likely not going to catch all connections.
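A toy model of that behaviour, purely for illustration: the class name `SnapshotBlockRule` is invented, and the addresses are made-up values from the documentation ranges standing in for real DNS answers. The rule remembers only the addresses it was given when it was created, so traffic to a different host's address is not blocked.

```typescript
// Hypothetical model: a block rule stores the addresses a hostname resolved to
// when the rule was created and never looks the name up again.
class SnapshotBlockRule {
  private readonly blocked: Set<string>

  constructor(addressesAtCreation: string[]) {
    this.blocked = new Set(addressesAtCreation)
  }

  blocks(address: string): boolean {
    return this.blocked.has(address)
  }
}

// Made-up addresses standing in for what each name resolves to.
const githubComAddress = '192.0.2.10'
const apiGithubComAddress = '198.51.100.20'

// The rule is created from github.com only.
const rule = new SnapshotBlockRule([githubComAddress])
console.log(rule.blocks(githubComAddress)) // true
console.log(rule.blocks(apiGithubComAddress)) // false
```

Under this model, a request that goes straight to the API host slips past a rule written for github.com, so the test would still pass even if the proxy were ignored.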
I've fixed my proxy support test now. https://github.com/peter-evans/create-pull-request-tests/blob/32f87b10665d76f09808a07b2c4623a678b6bf89/.github/workflows/test-command.yml#L1087-L1118
This test now fails with v6.0.4, reproducing the fetch failed error. The latest version, v6.0.5, passes the test.
Apologies for breaking proxy support. This fix of the test should prevent it from happening again.
That's pretty neat, thank you for your great effort here ❤️
@peter-evans Can confirm the fix works for me too. Thanks for your efforts here. 👍
Subject of the issue
My attempt to create a PR is failing with the error message. My use case is when the generated Swagger doc changes for my project, a PR is opened to merge the changes.
This was working, but has recently stopped. The branch is created and pushed successfully. It’s just the PR step that is failing.
Steps to reproduce