palantir / policy-bot

A GitHub App that enforces approval policies on pull requests
Apache License 2.0

Error loading policy from <org>/.github #405

Closed willholley closed 2 years ago

willholley commented 2 years ago

After updating policy-bot from 1.23.2 to 1.24.0 I started seeing failing checks in repositories that previously ignored policy-bot.

For context, our deployment connects to GitHub Enterprise and is enabled on all repositories in our organization. However, only one repository actually has a .policy.yml file. There is no default / shared policy file (no .github repository in the organization) and so policy-bot was previously just ignored.

What I see after upgrading is a failing check for all repositories in the organization that do not define a .policy.yml file and an error:

Error loading policy from cloudant/.github

For example: [screenshot: policy-bot-check]

My expectation is that the behaviour would be as described in the README:

If a policy does not exist in the repository or in the shared organization repository, policy-bot does not post a status check on the pull request. This means it is safe to enable policy-bot on all repositories in an organization.

However, looking through the code at https://github.com/palantir/policy-bot/blob/develop/server/server.go#L139 and https://github.com/palantir/policy-bot/blob/develop/server/handler/base.go#L83-L88, it seems like it will always attempt to load a shared policy from the default location, contradicting the documentation?

bluekeyes commented 2 years ago

Sorry you're hitting this error. If you click on the "Details" link, it should show an expanded error with more context about exactly what failed. If you can share that, it should help explain what is happening. It would also help to know which version of GitHub Enterprise you are using.

As far as I know, the default policy functionality works as described in our environment, so your environment might be exposing a case we did not account for.

willholley commented 2 years ago

The error under "Details" doesn't appear to be very helpful:

Error

Invalid policy at :

I don't see anything useful in the policy-bot logs, but perhaps enabling debug logging will yield something.

willholley commented 2 years ago

Actually, looking further back in the logs, I see:

failed to get default repository: Get "https://<GHE URL>/repos/cloudant/.github": net/http: request canceled (Client.Timeout exceeded while awaiting headers)
github.com/palantir/go-githubapp/appconfig.(*Loader).loadDefaultConfig
    /policy-bot/vendor/github.com/palantir/go-githubapp/appconfig/appconfig.go:225
github.com/palantir/go-githubapp/appconfig.(*Loader).LoadConfig
    /policy-bot/vendor/github.com/palantir/go-githubapp/appconfig/appconfig.go:155
github.com/palantir/policy-bot/server/handler.(*ConfigFetcher).ConfigForPR
    /policy-bot/server/handler/fetcher.go:43
github.com/palantir/policy-bot/server/handler.(*Base).Evaluate
    /policy-bot/server/handler/base.go:175
github.com/palantir/policy-bot/server/handler.(*PullRequestReview).Handle
    /policy-bot/server/handler/pull_request_review.go:45
github.com/palantir/go-githubapp/githubapp.Dispatch.Execute
    /policy-bot/vendor/github.com/palantir/go-githubapp/githubapp/scheduler.go:56
github.com/palantir/go-githubapp/githubapp.(*scheduler).safeExecute
    /policy-bot/vendor/github.com/palantir/go-githubapp/githubapp/scheduler.go:183
github.com/palantir/go-githubapp/githubapp.QueueAsyncScheduler.func1
    /policy-bot/vendor/github.com/palantir/go-githubapp/githubapp/scheduler.go:257
runtime.goexit
    /usr/lib/golang/src/runtime/asm_amd64.s:1371

willholley commented 2 years ago

Having attempted to reproduce this, I think it was just a transient problem with the GitHub API timing out. It's unfortunate that, when this occurs, it results in failed policy-bot checks in repositories that previously ignored it, but this seems like the safest failure mode.

NargiT commented 1 year ago

The default behavior is problematic when the repository does not have any policy file. We have policy-bot installed on our organization, but not every repository uses it. When a request to GitHub times out, it generates a failed status check. Is it possible to reopen this issue and find a fix to avoid blocking people's PRs?

Is this normal behavior for repositories without any policy files?

bluekeyes commented 1 year ago

@NargiT I've created #506 to describe the issue where Policy Bot posts unexpected statuses due to GitHub errors. Unless the status is a required check, it should not block people's PRs, but I understand that seeing a failed status can be confusing and could cause problems for other tools that read statuses.