Reduce GitHub API calls (avoid "GitHub API rate limit exceeded")

ikedam commented 2 weeks ago

Introduction

The GitHub API rate limit is a critical issue, even though there are guides for avoiding it (details later). Once CI/CD hits the rate limit, development cannot continue until the limit resets, which can take up to 1 hour.

I want to resolve this by reducing the number of GitHub API calls. tflint --init calls the following two APIs:

Get the release metadata via "Get a release by tag name"
Download the release assets via "Get a release asset"
- This downloads the checksum file and the plugin zip file.

Proposal

Introduce two new features:

Cache the release metadata.
- It's text data, so it's easy to keep in the code repository.
Download release assets using the browser_download_url from the release metadata.
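
As a rough illustration of the two features above (not the code from the PR), the sketch below reads release metadata that has been saved alongside the code and looks up each asset's browser_download_url, so no API call is needed to locate the files. The JSON is a trimmed, made-up sample that follows the shape of GitHub's release response (tag_name, assets[].name, assets[].browser_download_url). The version, asset names, example.invalid URLs, and the asset_urls helper are all assumptions for illustration.

```python
import json

# Trimmed, hypothetical release metadata, as it might be cached in the repository.
# Only the fields used below are kept; the example.invalid URLs are placeholders.
CACHED_RELEASE = """
{
  "tag_name": "v0.0.0",
  "assets": [
    {"name": "checksums.txt",
     "browser_download_url": "https://example.invalid/download/v0.0.0/checksums.txt"},
    {"name": "tflint-ruleset-example_linux_amd64.zip",
     "browser_download_url": "https://example.invalid/download/v0.0.0/tflint-ruleset-example_linux_amd64.zip"}
  ]
}
"""


def asset_urls(release_json: str, wanted: list[str]) -> dict[str, str]:
    """Map each wanted asset name to its browser_download_url (hypothetical helper)."""
    release = json.loads(release_json)
    by_name = {a["name"]: a["browser_download_url"] for a in release["assets"]}
    missing = [name for name in wanted if name not in by_name]
    if missing:
        raise KeyError(f"assets not found in cached metadata: {missing}")
    return {name: by_name[name] for name in wanted}


urls = asset_urls(
    CACHED_RELEASE,
    ["checksums.txt", "tflint-ruleset-example_linux_amd64.zip"],
)
for name, url in urls.items():
    print(name, "->", url)
```

With the URLs resolved from the cached file, the only remaining network traffic would be the two downloads themselves.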

References

Nothing

Notes about the GitHub API rate limit issue

When a CI/CD environment hits the rate limit, it cannot work until the limit resets, which can take up to 1 hour. This may affect other workloads sharing the environment. That is too risky just for linting, and it could be a reason to hesitate to introduce tflint.
Authenticating to the GitHub API is not such an easy solution. It requires a GitHub user for the CI/CD environment and secret management. Managing a user of a public service is often troublesome in enterprise development for governance reasons.
Caching plugin directories in CI/CD has challenges:
- Caching usually works as expected only when the run succeeds. Successive build failures before the cache is populated may still hit the rate limit.
- The documentation explains how to use the cache in GitHub Actions. Developers using a CI/CD system other than GitHub Actions (e.g. GitLab CI) have to work out how to cache in their own environments. That makes introducing tflint too costly for those developers.

bendrucker commented 2 weeks ago

You probably should have waited to discuss this before putting so much work into a PR that is hundreds of lines long. This is precisely the cost of caching features (code complexity).

Before we even look at code let's focus on significantly sharpening the case for this feature's existence. How many API requests do your runs make now? How many would they make under your proposed change?

Is this only useful when you run CI in a non-isolated environment, where runs share persistent disk?

TFLint is free to use and not corporate sponsored so I think we're generally ok with the idea that plugins are visibly hosted by GitHub and so their auth requirements apply if you want a reliable CI pipeline.

wata727 commented 2 weeks ago

I'm concerned that introducing a new mechanism just to reduce API calls will unnecessarily complicate configuration options. Basically, I would like to encourage you to either create a GitHub token or cache your plugin directory.

ikedam commented 2 weeks ago

@bendrucker @wata727 Though I know the workarounds with a GitHub token and plugin caching, it's difficult to apply them to every project using Terraform. I want another option for avoiding the rate limit (as described in the notes on the GitHub API rate limit issue above).

I have a project with about 10 Terraform repositories sharing a CI environment (with a fixed IP address). I want to lint them with the aws plugin and the google plugin. I believe this results in 6 API calls for each CI run. It often hit the rate limit before I introduced the caching workaround. I also plan to apply tflint to another project, which has about 5 Terraform repositories but uses another CI system. It costs too much to have to set up a GitHub token or caching just to introduce a lint tool. I want a lint tool that is easier to introduce.
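
To make those numbers concrete, here is a back-of-the-envelope sketch. It assumes the 6 calls break down as 2 plugins × (1 release lookup + 2 asset downloads), following the list of APIs in the introduction, and it assumes GitHub's documented limit of 60 unauthenticated requests per hour per IP address. Both are assumptions, not measurements.

```python
# Assumption: GitHub's documented limit for unauthenticated requests per IP.
UNAUTHENTICATED_LIMIT_PER_HOUR = 60

PLUGINS = 2               # aws and google
CALLS_PER_PLUGIN = 1 + 2  # release metadata + checksum and zip assets (assumed breakdown)
REPOSITORIES = 10         # "about 10" repositories sharing one IP address

calls_per_run = PLUGINS * CALLS_PER_PLUGIN
runs_per_hour = UNAUTHENTICATED_LIMIT_PER_HOUR // calls_per_run
runs_per_repo_per_hour = runs_per_hour / REPOSITORIES

print(calls_per_run)           # 6
print(runs_per_hour)           # 10
print(runs_per_repo_per_hour)  # 1.0
```

Under these assumptions, a single uncached run per repository within the same hour is enough to use up the whole budget.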

Is this only useful when you run CI in a non-isolated environment, where runs share persistent disk?

That's right. I believe tflint is expected to be used in CI, that projects/repositories often share CI environments (especially the IP address, in enterprise environments), and that CI machines start from a clean environment for each run because of auto-scaling (that means they don't have persistent disks). This is only useful for specific cases, but I believe those cases are common.

Most of the complexity comes from preserving backward compatibility and from the fallback behavior. The code could be simpler if this behavior were made the default.

bendrucker commented 2 weeks ago

It costs too much to have to set up a GitHub token

Well yeah, that's my point above. The convenience of a public registry that accepts anonymous requests is obvious.

I am not convinced that this is so burdensome. It needs no permissions, it just needs to be a valid identity on GitHub. Give it to every CI job by default?

Clearly organized details about your actual systems can be helpful in making your case. The marketing pitch isn't, as free software doesn't owe you convenience. A PR is a nice gesture but contributes a small fraction of the total cost of this change.

that means they don't have persistent disks

This is not what I mean by persistent disk in the context of CI. Persistence is with respect to the jobs. Whether the disk is actually permanent block storage or ephemeral instance storage is immaterial. It creates new surface area when you can have multiple concurrently executing jobs possibly writing to the same location on disk.

Most of the complexity...

It's a 700 line PR. We're concerned about the number of details to get right, the review iterations, the likelihood of bugs, and the long term maintenance cost. Trying to convince us it's pretty simple does the reverse.

wata727 commented 2 weeks ago

Considering your requirements, it seems like a good idea to store the plugin binaries and install them manually, rather than caching the result of tflint --init. Another option would be to support registries other than GitHub releases, as mentioned in https://github.com/terraform-linters/tflint/issues/1202#issuecomment-1377441156. Personally, I would rather spend my time on this than on introducing new options.
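
A minimal sketch of the "store plugin binaries and install them manually" approach, assuming a plugin binary is already vendored in the repository. The vendored path, the install_plugin helper, and the placeholder file are hypothetical. The ~/.tflint.d/plugins directory and the tflint-ruleset-<name> file name are assumed from TFLint's plugin documentation and should be checked against the version in use.

```python
import shutil
import stat
import tempfile
from pathlib import Path


def install_plugin(vendored_binary: Path, plugin_dir: Path) -> Path:
    """Copy a vendored plugin binary into plugin_dir and mark it executable."""
    plugin_dir.mkdir(parents=True, exist_ok=True)
    target = plugin_dir / vendored_binary.name
    shutil.copy2(vendored_binary, target)
    mode = target.stat().st_mode
    target.chmod(mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
    return target


# Demo inside a temporary directory so nothing real is touched.
# In CI, plugin_dir would be Path.home() / ".tflint.d" / "plugins" (assumed default).
with tempfile.TemporaryDirectory() as tmp:
    vendored = Path(tmp) / "vendor" / "tflint-ruleset-example"  # hypothetical file
    vendored.parent.mkdir()
    vendored.write_bytes(b"placeholder, not a real plugin")

    installed = install_plugin(vendored, Path(tmp) / ".tflint.d" / "plugins")
    installed_name = installed.name
    installed_exists = installed.exists()
    print(installed_name, installed_exists)
```

Because the binary comes from the repository checkout, this step needs no network access and makes no GitHub API calls.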

ikedam commented 1 week ago

Sorry for my late response.

OK. I understand that this idea is not acceptable from the standpoint of maintaining tflint over the long term. I can't make the changeset for this idea any simpler.

I'll consider another option, like storing binaries in repositories as proposed.

It would be really nice if #1201 were implemented and the official plugins were also hosted somewhere else, or if I could host them somewhere in the private network.

free software doesn't owe you convenience

I agree with that.
