Closed: ikedam closed this 1 week ago
You probably should have waited to discuss this before putting so much work into a PR that is hundreds of lines long. This is precisely the cost of caching features (code complexity).
Before we even look at code, let's focus on significantly sharpening the case for this feature's existence. How many API requests do your runs make now, and how many would they make under your proposed change?
Is this only useful when you run CI in a non-isolated environment, where runs share persistent disk?
TFLint is free to use and not corporately sponsored, so I think we're generally OK with the idea that plugins are visibly hosted by GitHub, and that GitHub's auth requirements therefore apply if you want a reliable CI pipeline.
I'm concerned that introducing a new mechanism just to reduce API calls will unnecessarily complicate configuration options. Basically, I would like to encourage you to either create a GitHub token or cache your plugin directory.
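For concreteness, here is a minimal sketch of both workarounds as CI shell steps. This is not from the thread: it assumes tflint's documented GITHUB_TOKEN and TFLINT_PLUGIN_DIR environment variables and the default ~/.tflint.d/plugins plugin directory, and CI_CACHE_DIR is a hypothetical path your CI restores between runs.

```sh
# Workaround 1: authenticate the GitHub API calls made by tflint --init.
# The token needs no scopes; it only has to be a valid GitHub identity.
export GITHUB_TOKEN="${TFLINT_GITHUB_TOKEN_SECRET}"  # hypothetical CI secret
tflint --init

# Workaround 2: keep plugins in a directory your CI caches between runs.
export TFLINT_PLUGIN_DIR="${CI_CACHE_DIR}/tflint-plugins"  # hypothetical cache path
tflint --init  # downloads only the plugins missing from the cache
```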
@bendrucker @wata727 I know the workarounds with a GitHub token and cached plugins, but it's difficult to apply them to every project using Terraform. I want another option for avoiding the rate limit (as described in "Notes about the GitHub API rate limit issue").
I have a project with about 10 Terraform repositories sharing CI environments (with a fixed IP address). I want to lint them with the aws and google plugins, which I believe results in 6 API calls for each CI run. If all ten repositories build within the same hour from that fixed IP, that is about 60 unauthenticated calls, which is exactly GitHub's anonymous limit of 60 requests per hour per IP. We often hit the rate limit before introducing the caching workaround. I also plan to apply tflint to another project with about 5 Terraform repositories, but on a different CI system. Setting up a GitHub token or caching just to introduce a lint tool costs too much; I want a lint tool that is easier to adopt.
Is this only useful when you run CI in a non-isolated environment, where runs share persistent disk?
That's exactly right. And since tflint is expected to be used in CI, projects/repositories often share CI environments (especially the same IP address in enterprise environments), and CI machines start with a clean environment for each run because of auto-scaling (meaning they don't have persistent disks). This is only useful for specific cases, but I believe those specific cases are common.
Most of the complexity is there to preserve backward compatibility and to provide the fallback behavior. The code could be simpler if this behavior were made the default.
Setting up a GitHub token or caching just to introduce a lint tool costs too much
Well yeah, that's my point above. The convenience of a public registry that accepts anonymous requests is obvious.
I am not convinced that this is so burdensome. The token needs no permissions; it just needs to be a valid identity on GitHub. Why not give it to every CI job by default?
Clearly organized details about your actual systems can be helpful in making your case. The marketing pitch isn't, as free software doesn't owe you convenience. A PR is a nice gesture but contributes a small fraction of the total cost of this change.
meaning they don't have persistent disks
This is not what I mean by persistent disk in the context of CI. Persistence is with respect to the jobs. Whether the disk is actually permanent block storage or ephemeral instance storage is immaterial. It creates new surface area when you can have multiple concurrently executing jobs possibly writing to the same location on disk.
Most of the complexity...
It's a 700 line PR. We're concerned about the number of details to get right, the review iterations, the likelihood of bugs, and the long term maintenance cost. Trying to convince us it's pretty simple does the reverse.
Considering your requirements, it seems like a good idea to store the plugin binaries and install them manually, rather than caching the results of tflint --init.
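A rough sketch of that manual-install approach, assuming the asset naming used by the tflint-ruleset-aws releases (the pinned version here is hypothetical):

```sh
# Pin a plugin version to vendor; adjust to the release you actually use.
VERSION="v0.21.1"  # hypothetical
PLUGIN_DIR="${HOME}/.tflint.d/plugins"  # tflint's default plugin directory
mkdir -p "${PLUGIN_DIR}"

# Download the release asset directly. Asset downloads go through
# github.com/<owner>/<repo>/releases/download/... and do not count
# against the REST API rate limit.
curl -sSL -o /tmp/tflint-ruleset-aws.zip \
  "https://github.com/terraform-linters/tflint-ruleset-aws/releases/download/${VERSION}/tflint-ruleset-aws_linux_amd64.zip"
unzip -o /tmp/tflint-ruleset-aws.zip -d "${PLUGIN_DIR}"
```

With the binary in place, declaring the plugin in .tflint.hcl without source/version should let tflint load it from the plugin directory, skipping the GitHub API entirely.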
Another option would be to support registries other than GitHub releases, as mentioned in https://github.com/terraform-linters/tflint/issues/1202#issuecomment-1377441156. Personally, I would rather spend my time on that than on introducing new options.
Sorry for my late response.
OK. I understand that this idea is not acceptable from the standpoint of tflint's continuous maintenance, and I can't make the changeset simpler for it.
I'll consider another option, like storing binaries in the repositories as proposed.
It would be really nice if #1201 were implemented and the official plugins were also hosted somewhere else, or if I could host them myself on a private network.
free software doesn't owe you convenience
I agree with that.
Introduction
The GitHub API rate limit is a critical issue even though there are guides for avoiding it (details later). Once CI/CD hits the rate limit, development cannot continue for an hour.
I want to resolve this issue by reducing calls to the GitHub API.
tflint --init calls the following two APIs:
Proposal
Introduce two new features:
browser_download_url of the release metadata.
References
Nothing
Notes about the GitHub API rate limit issue
A CI/CD environment that hits the rate limit cannot work for one hour, until the limit resets. This may affect other workloads sharing the environment. That's too risky just for linting, and could be a reason to hesitate to introduce tflint.
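As an aside (not part of the original notes): GitHub's documented unauthenticated limit is 60 requests per hour per IP, and you can check how much of it remains with a call that does not itself count against the quota:

```sh
# Inspect the anonymous rate limit status for this IP.
# "core" is the bucket that release-metadata requests count against.
curl -s https://api.github.com/rate_limit | jq '.resources.core'
# Illustrative output: { "limit": 60, "used": 57, "remaining": 3, "reset": 1700000000 }
```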
Authenticating to the GitHub API is not such an easy resolution: it requires a GitHub user for the CI/CD environment and secret management. Managing a user of a public service is often troublesome in enterprise development for governance reasons.
Caching plugin directories in CI/CD has challenges: