aquasecurity / trivy

Find vulnerabilities, misconfigurations, secrets, SBOM in containers, Kubernetes, code repositories, clouds and more
https://aquasecurity.github.io/trivy
Apache License 2.0
23.06k stars 2.28k forks source link

feat(misconfig): Add flag to supply checks bundle locally #7029

Closed simar7 closed 3 months ago

simar7 commented 3 months ago

Today we have the --checks-bundle-repository flag that can be use to allow Trivy to download a bundle from a specified registry URL.

However there can be instances where a bundle is built locally and needs to be used without pushing up to a registry. For this purpose we can add a --checks-bundle-path to specify a bundle available from disk.

This is also needed for automating testing of every trivy-checks bundle that we release within the same pipeline on every PR.

knqyf263 commented 3 months ago

I may be misguided as I don't understand the context very well, but I will leave a few comments.

However there can be instances where a bundle is built locally and needs to be used without pushing up to a registry.

For testing purposes, you can put the checks bundle under /path/to/cache/policy and update the metadata so it looks like the latest one. https://github.com/aquasecurity/trivy/blob/0ccdbfbb6598a52de7cda603ab22e794f710e86c/pkg/policy/policy.go#L86 https://github.com/aquasecurity/trivy/blob/0ccdbfbb6598a52de7cda603ab22e794f710e86c/pkg/policy/policy.go#L163-L166

That's what we're doing for the vulnerability DB.

https://github.com/aquasecurity/trivy/blob/0ccdbfbb6598a52de7cda603ab22e794f710e86c/integration/integration_test.go#L58-L72

If you really need to configure the bundle path, we might want to introduce a local repository similar to Maven. Then, use --checks-bundle-repository file:///path/to/local/bundle. https://maven.apache.org/guides/introduction/introduction-to-repositories.html

simar7 commented 3 months ago

Yeah we could do those changes to get around, I just thought that it would be nice (and probably easier for the user) to have a flag.

On a related note, my original motivation for adding a flag also partially came from reading the air gap install procedure we have today. It seems a little involved to me to have to poke around the file system given that we could enable the user to provide the database (or a bundle in case of misconfiguration scanning) to us with a flag.

knqyf263 commented 3 months ago

On a related note, my original motivation for adding a flag also partially came from reading the air gap install procedure we have today. It seems a little involved to me to have to poke around the file system given that we could enable the user to provide the database (or a bundle in case of misconfiguration scanning) to us with a flag.

--cache-dir can configure the cache directory. Do you mean we should provide more granular options? In general, I'd keep DB-related flags consistent with the bundle flags. If we provide --checks-bundle-path, we should provide --db-path, --java-db-path, etc.

simar7 commented 3 months ago

Do you mean we should provide more granular options?

Yes. Although as you mentioned earlier, we can add maven style flags which enable file:// and https:// for local and remote.

In general, I'd keep DB-related flags consistent with the bundle flags

Yes likewise, I think we should add such a flag for both DB and bundle. If we go the maven style route (to re-use the existing flag), we can ensure that it is backwards compatible. If not, adding a new flag for both DB and bundle.

knqyf263 commented 3 months ago

However there can be instances where a bundle is built locally and needs to be used without pushing up to a registry.

I generally understand the main points, but I'd like to delve a bit deeper into why the --cache-dir option is insufficient as the requirement you shared above is achieved with --cache-dir. I've thought of the following items, but since I haven't heard any specific complaints from users, I'd like to confirm your thoughts:

  1. Currently, the path after --cache-dir is fixed. For example, the vulnerability database must be stored in $CACHE_DIR/db/trivy.db, where the path db/trivy.db is fixed. Similarly, the checks bundle is fixed to $CACHE_DIR/policy/content. There may be a need to set these freely. For instance, something like --checks-bundle-path /myproject/checks.

  2. Currently, with --cache-dir, the root directory is fixed. There might be use cases where users want to place the vulnerability database and checks bundle in completely different paths. For example: --cache-dir /tmp/trivy --db-path /mycache/db/ --checks-bundle-path /myproject/checks

While these options would be convenient, considering the compatibility with trivy clean --all, it might be more convenient to place everything under $CACHE_DIR. On the other hand, --checks-bundle-path might be necessary precisely because users don't want it to be deleted as cache.

I'm generally open to this idea, but I want to understand the requirements correctly. For instance, if the first point is the main reason, it might be better to specify a relative path after $CACHE_DIR. Depending on the requirements, there might be different solutions.

simar7 commented 3 months ago

Sorry but yes you're right there are two different things to solve here.

  1. Cache directory (temporary, not user provided, e.g. scan results, plugin index, metadata etc.)
  2. Artifact directory (permanent, user provided, e.g. checks bundle and trivy db)

Today we basically use the cache directory for both purposes. Which is fine and for the purposes of trivy clean works better as we only have one location to clean up. For most purposes, a user IMO is less interested in knowing where the metadata or plugin index lives as compared to where the checks bundle or policy db is picked up from because the former cannot be supplied by them, but the latter can be.

My suggestion is simply only addressing the fact that today users need to understand the structure of cache dir (to be able to place artifacts in there) and to me that seems a little involved. As for caching the supplied bundle (and trivy db), we could still very well place it within the cache dir, which works well with trivy clean as we discussed above.

As far as testing is concerned, none of this matters as we can set it up the way you described above.

Let me know if it makes sense!

knqyf263 commented 3 months ago

My suggestion is simply only addressing the fact that today users need to understand the structure of cache dir (to be able to place artifacts in there) and to me that seems a little involved.

Yes, I agree. The more artifacts that are cached, the more difficult it becomes to understand the directory structure. Is this a problem that can be avoided by careful documentation? Or would it be better to have a flag specifying a local path for each artifact, as you originally suggested?

As far as testing is concerned, none of this matters as we can set it up the way you described above.

Right, we are familiar with the directory structure, so we can just put the checks bundle (and other artifacts) in a defined path for testing.

So I think our focus in this discussion should be on the issue of users having to know the directory structure in an air-gapped environment. The --cache-dir flag should be sufficient in all but air-gapped environments, as artifacts are automatically downloaded and placed in the appropriate path.

@DmitriyLewen @nikpivkin Please feel free to share your thoughts.

DmitriyLewen commented 3 months ago

I'm not sure we need this flag. We already have a large number of flags and it’s better to try to make do with them (and teach users this).

I don't have much experience with checks-bundle, but in terms of source code, checks-bundle is very similar to trivy-db (we have our own repository, we check and update checks-bundle before every run). So i would say that it will be enough to update docs (I think we have written quite detailed instructions for installing trivy-db (https://aquasecurity.github.io/trivy/v0.52/docs/advanced/air-gap/)).

On the other hand, if checks-bundles are not updated frequently and users often use their own checks-bundles- perhaps we can treat checks-bundles as plugins (i mean move them to ~/.trivy). But in this case, we must remove checks-bundle from the trivy clean and, update the logic for updating/removing them.

Second option is appropriate for me, but I am more inclined to the first option.

simar7 commented 3 months ago

Fair enough, thanks for the feedback! Based on this discussion let's start with the documentation first, I will add that in. I will close this issue.

itaysk commented 2 months ago

I'm also wondering about this. we have --db-repository and --java-db-repository flags, why in the air-gapped doc we are telling people to surgically replace the db in the cache and not point trivy at the self-hosted db?

knqyf263 commented 2 months ago

Do you mean they have access to the self-hosted DB in their air-gapped environment? We can mention that case in the doc.

itaysk commented 2 months ago

yes that's what I meant. I guess we created those flags for this use case, but in the doc that was supposed to highlight them it's missing.

knqyf263 commented 2 months ago

As far as I remember, the flag was originally added by GitLab or other companies so they can use their own DB, and was not for air-gapped. But yes, this flag must be useful for such use cases as well. We should document it.