quay / claircore

foundation modules for scanning container packages and reporting vulnerabilities
https://quay.github.io/claircore/
Apache License 2.0

Could an error recognising distro on initial scan cause a permanent problem #622

Open paulaldridge opened 2 years ago

paulaldridge commented 2 years ago

Not sure if this is an issue or not but found an interesting situation which I thought was worth sharing:

So what was happening was that the layer (FROM debian:stretch-slim) was being skipped each time, as clair had already scanned it, which meant it was fixed in its decision that there was no recognisable repo. I'm not sure what caused clair to mess up the initial scan, but it's concerning that it might be able to happen. I'm not sure how we'd know when it does, or even how we'd sensibly trigger a rescan if we did know. I think you'd need to clear the related layer_scanned and manifest_scanned records and then re-push the manifest to clair for each affected image.



Onto how/whether it could happen, I have 2 hunches:

  1. When looking for a distro, e.g. debian, any error returned from the Files function is assumed to mean that none of the requested files were found. But there seems to be a variety of potential errors which don't necessarily mean the file doesn't exist in the layer (e.g. an error from reader: fail to fetch layer, or failure to open tar). If all of these errors are permanent, so that re-scanning the layer would never help, then assuming the files don't exist/aren't available does seem correct. However, if they may be transient errors, maybe we should fail the scan on some of these, so as not to commit a bad scan to the database forever.
  2. Another possibility is that someone from our team deleted the distro record for this layer from the db manually by mistake (this was in our dev db, where some manual experimenting has happened). Even if that is what happened here, I thought it was worth discussing the possibility of a gap where bad scans could be committed to the db.
paulaldridge commented 2 years ago

Just recording that I've noticed this again with another layer. In all but one of our environments, the index report for an image showed no distro. The one environment that was working correctly identified the distro from the FROM alpine:latest layer. Querying against the specific layer hash (as alpine:latest isn't a fixed layer), I can see that the working environment has a distro recorded for that layer, whereas the other environments have no distro listed for the same layer hash.

This seems to support the theory that an error during distro recognition on initial scan could incorrectly mark a layer as having no distro.