Open pombredanne opened 1 year ago
I agree with the layer as package approach.
My first thought is that the results of approach 2) would typically be more useful, but I agree that it may lead to aberrations. I haven’t made up my mind about that yet (and perhaps my technical skills are not at the point to do so either).
For reference see also
@silverhook you wrote:
I agree with the layer as package approach.
My first thought is that the results of approach 2) would typically be more useful, but I agree that it may lead to aberrations. I haven’t made up my mind about that yet (and perhaps my technical skills are not at the point to do so either).
The thing is that every layer may contain package databases or metadata for every layer below, but a layer contains the actual installed package bits if and only if the package was installed in this specific layer. So the metadata duplication is an artifact of the layering, but it can also be subtle as it expresses itself in multiple ways: a package can be added, removed or updated (a remove/add in practice).
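To make the add/remove/update cases concrete, here is a minimal sketch of how the packages actually changed by each layer could be derived by diffing the package database visible in a layer against the one visible in its parent. This is only an illustration under stated assumptions, not the implementation of any existing tool: the names `LayerChanges`, `diff_package_dbs` and `changes_per_layer` are hypothetical, each package database is assumed to be already parsed into a plain name-to-version mapping, and `None` is assumed to mean that the layer did not touch the database at all.

```python
from typing import Dict, List, NamedTuple, Optional, Tuple

# Assumption: a package database is reduced to a mapping of name -> version.
PackageDb = Dict[str, str]


class LayerChanges(NamedTuple):
    """Packages that a single layer added, removed or updated (hypothetical type)."""
    added: Dict[str, str]
    removed: Dict[str, str]
    updated: Dict[str, Tuple[str, str]]  # name -> (old version, new version)


def diff_package_dbs(parent: PackageDb, current: PackageDb) -> LayerChanges:
    """Compare the database seen in a layer with the one seen in its parent."""
    added = {n: v for n, v in current.items() if n not in parent}
    removed = {n: v for n, v in parent.items() if n not in current}
    updated = {
        n: (parent[n], v)
        for n, v in current.items()
        if n in parent and parent[n] != v
    }
    return LayerChanges(added, removed, updated)


def changes_per_layer(layer_dbs: List[Optional[PackageDb]]) -> List[LayerChanges]:
    """Walk layers from base to top and report what each layer itself changed.

    A None entry stands for a layer that carries no package database of its
    own, so the database of the layers below is still the effective one.
    """
    changes = []
    effective: PackageDb = {}
    for db in layer_dbs:
        if db is None:
            # Nothing installed or removed through the package manager here.
            changes.append(LayerChanges({}, {}, {}))
            continue
        changes.append(diff_package_dbs(effective, db))
        effective = db
    return changes


# Made-up example data: four layers, base first.
layers = [
    {"libc": "2.31", "bash": "5.0"},                  # base layer
    None,                                             # no database change
    {"libc": "2.31", "bash": "5.1", "curl": "7.68"},  # update bash, add curl
    {"libc": "2.31", "curl": "7.68"},                 # remove bash
]
changes = changes_per_layer(layers)
```

With this kind of diff, only the packages listed as added or updated in a layer would be expected to have their installed files in that layer. Everything else in its database is metadata inherited from the layers below.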
I am worried about:
It would be useful to treat each layer in a Docker image as a package of its own. Why? Layers can be fetched individually, and even if a single layer is not of much value alone, it can technically be used alone. When stored as a package (say in the purldb), a layer becomes something that can be reused (e.g., reuse the scan, analysis, etc.). Of course, if we start treating each layer as a "package", the approach to combining the results of multiple overlaid layers would change, as we would possibly have two ways of scanning a layer (and therefore two different scan contents):
I am not sure a layer can ever be reused independently of its parent layer, or rather not always, as this would lead to aberrations. So there is some research to do before committing to one approach or the other.
These would be some of the actual specific issues to work out: