For PURLs we do have an optimized importer by now. A list of PURLs from an SBOM gets batch imported, with an upsert strategy.
However, for CPEs we still have the single "get or insert" strategy. As there seem to be a lot of CPEs in the SBOMs now, that hurts performance a lot.
The idea is to replicate the ingestion process from PURLs and apply the same pattern to CPEs: batch insertion, plus upsert. A quick check of a single RHEL-style SBOM shows that this should bring down operations quite a bit, just by avoiding duplicates:
On the other hand, those CPEs are of type "security" and we can skip them at first. Also see: https://github.com/trustification/trustify/issues/509 … However, in the future we might want to ingest this information anyway. So we need to improve the CPE creation process.
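To make the proposal a bit more concrete, here is a rough sketch of the pattern (collect, deduplicate, then upsert in chunks) next to the current per-CPE "get or insert". This is not the actual trustify code: `CpeStore`, `CpeBatch`, `get_or_insert`, `upsert_batch` and the `statements` counter are hypothetical names, and the in-memory set only stands in for a table with a unique constraint on the CPE.

```rust
use std::collections::{BTreeSet, HashSet};

/// Hypothetical stand-in for the CPE table (unique constraint on the CPE string).
#[derive(Default)]
pub struct CpeStore {
    rows: HashSet<String>,
    /// Number of simulated database statements issued so far.
    pub statements: usize,
}

impl CpeStore {
    /// Current strategy: one lookup per CPE occurrence, plus an insert on a miss.
    pub fn get_or_insert(&mut self, cpe: &str) {
        self.statements += 1; // the "get"
        if !self.rows.contains(cpe) {
            self.statements += 1; // the "insert"
            self.rows.insert(cpe.to_string());
        }
    }

    /// Proposed strategy: one statement per chunk. Existing rows are left
    /// untouched, similar in spirit to `INSERT ... ON CONFLICT DO NOTHING`.
    pub fn upsert_batch(&mut self, chunk: &[String]) {
        self.statements += 1;
        for cpe in chunk {
            self.rows.insert(cpe.clone());
        }
    }

    /// Number of distinct CPEs stored.
    pub fn len(&self) -> usize {
        self.rows.len()
    }
}

/// Collects the CPEs of one SBOM and drops duplicates before touching the store.
#[derive(Default)]
pub struct CpeBatch {
    cpes: BTreeSet<String>,
}

impl CpeBatch {
    /// Remember a CPE; repeated occurrences collapse into one entry.
    pub fn add(&mut self, cpe: &str) {
        self.cpes.insert(cpe.to_string());
    }

    /// Write all unique CPEs to the store, `chunk_size` entries per statement.
    pub fn flush(self, store: &mut CpeStore, chunk_size: usize) {
        let unique: Vec<String> = self.cpes.into_iter().collect();
        // Guard against a chunk size of zero, which `chunks` would reject.
        for chunk in unique.chunks(chunk_size.max(1)) {
            store.upsert_batch(chunk);
        }
    }
}
```

In this sketch, the ingestion loop would call `add` for every CPE it encounters and `flush` once per SBOM. The number of statements then depends on the number of unique CPEs divided by the chunk size, not on the number of occurrences.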
I'm now using the pih CPEs to contextualize product-status from CSAF, so yes please, CPEs would be good. I'm currently relying upon graph.ingest_cpe(...)