aboutcode-org / vulnerablecode

A free and open vulnerabilities database and the packages they impact. And the tools to aggregate and correlate these vulnerabilities. Sponsored by NLnet https://nlnet.nl/project/vulnerabilitydatabase/ for https://www.aboutcode.org/ Chat at https://gitter.im/aboutcode-org/vulnerablecode Docs at https://vulnerablecode.readthedocs.org/
https://public.vulnerablecode.io
Apache License 2.0

Avoid memory exhaustion during data migration #1630

Closed keshav-space closed 3 weeks ago

keshav-space commented 3 weeks ago

There are nearly 15 million package vulnerability relationships in VCIO, and loading them all at once during data migration consumes all the memory.

TG1999 commented 3 weeks ago

@keshav-space thanks for this, can we use our own paginate here?

keshav-space commented 3 weeks ago

@TG1999 We cannot use .paginated() in data migrations for PackageRelatedVulnerability, as this model was created without our custom queryset manager. https://github.com/aboutcode-org/vulnerablecode/blob/289f4b823b6be636216bae04a637b770a71d1f29/vulnerabilities/models.py#L874-L972

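Also, afaik custom queryset methods are not directly available on the historical models that data migrations work with. In this case it would be better to use Django's built-in QuerySet.iterator(), which streams rows instead of loading them all into memory at once.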
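
For illustration, here is a rough sketch of what streaming the relations in a data migration could look like. It is not the code from this PR: the target model name `NewRelation`, the copied fields `package_id` and `vulnerability_id`, the `BATCH_SIZE` value, and the helper `batched` are all assumptions made up for the example. Only `apps.get_model()`, `QuerySet.iterator(chunk_size=...)` and `bulk_create()` are standard Django calls.

```python
from itertools import islice

BATCH_SIZE = 5000  # assumed value; tune to the memory available


def batched(iterable, size):
    """Yield lists of at most `size` items without materializing the input."""
    iterator = iter(iterable)
    while batch := list(islice(iterator, size)):
        yield batch


def copy_relations(apps, schema_editor):
    # Historical models returned by apps.get_model() carry the default
    # manager by default, so custom queryset helpers are not available here.
    OldRelation = apps.get_model("vulnerabilities", "PackageRelatedVulnerability")
    # "NewRelation" is a hypothetical target model, used only for illustration.
    NewRelation = apps.get_model("vulnerabilities", "NewRelation")

    # .iterator() streams rows from the database instead of caching the
    # whole result set on the queryset.
    rows = OldRelation.objects.all().iterator(chunk_size=BATCH_SIZE)

    # Write in bounded batches so only one batch of new objects lives in
    # memory at a time. The field names below are assumed.
    for batch in batched(rows, BATCH_SIZE):
        NewRelation.objects.bulk_create(
            [
                NewRelation(
                    package_id=row.package_id,
                    vulnerability_id=row.vulnerability_id,
                )
                for row in batch
            ]
        )
```

In a real migration file this function would be wired up with `migrations.RunPython(copy_relations, migrations.RunPython.noop)` inside `Migration.operations`. Memory use should then be bounded by roughly one batch rather than by the nearly 15 million rows.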

keshav-space commented 3 weeks ago

Merging this now!