aboutcode-org / purldb

Tools to create and expose a database of purls (Package URLs). This project is sponsored by NLnet project https://nlnet.nl/project/vulnerabilitydatabase/ and nexB for https://www.aboutcode.org/. Chat is at https://gitter.im/aboutcode-org/discuss
https://purldb.readthedocs.io/

debian visitor: do not unnecessarily process ls-lR files #5

Open armijnhemel opened 2 years ago

armijnhemel commented 2 years ago

Currently the Debian visitor seems to download and process the ls-lR files. Although these change almost daily, it might be wise to add one extra step: calculate a checksum of the downloaded ls-lR file and check whether a checksum from a previous invocation exists. If there is none, continue processing. If there is one and it doesn't match, continue processing. If there is one and it matches, exit.

After processing, the checksum should be stored in a persistent (configurable?) location.
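
A minimal sketch of the proposed check, in Python. It is not taken from the purldb code base: the function names (`file_sha256`, `needs_processing`, `record_checksum`), the choice of SHA-256, and the idea of keeping the checksum in a small text file are all assumptions made for illustration. How the Debian visitor would actually call this is left open.

```python
import hashlib
from pathlib import Path


def file_sha256(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def needs_processing(index_path, checksum_store):
    """
    Return (should_process, current_checksum) for a downloaded ls-lR file.

    checksum_store is a hypothetical, configurable path to a text file
    holding the checksum recorded by the previous invocation.
    """
    current = file_sha256(index_path)
    store = Path(checksum_store)
    # No stored checksum: first run, so continue processing.
    previous = store.read_text().strip() if store.exists() else None
    # Mismatch: the file changed, so continue processing. Match: skip.
    return previous != current, current


def record_checksum(checksum_store, checksum):
    """Persist the checksum, to be called only after processing succeeded."""
    store = Path(checksum_store)
    store.parent.mkdir(parents=True, exist_ok=True)
    store.write_text(checksum + "\n")
```

The caller would run `needs_processing()` right after the download, exit early when it returns `False`, and call `record_checksum()` only once processing has finished, so that a failed run is retried on the next invocation.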

pombredanne commented 1 year ago

Thanks! For now there is little to no attention paid to optimization and avoiding redoing work. Ideally this should be applied in a cross-cutting layer.