Open pombredanne opened 1 year ago
@johnmhoran @JonoYang @steven-esser @AyanSinhaMahapatra @mjherzog @Pratikrocks ping .... input welcome!
@pombredanne @arnav-mandal1234 I made some edits above at https://github.com/nexB/deltacode/issues/183#issue-1623253377, looks great otherwise! :rocket:
Here is the outline of the discussion @arnav-mandal1234 and I had about reviving and updating DeltaCode!
[ ] We need to update DeltaCode and the scancode-fingerprint plugin at https://github.com/nexB/scancode-plugins/tree/main/misc/scancode-fingerprint to the latest standards.
[ ] Then we would like to merge DeltaCode into the core ScanCode-toolkit git repo, preserving the commit history, and expose it as CLI options in ScanCode-toolkit. Preserving the commit history keeps a record of the changes as well as their authorship. Once done, we can selectively move issues to ScanCode-toolkit and archive this repo. See https://github.com/nexB/deltacode/issues/181
[ ] We will need to add support for comparing packages and focus the delta capabilities on package scans (rather than mostly on files).
[ ] Finally, I would like to see DeltaCode integrated into purldb as a library to support two use cases:
  [ ] Extend package curations: given a package v1 with a reviewed license/origin and a new v2 of the same package, are the differences in package metadata, codebase summaries and file-level deltas small enough that we can carry forward the review of v1 to v2, or should v2 be reviewed again?
  [ ] Cluster packages to focus curations: given a series of package versions v1 to v10, which clusters of versions have essentially similar package metadata, codebase summaries and file-level data? And given these clusters, which are the key versions to review to validate a whole cluster at once?
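
To make the two purldb use cases a bit more concrete, here is a rough sketch of how they could work. This is not DeltaCode's or purldb's API: the function names, the use of Jaccard similarity over sets of file checksums, and the 0.9 threshold are all assumptions made for illustration. A real implementation would also weigh package metadata and codebase summaries, not only file-level data.

```python
from typing import Dict, List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Return the Jaccard similarity of two sets of file checksums."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def can_carry_forward(reviewed: Set[str], candidate: Set[str],
                      threshold: float = 0.9) -> bool:
    """Hypothetical check for the first use case: is the new version close
    enough to the reviewed one to reuse its review? The threshold is an
    arbitrary illustrative value."""
    return jaccard(reviewed, candidate) >= threshold


def cluster_versions(versions: Dict[str, Set[str]],
                     threshold: float = 0.9) -> List[List[str]]:
    """Hypothetical grouping for the second use case: walk the versions in
    order and start a new cluster whenever a version is not similar enough
    to the first version of the current cluster. The first version of each
    cluster is then the key version to review."""
    clusters: List[List[str]] = []
    for version, files in versions.items():
        if clusters and jaccard(versions[clusters[-1][0]], files) >= threshold:
            clusters[-1].append(version)
        else:
            clusters.append([version])
    return clusters
```

With this sketch, reviewing only the first version of each returned cluster would stand in for reviewing the whole cluster, and `can_carry_forward` answers the v1 to v2 question for a single pair of versions.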