clearlydefined / service

The service side of clearlydefined.io

TypeError: "Invalid Version: 2.9.0b1" when computing definition for npm/npmjs/-/debug/3.1.0 #1237

Open qtomlinson opened 2 hours ago

qtomlinson commented 2 hours ago

Expected: No exception when computing definition

Observed: TypeError "Invalid Version: 2.9.0b1" (screenshot not preserved)

qtomlinson commented 2 hours ago

A GET call to https://api.clearlydefined.io/harvest/npm/npmjs/-/debug/3.1.0?form=list returned:

[
    "npm/npmjs/-/debug/3.1.0/clearlydefined/1.1.2",
    "npm/npmjs/-/debug/3.1.0/clearlydefined/1.1.3",
    "npm/npmjs/-/debug/3.1.0/clearlydefined/1.2.3",
    "npm/npmjs/-/debug/3.1.0/clearlydefined/1.3.3",
    "npm/npmjs/-/debug/3.1.0/clearlydefined/1.3.4",
    "npm/npmjs/-/debug/3.1.0/clearlydefined/1",
    "npm/npmjs/-/debug/3.1.0/fossology/3.3.0",
    "npm/npmjs/-/debug/3.1.0/fossology/3.4.0",
    "npm/npmjs/-/debug/3.1.0/fossology/3.5.0",
    "npm/npmjs/-/debug/3.1.0/fossology/3.6.0",
    "npm/npmjs/-/debug/3.1.0/licensee/9.11.1",
    "npm/npmjs/-/debug/3.1.0/licensee/9.12.1",
    "npm/npmjs/-/debug/3.1.0/licensee/9.14.0",
    "npm/npmjs/-/debug/3.1.0/reuse/1.3.0",
    "npm/npmjs/-/debug/3.1.0/reuse/3.2.1",
    "npm/npmjs/-/debug/3.1.0/scancode/2.10.2",
    "npm/npmjs/-/debug/3.1.0/scancode/2.2.1",
    "npm/npmjs/-/debug/3.1.0/scancode/2.9.0b1",
    "npm/npmjs/-/debug/3.1.0/scancode/2.9.1",
    "npm/npmjs/-/debug/3.1.0/scancode/2.9.2",
    "npm/npmjs/-/debug/3.1.0/scancode/3.2.2",
    "npm/npmjs/-/debug/3.1.0/scancode/30.3.0"
]

2.9.0b1 was indeed a version of scancode in the past. However, neither the legacy (pre-v32) nor the new (v32) scancode summarizer considers 2.9.0b1 a valid version. The invalid version error was thrown while summarizing the old scancode result.

Despite the presence of old results with invalid versions, newer results are available to compute the component definition. Currently, when calculating the definition for a component, we retrieve all harvest tool results, both new and old, for the given coordinates. We then summarize these results and aggregate only the latest tool results. This presents an opportunity for optimization. By downloading only the most recent tool results, summarizing them, and aggregating only these latest summaries, we can save time and reduce the memory footprint by avoiding the download of unnecessary data.
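The selection step proposed above could look roughly like the following. This is a minimal sketch, not the service's actual code: `parse_version` and `latest_per_tool` are hypothetical names, and the lenient "one to three numeric components" rule is an assumption made only for illustration. The idea is to pick, per tool, the harvest path with the highest parseable version before downloading anything, and to skip unparseable versions instead of throwing.

```python
import re

# Hypothetical lenient pattern: one to three dot-separated integers,
# e.g. "1", "2.10", "30.3.0". Anything else is treated as unparseable.
_VERSION_RE = re.compile(r"^(\d+)(?:\.(\d+))?(?:\.(\d+))?$")


def parse_version(text):
    """Return a (major, minor, patch) tuple, or None if the string is not plain numeric."""
    match = _VERSION_RE.match(text)
    if match is None:
        return None
    # Missing components (as in "1") are padded with zero.
    return tuple(int(part or 0) for part in match.groups())


def latest_per_tool(harvest_paths):
    """Map each tool name to the harvest path of its highest parseable version."""
    best = {}
    for path in harvest_paths:
        # The last two path segments are the tool name and the tool version.
        tool, version_text = path.rsplit("/", 2)[-2:]
        version = parse_version(version_text)
        if version is None:
            continue  # e.g. "2.9.0b1": skip the result instead of raising
        if tool not in best or version > best[tool][0]:
            best[tool] = (version, path)
    return {tool: path for tool, (_, path) in best.items()}


paths = [
    "npm/npmjs/-/debug/3.1.0/clearlydefined/1.3.4",
    "npm/npmjs/-/debug/3.1.0/clearlydefined/1",
    "npm/npmjs/-/debug/3.1.0/scancode/2.10.2",
    "npm/npmjs/-/debug/3.1.0/scancode/2.9.0b1",
    "npm/npmjs/-/debug/3.1.0/scancode/30.3.0",
]
latest = latest_per_tool(paths)
```

Versions are compared as integer tuples rather than strings, so 2.10.2 correctly ranks above 2.9.2. A tool whose only harvested versions are unparseable is simply absent from the result, which a caller could treat as the signal to report an error.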

If 2.9.0b1 happens to be the only scancode version available for the component, then the error message is justified and can prompt the user to trigger a new harvest for the component.