clearlydefined / service

The service side of clearlydefined.io
MIT License

Proactively harvest latest version when computing definitions for older versions #509

Open dabutvin opened 5 years ago

dabutvin commented 5 years ago

In the definitionService, after compute and store, we should tell the harvestService to queue the latest version of the package.

Any user who is 'using' ClearlyDefined for their current package lists will probably upgrade eventually. Often this harvest will be a no-op because we already have the latest version, but it keeps us ahead of package upgrades before they happen.
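
A rough sketch of the idea, in Python for brevity rather than the service's own code. Every name here (`Coordinates`, `HarvestService.harvest`, `compute_and_store`, the `latest_version_of` lookup) is a hypothetical stand-in and not the actual ClearlyDefined API. It also assumes we skip queuing when the version just computed is already the latest.

```python
from dataclasses import dataclass, field, replace
from typing import Callable, Optional


@dataclass(frozen=True)
class Coordinates:
    """Hypothetical package coordinates: type/provider/name/revision."""
    type: str
    provider: str
    name: str
    revision: str


@dataclass
class HarvestService:
    """Hypothetical stand-in for the harvest queue."""
    queued: list = field(default_factory=list)

    def harvest(self, coordinates: Coordinates) -> None:
        self.queued.append(coordinates)


class DefinitionService:
    def __init__(self, harvest_service: HarvestService,
                 latest_version_of: Callable[[Coordinates], Optional[str]]):
        self.harvest_service = harvest_service
        # Assumed lookup that asks the package registry for the newest version.
        self.latest_version_of = latest_version_of
        self.store = {}

    def compute_and_store(self, coordinates: Coordinates) -> dict:
        definition = {"coordinates": coordinates}  # placeholder for the real compute
        self.store[coordinates] = definition
        # After storing, proactively queue the latest version for harvest.
        latest = self.latest_version_of(coordinates)
        if latest and latest != coordinates.revision:
            self.harvest_service.harvest(replace(coordinates, revision=latest))
        return definition
```

If the latest version has already been harvested, the queued request would simply turn out to be a no-op on the harvest side.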

fyi @iamwillbar

jeffmcaffer commented 5 years ago

Love the intent, but I'd be a little careful here: we'll end up with scenarios, such as a bulk recompute, that would trigger massive queuing for harvesting. That will cause the crawlers to ping repository APIs repeatedly to get the latest version of a given package.
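
To make the fan-out concrete, here is a small illustrative sketch in Python. The function name and the tuple layout are made up for the example, and the `dedupe` option is only one possible mitigation that has not been discussed in this thread.

```python
def latest_version_requests(recomputed, dedupe=False):
    """Return the packages that would trigger a latest-version lookup.

    `recomputed` is a list of (type, provider, name, revision) tuples,
    one per definition touched by a bulk recompute.
    """
    requests = []
    seen = set()
    for type_, provider, name, _revision in recomputed:
        package = (type_, provider, name)
        if dedupe and package in seen:
            continue  # already asked about this package in this run
        seen.add(package)
        requests.append(package)
    return requests


# Recomputing 20 versions of one package asks for "latest" 20 times
# unless the requests are collapsed per package.
batch = [("npm", "npmjs", "lodash", f"4.17.{patch}") for patch in range(20)]
print(len(latest_version_requests(batch)))               # 20
print(len(latest_version_requests(batch, dedupe=True)))  # 1
```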

dabutvin commented 5 years ago

That's a good callout for sure. Maybe this should happen on the crawler side only, and leave definition recompute out of it?

jeffmcaffer commented 5 years ago

What would trigger the queuing of the latest in that case?

dabutvin commented 5 years ago

I was thinking that when we crawl an older version, we then check for and queue the latest version.

jeffmcaffer commented 5 years ago

Ah, yeah, could do that. It would be a little awkward to decide where to put it. The Crawler really only does fetch and process. We could put it in the ClearlyDefined processor, but process generally does not reach out anywhere to get stuff. fetch could attempt to get the latest for the given coordinates, but then it would do that for every tool request (not just the ClearlyDefined tool, unless we coded that in).
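
A minimal sketch of the fetch-side option with the tool check coded in, written in Python purely for illustration. `Request`, `Crawler.fetch`, the `"clearlydefined"` tool name, and the `latest_version_of` lookup are assumptions for the example and not the crawler's real interfaces.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Request:
    tool: str           # e.g. "clearlydefined", "scancode", "licensee"
    coordinates: tuple  # (type, provider, name, revision)


@dataclass
class Crawler:
    # Assumed lookup against the package registry for the newest version.
    latest_version_of: Callable[[str, str, str], Optional[str]]
    queue: list = field(default_factory=list)

    def fetch(self, request: Request) -> None:
        type_, provider, name, revision = request.coordinates
        # ... the real fetching of the package would happen here ...

        # Only the ClearlyDefined tool request triggers the lookup, so the
        # registry is not asked once per tool for the same coordinates.
        if request.tool != "clearlydefined":
            return
        latest = self.latest_version_of(type_, provider, name)
        if latest and latest != revision:
            self.queue.append(
                Request("clearlydefined", (type_, provider, name, latest)))
```

When the queued request for the latest version is itself fetched, `latest` equals its revision, so nothing further is queued.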

In the end, perhaps we should wait to see whether this is actually an issue (lacking new versions of things). We could rely on our monitoring of the ecosystems to detect new versions. That will happen almost for free. If that is failing somehow, then perhaps add this complexity?