crawler-commons / url-frontier

API definition, resources and reference implementation of URL Frontiers
Apache License 2.0

Crawlid #47

Closed by jnioche 2 years ago

jnioche commented 2 years ago

This PR adds the concept of crawlID to URLFrontier.

This makes URLFrontier multi-tenant: a single Frontier instance can handle multiple crawls, each identified by a crawlID. A given URL is associated with a given crawl, and any operation on a URL does not affect the same URL in other crawls.
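To illustrate the isolation described above, here is a minimal in-memory sketch. It is not the URLFrontier API or its reference implementation: the class `ToyFrontier` and the methods `put_url` and `get_url` are hypothetical names chosen for this example, and the only idea taken from the PR is that all state is keyed by a crawl identifier.

```python
from collections import defaultdict, deque


class ToyFrontier:
    """Hypothetical in-memory frontier; illustrates per-crawl isolation only."""

    def __init__(self):
        # crawl_id -> FIFO queue of URLs waiting to be fetched in that crawl
        self._queues = defaultdict(deque)
        # crawl_id -> set of URLs already known to that crawl
        self._known = defaultdict(set)

    def put_url(self, crawl_id, url):
        """Add url to one crawl; return False if that crawl already knows it."""
        if url in self._known[crawl_id]:
            return False
        self._known[crawl_id].add(url)
        self._queues[crawl_id].append(url)
        return True

    def get_url(self, crawl_id):
        """Pop the next URL for one crawl, or return None if its queue is empty."""
        queue = self._queues[crawl_id]
        return queue.popleft() if queue else None


frontier = ToyFrontier()
frontier.put_url("crawl-a", "https://example.com/")
frontier.put_url("crawl-b", "https://example.com/")  # same URL, separate crawl
frontier.get_url("crawl-a")  # consumes the URL in crawl-a only
```

Because both the queue and the "already known" set are looked up by crawl identifier first, the same URL can be queued, fetched, or rejected as a duplicate in one crawl without touching its state in another.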

This PR modifies the API, and the implementation of the service is not compatible with versions <= 1.0.