Currently travharv retrieves a resource every time it needs it;
it does not check whether the resource (URI) was already retrieved in the past.
As a result, the same resource is retrieved multiple times, which leads to long waiting times for tasks that have many assertion paths that need to be traversal harvested.
A possible solution is to look at the execution report and collect all resources that were already harvested, together with their harvest date and MIME type, to ensure that all the different MIME types were harvested.
From this information a cache can be built that travharv can use.
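
A minimal sketch of how such a cache could look, assuming the execution report can be read as a list of entries with a URI, a MIME type, and a harvest timestamp. The entry keys (`uri`, `mimetype`, `harvested_at`) and the function names `build_cache` and `needs_fetch` are hypothetical and chosen for illustration only; they are not taken from travharv itself.

```python
from datetime import datetime, timedelta, timezone


def build_cache(report_entries):
    """Map each (uri, mimetype) pair to its most recent harvest time.

    `report_entries` is an assumed format: an iterable of dicts with the
    keys "uri", "mimetype" and "harvested_at" (a timezone-aware datetime).
    """
    cache = {}
    for entry in report_entries:
        key = (entry["uri"], entry["mimetype"])
        harvested_at = entry["harvested_at"]
        # Keep only the latest harvest per (uri, mimetype) pair.
        if key not in cache or harvested_at > cache[key]:
            cache[key] = harvested_at
    return cache


def needs_fetch(cache, uri, mimetypes, max_age, now=None):
    """Return the MIME types of `uri` that still have to be retrieved.

    A MIME type is reported when it was never harvested for this URI
    or when its last harvest is older than `max_age`.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    missing = []
    for mimetype in mimetypes:
        harvested_at = cache.get((uri, mimetype))
        if harvested_at is None or now - harvested_at > max_age:
            missing.append(mimetype)
    return missing
```

Keying the cache on the URI and the MIME type together, rather than on the URI alone, means a resource is skipped only when every requested representation has been harvested recently enough. The `max_age` threshold is an added assumption; the harvest date could also be used in other ways, for example to report how stale a cached resource is.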