InternetHealthReport / internet-yellow-pages

A knowledge graph for Internet resources
GNU General Public License v3.0

Add informative reference URL and data modification time. #121

Closed by m-appel 7 months ago

m-appel commented 7 months ago

Currently we provide the following reference fields with each crawler: https://github.com/InternetHealthReport/internet-yellow-pages/blob/f464e59a5065e850110d41df12632354739bba57/iyp/__init__.py#L615-L620

The reference_url usually points to the exact data source, while the reference_time is just the timestamp at which the database dump was produced.
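To illustrate the current semantics, here is a minimal sketch. It is not the code behind the link above: the helper name `make_reference` and the example URL are hypothetical, and using the current UTC time as the dump timestamp is an assumption.

```python
from datetime import datetime, timezone


def make_reference(url: str) -> dict:
    """Build a reference dict with the two fields discussed here (hypothetical helper)."""
    return {
        # Usually the exact data source that the crawler fetches.
        'reference_url': url,
        # Assumption: stands in for the time the dump was produced.
        # It says nothing about when the source data was last modified.
        'reference_time': datetime.now(timezone.utc),
    }


# Hypothetical data source URL, for illustration only.
reference = make_reference('https://example.org/data.csv')
```

With only these two fields, a consumer cannot tell how old the underlying data is, nor find a human-readable description of the source.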

We want to add new fields and rename the existing ones as follows:

This change requires updating some crawlers that modify reference_url dynamically.
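For context, a crawler that modifies reference_url dynamically might look roughly like the sketch below. The class name, the URL template, and the `resolve_url` method are hypothetical and not taken from the repository. The point is only that such crawlers overwrite the field at run time, so they have to be adapted by hand when the field is renamed.

```python
from datetime import date


class ExampleCrawler:
    """Hypothetical crawler whose data URL depends on the date."""

    # Hypothetical URL template, not a real data source.
    URL_TEMPLATE = 'https://example.org/dumps/{year}/{month:02d}/data.json'

    def __init__(self):
        # Start with a generic URL; it is replaced once the concrete file is known.
        self.reference = {'reference_url': 'https://example.org/dumps/'}

    def resolve_url(self, day: date) -> str:
        """Compute the concrete file URL and overwrite reference_url with it."""
        url = self.URL_TEMPLATE.format(year=day.year, month=day.month)
        self.reference['reference_url'] = url
        return url


crawler = ExampleCrawler()
crawler.resolve_url(date(2024, 1, 15))
print(crawler.reference['reference_url'])
```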