Open benoit74 opened 1 week ago
That being said, I'm not sure this is really straightforward to implement.
The scraper should pass its name and version to scraperlib so that we can set the header properly.
And we also need a contact, which is probably more related to who ran the scraper.
Not sure this is so easy to implement in the end.
For files hosted on upload.wikimedia.org, we must comply with their User-Agent policy at https://meta.wikimedia.org/wiki/User-Agent_policy
Doing so at scraperlib level in `stream_file` (the main method used in many scrapers to download files / assets) would help avoid having to do so in every scraper (and forgetting about it over and over).
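
To make the discussion more concrete, here is a rough sketch of what building such a header could look like. It assumes the format suggested by the Wikimedia policy (`<client>/<version> (<contact>) <library>/<version>`). The helpers `build_user_agent` and `build_request` are hypothetical names, not existing scraperlib APIs, and how `stream_file` would actually receive these values is left open.

```python
from urllib.request import Request


def build_user_agent(scraper_name, scraper_version, contact,
                     lib_name="scraperlib", lib_version="0.0.0"):
    """Hypothetical helper: build a User-Agent string identifying the scraper.

    The contact is whatever the person running the scraper provides
    (e-mail or URL); the library name/version defaults are placeholders.
    """
    for label, value in (("scraper_name", scraper_name),
                         ("scraper_version", scraper_version),
                         ("contact", contact)):
        # Refuse to build a header with missing identification
        if not value or not value.strip():
            raise ValueError(f"{label} is required to build the User-Agent")
    return (f"{scraper_name}/{scraper_version} ({contact}) "
            f"{lib_name}/{lib_version}")


def build_request(url, user_agent):
    """Hypothetical helper: prepare (but do not send) a request with the header set."""
    return Request(url, headers={"User-Agent": user_agent})
```

With something like this, a scraper would only have to supply its name, version and the operator's contact once, and the download helper could attach the header to every request.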