webrecorder / pywb

Core Python Web Archiving Toolkit for replay and recording of web archives
https://pypi.python.org/pypi/pywb
GNU General Public License v3.0

User agent in the record mode #821

Open PedroG1515 opened 1 year ago

PedroG1515 commented 1 year ago

Is your feature request related to a problem? Please describe.

When Arquivo.pt runs pywb in record mode, users of our service can record any page they need.

However, as the number of people using record mode grows, the number of requests sent to a particular domain may rise significantly, which could get the Arquivo.pt IP address blacklisted.

Second, Arquivo.pt is working on an internal service that uses pywb's record mode automatically to improve the data already in the archive. Since Arquivo.pt intends to use record mode at large scale, it should behave like a crawler, which includes identifying itself to the sites it visits with its own user agent.

Describe the solution you'd like

Add a field to config.yaml that sets the user agent sent to the target site when record mode is used.
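
To make the request concrete, here is a minimal sketch of the behavior I have in mind. It is not pywb code: the `user_agent` key under `recorder`, the `recording_headers` helper, and the example user agent string are all hypothetical, and the config is shown as an already parsed dict rather than being read from config.yaml.

```python
from typing import Dict, Mapping

# Hypothetical parsed config.yaml content. The `user_agent` key is the
# field proposed in this issue, not an existing pywb option.
EXAMPLE_CONFIG = {
    "recorder": {
        "source_coll": "live",
        "user_agent": "ExampleArchiveBot/1.0 (+https://example.org/bot)",
    }
}


def recording_headers(client_headers: Mapping[str, str],
                      config: Mapping[str, object]) -> Dict[str, str]:
    """Return the headers to send upstream while recording.

    If the (hypothetical) recorder.user_agent setting is present, it
    replaces whatever User-Agent the browser sent. Otherwise the client
    headers are passed through unchanged.
    """
    headers = dict(client_headers)
    recorder = config.get("recorder")

    # Only a mapping-style recorder entry can carry the extra setting.
    user_agent = None
    if isinstance(recorder, Mapping):
        user_agent = recorder.get("user_agent")

    if user_agent:
        # Header names are case-insensitive, so drop any existing variant.
        headers = {k: v for k, v in headers.items()
                   if k.lower() != "user-agent"}
        headers["User-Agent"] = str(user_agent)
    return headers


if __name__ == "__main__":
    browser_headers = {"user-agent": "Mozilla/5.0", "Accept": "text/html"}
    print(recording_headers(browser_headers, EXAMPLE_CONFIG))
```

With this setting, every recorded request would carry the archive's own user agent regardless of the browser used, so site owners could identify the traffic and contact us instead of blocking the IP. Leaving the field out would keep the current behavior.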