The current model stores log information describing the scraping actions that populate a database in a "sidecar" log file. To ensure reproducibility, this file must travel with the database it refers to.
It may be more robust to add a table to the database that duplicates some of this data, such as the date of download and the command lines used, so that users can reconstruct how the data was collected from the database alone.
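A minimal sketch of this idea, using SQLite as the assumed backing store; the table name `provenance` and its columns are illustrative, not an existing schema:

```python
import sqlite3
import sys
from datetime import datetime, timezone

def record_provenance(db_path):
    """Store when and how the data was collected inside the database itself."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS provenance (
               id INTEGER PRIMARY KEY,
               downloaded_at TEXT NOT NULL,   -- ISO-8601 timestamp of the scrape
               command_line TEXT NOT NULL     -- exact invocation that produced the data
           )"""
    )
    conn.execute(
        "INSERT INTO provenance (downloaded_at, command_line) VALUES (?, ?)",
        (datetime.now(timezone.utc).isoformat(), " ".join(sys.argv)),
    )
    conn.commit()
    conn.close()

record_provenance("scraped.db")
```

Calling this once per scraping run would leave an audit trail inside the file itself, so a copied database carries its own collection history even if the sidecar log is lost.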