HobnobMancer / cazy_webscraper

Web scraper to retrieve protein data catalogued by the CAZy, UniProt, NCBI, GTDB and PDB websites/databases.
https://hobnobmancer.github.io/cazy_webscraper/
MIT License

Add CAZy download log information to local database table #29

Closed widdowquinn closed 3 years ago

widdowquinn commented 3 years ago

The current model records log information describing the scraping actions that populate a database in a separate "sidecar" log file. To ensure reproducibility, this file needs to be carried around with the database it refers to.

It may be more robust to add a table to the database that duplicates some of this data, such as the date of download and the command line used, so that users can reconstruct how the data was collected from the database alone.
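As a rough illustration of the idea, here is a minimal sketch using Python's standard `sqlite3` module. The table name `logs`, its columns, and the helper `record_scrape` are all hypothetical and are not taken from the cazy_webscraper code base; the real schema may well differ.

```python
import shlex
import sqlite3
from datetime import datetime, timezone


def record_scrape(conn, argv, when=None):
    """Store the date, time and command line of one scrape.

    The `logs` table and its columns are assumptions made for this sketch.
    """
    when = when or datetime.now(timezone.utc)
    # Create the (hypothetical) log table on first use.
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS logs (
            log_id INTEGER PRIMARY KEY,
            date TEXT NOT NULL,
            time TEXT NOT NULL,
            command_line TEXT NOT NULL
        )
        """
    )
    # shlex.join quotes arguments so the command can be re-run as written.
    cur = conn.execute(
        "INSERT INTO logs (date, time, command_line) VALUES (?, ?, ?)",
        (when.strftime("%Y-%m-%d"), when.strftime("%H:%M:%S"), shlex.join(argv)),
    )
    conn.commit()
    return cur.lastrowid
```

Calling something like `record_scrape(conn, sys.argv)` at the start of each scrape would add one row per run, so the database carries its own provenance even if the sidecar log file is lost.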

HobnobMancer commented 3 years ago

Added on branch cazy_webscraper_development_and_enhancement