@withtwoemms while we're getting spiders set up to scrape the data, I wonder if it would make sense to run your snapshot code on a regular schedule and store the results somewhere (either locally or somewhere like S3)?
I think this matters most for the sites that update daily, since I'm assuming each update means we lose a bit of the archive of who is there over time (and, to be honest, I'm guessing that archive is part of what CCBF intends to analyze).
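
To make the idea concrete, here is a rough sketch of the local-storage variant. It assumes the snapshot code can hand back the page as a string; `snapshot_path`, `save_snapshot`, and the `site/date/time.html` layout are hypothetical names I made up for illustration, not anything in the existing code. The scheduling itself (cron or similar) is left out.

```python
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


def snapshot_path(root: Path, site: str, taken_at: datetime) -> Path:
    """Build a dated path: <root>/<site>/<YYYY-MM-DD>/<HHMMSS>.html (hypothetical layout)."""
    day = taken_at.strftime("%Y-%m-%d")
    name = taken_at.strftime("%H%M%S") + ".html"
    return root / site / day / name


def save_snapshot(root: Path, site: str, html: str,
                  taken_at: Optional[datetime] = None) -> Path:
    """Write one snapshot to disk without overwriting earlier ones."""
    # Default to the current UTC time so runs from different machines line up.
    taken_at = taken_at or datetime.now(timezone.utc)
    path = snapshot_path(root, site, taken_at)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(html, encoding="utf-8")
    return path
```

Because each run writes to its own timestamped file, a daily-updating site keeps its history instead of being overwritten. The same key layout should carry over to S3 if we go that way.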