The unusual thing about this scraper is that a given report can appear on several of the pages we scrape, and each page shows the data at a different granularity. We scrape the pages from most granular to least granular and skip any report ids we've already seen. This relies on us being very confident that report ids are unique, but I feel pretty good about it. I'm certainly open to better solutions, though.
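For anyone reviewing, here is a rough sketch of the dedup idea, not the actual code in this PR. The names `scrape_in_order`, `detailed_page`, `summary_page`, and the `report_id` key are all made up for illustration; the only assumption is that each page yields dict-like reports carrying an id.

```python
from typing import Callable, Dict, Iterable, List

Report = Dict[str, str]


def scrape_in_order(
    page_scrapers: Iterable[Callable[[], Iterable[Report]]],
) -> List[Report]:
    """Collect reports from pages ordered most granular to least granular.

    The first time a report id is seen wins, so the most granular copy
    of each report is the one that is kept.
    """
    seen_ids = set()
    reports: List[Report] = []
    for scrape_page in page_scrapers:
        for report in scrape_page():
            report_id = report["report_id"]  # hypothetical key name
            if report_id in seen_ids:
                # Already captured from a more granular page; skip it.
                continue
            seen_ids.add(report_id)
            reports.append(report)
    return reports


# Hypothetical stand-ins for real page scrapers.
def detailed_page() -> List[Report]:
    return [
        {"report_id": "A1", "source": "detailed"},
        {"report_id": "B2", "source": "detailed"},
    ]


def summary_page() -> List[Report]:
    return [
        {"report_id": "B2", "source": "summary"},
        {"report_id": "C3", "source": "summary"},
    ]


# Most granular page first, so B2 is kept from the detailed page.
reports = scrape_in_order([detailed_page, summary_page])
```

The ordering of the list passed in is what encodes "most granular first". If two different reports ever shared an id, the later one would be silently dropped, which is why the uniqueness assumption matters.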
I had glanced at this one and thought it looked very tricky, given the spread-out dates. Thanks for all your hard work on it; this looks great, @spulec!