Closed by fgregg 4 years ago
Have you found a changelog anywhere (or does Granicus never even release one?)
I haven't.
It appears that some resources are still only available from the InSite web pages; for example, Audio Links in LA Metro.
However, the API entries always seem to include a link to the InSite URL, so we could get rid of _scrapeWebCalendar and instead do something like this in events:
def events(...):
    ...
    for api_event in self.api_events(since_datetime):
        ...
        web_event = self.web_scraper.event_detail(api_event['EventInSiteURL'])
That is, confidently visit the InSite detail page for an event only when we need to.
This should give us more confidence than the current approach of scraping events two ways and then trying to match them up. It should also make for faster scrapes, as we'll visit fewer pages overall.
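To make the "only when we need to" part concrete, here is a minimal, self-contained sketch of the idea. It is not the scraper's actual code: `pair_events`, `fetch_detail`, `needs_web_detail`, and the `EventAudio` field are hypothetical names used for illustration, and only `EventInSiteURL` comes from the API entries mentioned above.

```python
from typing import Callable, Dict, Iterable, Iterator, Optional, Tuple


def pair_events(
    api_events: Iterable[Dict],
    fetch_detail: Callable[[str], Dict],
    needs_web_detail: Callable[[Dict], bool],
) -> Iterator[Tuple[Dict, Optional[Dict]]]:
    """Yield (api_event, web_event) pairs, visiting InSite pages lazily."""
    for api_event in api_events:
        web_event = None
        url = api_event.get("EventInSiteURL")
        # Only pay for a page visit when the API record is missing something.
        if url and needs_web_detail(api_event):
            web_event = fetch_detail(url)
        yield api_event, web_event


# Stand-in fetcher and data (hypothetical); a real scraper would do an HTTP
# request and parse the InSite detail page here.
visited = []


def fake_fetch(url: str) -> Dict:
    visited.append(url)
    return {"Audio": url + "#audio"}


sample = [
    {"EventId": 1, "EventInSiteURL": "https://example.com/1", "EventAudio": None},
    {"EventId": 2, "EventInSiteURL": "https://example.com/2", "EventAudio": "a.mp3"},
]

# Hypothetical rule: fall back to the web page only when audio is missing.
pairs = list(pair_events(sample, fake_fetch, lambda e: e["EventAudio"] is None))
```

Because each web page is reached through the URL on its own API record, there is no separate matching step, and in this sample only the first event triggers a page visit.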
Done in #93.
The Legistar web API has some new features that may obviate the need for so much scraping of the web pages.
So far, I've noticed that event endpoints have a comment field and a link to the event web url.