In the Metro API, some bills are public and others are private. If the scrape includes a private bill, it breaks: requesting that matter raises an HTTPError (404), as in the traceback below. We should add logic to handle this, perhaps with an opt-in flag that permits private bills to be skipped, so that we don't inadvertently skip them in deployed environments.
Traceback (most recent call last):
  File "/usr/local/bin/pupa", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.6/site-packages/pupa/cli/__main__.py", line 68, in main
    subcommands[args.subcommand].handle(args, other)
  File "/usr/local/lib/python3.6/site-packages/pupa/cli/commands/update.py", line 278, in handle
    return self.do_handle(args, other, juris)
  File "/usr/local/lib/python3.6/site-packages/pupa/cli/commands/update.py", line 327, in do_handle
    report['scrape'] = self.do_scrape(juris, args, scrapers)
  File "/usr/local/lib/python3.6/site-packages/pupa/cli/commands/update.py", line 175, in do_scrape
    report[scraper_name] = scraper.do_scrape(**scrape_args)
  File "/usr/local/lib/python3.6/site-packages/pupa/scrape/base.py", line 114, in do_scrape
    for obj in self.scrape(**kwargs) or []:
  File "/app/lametro/bills.py", line 202, in scrape
    for matter in matters:
  File "/app/lametro/bills.py", line 170, in matters
    yield self.matter(matter_id)
  File "/usr/local/lib/python3.6/site-packages/legistar/bills.py", line 289, in matter
    matter = self.endpoint('/matters/{}', matter_id)
  File "/usr/local/lib/python3.6/site-packages/legistar/bills.py", line 309, in endpoint
    response = self.get(url.format(*args))
  File "/usr/local/lib/python3.6/site-packages/requests/sessions.py", line 546, in get
    return self.request('GET', url, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/scrapelib/__init__.py", line 292, in request
    raise HTTPError(resp)
scrapelib.HTTPError: 404 while retrieving https://webapi.legistar.com/v1/metro/matters/8085
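
A rough sketch of what the skip logic could look like, not the actual scraper code. The names `iter_matters`, `skip_private`, `fake_fetch`, `FakeResponse`, and `FakeHTTPError` are all hypothetical; the fake error class stands in for `scrapelib.HTTPError` (which exposes the response as `.response`) so the example runs without dependencies. It also assumes that a 404 on a matter means the bill is private, which is a guess based on the traceback above.

```python
import logging

logger = logging.getLogger(__name__)


class FakeResponse:
    """Stand-in for an HTTP response; only status_code is used here."""

    def __init__(self, status_code):
        self.status_code = status_code


class FakeHTTPError(Exception):
    """Hypothetical stand-in for scrapelib.HTTPError, carrying .response."""

    def __init__(self, response):
        super().__init__("{} while retrieving matter".format(response.status_code))
        self.response = response


def iter_matters(matter_ids, fetch, skip_private=False, error_cls=FakeHTTPError):
    """Yield fetch(matter_id) for each id.

    With skip_private=True, ids whose fetch raises a 404 are logged and
    skipped. With the default (False), the error propagates, so a deployed
    scrape still fails loudly instead of silently dropping bills.
    """
    for matter_id in matter_ids:
        try:
            matter = fetch(matter_id)
        except error_cls as e:
            # Assumption: a 404 on a matter means the bill is private.
            if skip_private and e.response.status_code == 404:
                logger.warning("Skipping matter %s: 404, possibly private", matter_id)
                continue
            raise
        yield matter


def fake_fetch(matter_id):
    """Pretend API call: matter 8085 is treated as private and returns a 404."""
    if matter_id == 8085:
        raise FakeHTTPError(FakeResponse(404))
    return {"MatterId": matter_id}


if __name__ == "__main__":
    for matter in iter_matters([8084, 8085, 8086], fake_fetch, skip_private=True):
        print(matter)
```

Keeping the flag off by default means nothing is skipped unless someone asks for it explicitly, e.g. when scraping locally. In the real scraper, `error_cls` would presumably be `scrapelib.HTTPError` and `fetch` the existing `self.matter` call.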