unitedstates / inspectors-general

Collecting reports from Inspectors General across the US federal government.
https://sunlightfoundation.com/blog/2014/11/07/opengov-voices-opening-up-government-reports-through-teamwork-and-open-data/
Creative Commons Zero v1.0 Universal

[sba] Handle landing page errors more gracefully, properly respect year range #174

Closed konklone closed 9 years ago

konklone commented 9 years ago

The sba scraper wasn't obeying the year range if the published_on timestamp wasn't found early on. The scraper has a way to hardcode publication dates or find them through other means, but by the time it got there, it no longer bothered respecting the year_range. I've fixed that, which will make the scraper more efficient for regular runs.
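
For context, a minimal sketch of the ordering issue, using hypothetical helper names (`date_from_listing`, `date_from_fallback`) rather than the scraper's actual functions:

```python
import logging

# Hypothetical helpers standing in for the scraper's real date-extraction paths;
# each returns a datetime.date or None.
def date_from_listing(result):
    """Try to read published_on from the search-result listing; may fail."""
    return result.get("published_on")

def date_from_fallback(result):
    """Fall back to a hardcoded date or one scraped from the landing page."""
    return result.get("fallback_date")

def report_from(result, year_range):
    published_on = date_from_listing(result) or date_from_fallback(result)

    # The fix: apply the year_range check *after* the fallback lookup, so reports
    # whose dates only turn up later in the function are still filtered by year.
    if published_on.year not in year_range:
        logging.debug("Skipping report, not in the requested year range.")
        return None

    return {"published_on": published_on.strftime("%Y-%m-%d")}
```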

I'm also getting an error when fetching a particular landing page -

```
Traceback (most recent call last):
  File "inspectors/utils/utils.py", line 27, in run
    return run_method(cli_options)
  File "inspectors/sba.py", line 68, in run
    report = report_from(result, year_range)
  File "inspectors/sba.py", line 117, in report_from
    landing_page = BeautifulSoup(landing_body)
  File "/home/unitedstates/.virtualenvs/inspectors/lib/python3.4/site-packages/bs4/__init__.py", line 162, in __init__
    elif len(markup) <= 256:
TypeError: object of type 'NoneType' has no len()
```

That's from fetching this page, which gets linked at the date-less entry in this screenshot. There are no permalinks -- I found this by searching for the keyword "originating" and looking at the bottom of page 4 of results.

(screenshot: okay-bad-sba)

The new behavior throws a proper exception, but I'm not sure how to handle this. The SBA site is throwing a 500, so I'll report it to the IG. But I don't want the scraper to just skip it. I'll punt on that for a bit, after notifying SBA.
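
Roughly, the guard I mean looks like this (a sketch only, not the exact diff; the exception class name is made up for illustration):

```python
from bs4 import BeautifulSoup

class MissingLandingPageError(Exception):
    """Illustrative exception for a landing page that couldn't be downloaded."""

def parse_landing_page(landing_body, landing_url):
    # If the download failed (e.g. the SBA server answered with a 500), the body
    # comes back as None; fail loudly instead of letting BeautifulSoup choke on it.
    if landing_body is None:
        raise MissingLandingPageError("Couldn't fetch landing page: %s" % landing_url)
    return BeautifulSoup(landing_body)
```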

divergentdave commented 9 years ago

LGTM :+1: Regarding the broken link, it looks like the single quote marks in the URL are causing problems. If I take them out, I get a regular 404. Google cache has a copy, so this is a recent problem. My guess is they just added a web application firewall. :smirk:
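
If we wanted a workaround on our end in the meantime, percent-encoding the quotes before fetching would be one option (sketch with a made-up URL; no guarantee the SBA server accepts the encoded form):

```python
from urllib.parse import quote

# Made-up example URL containing the kind of single quote that trips the server.
url = "https://www.sba.gov/oig/audit-report-on-the-agency's-loan-program"

# Percent-encode anything unsafe (the apostrophe becomes %27) while leaving the
# scheme separator and path slashes alone.
safe_url = quote(url, safe=":/")
print(safe_url)
```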