Closed LindsayYoung closed 10 years ago
I am getting another robots-related error on the exim scraper.
Traceback (most recent call last):
  File "inspectors/utils/utils.py", line 24, in run
    run_method(cli_options)
  File "inspectors/exim.py", line 15, in run
    body = utils.download(page_url)
  File "inspectors/utils/utils.py", line 84, in download
    response = scraper.urlopen(url)
  File "/projects/congress-api/.virtualenvs/inspectors/lib/python3.4/site-packages/scrapelib/__init__.py", line 390, in urlopen
    resp = self.request(method, url, data=body, retry_on_404=retry_on_404, **kwargs)
  File "/projects/congress-api/.virtualenvs/inspectors/lib/python3.4/site-packages/scrapelib/__init__.py", line 369, in request
    headers=headers, **kwargs)
  File "/projects/congress-api/.virtualenvs/inspectors/lib/python3.4/site-packages/scrapelib/__init__.py", line 173, in request
    user_agent, url), url, user_agent)
scrapelib.RobotExclusionError: User-Agent 'unitedstates/inspectors-general (https://github.com/unitedstates/inspectors-general)' not allowed at 'http://www.exim.gov/oig/index.cfm'
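For anyone unsure what this error means: the message says the site's robots.txt disallows our User-Agent at that URL. Below is a rough sketch of that kind of check using Python's standard urllib.robotparser rather than scrapelib's own code. The robots.txt rules and the is_allowed helper are made up for illustration; they are not the actual contents of exim.gov's robots.txt or anything in this repo.

```python
from urllib.robotparser import RobotFileParser

# User-Agent string copied from the error message above.
USER_AGENT = (
    "unitedstates/inspectors-general "
    "(https://github.com/unitedstates/inspectors-general)"
)

# Hypothetical rules for illustration only, NOT the real exim.gov robots.txt.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /oig/
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "http://www.exim.gov/oig/index.cfm",
        "http://www.exim.gov/about/index.cfm",
    ):
        print(url, is_allowed(SAMPLE_ROBOTS_TXT, USER_AGENT, url))
```

With rules like these, anything under /oig/ is refused for every crawler, which is the situation the traceback describes.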
Run pip install -r requirements.txt. I updated the version of scrapelib in https://github.com/unitedstates/inspectors-general/commit/31ca91df8b59721b8fea6235f03386b72a93158a#commitcomment-7143210, but it's actually a backwards-incompatible upgrade. If that doesn't fix it, please re-open and we'll figure it out.
And to be clear, it should upgrade scrapelib from 0.9.x to 0.10.x.
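If it helps to confirm the upgrade took effect, pip show scrapelib prints the installed version. Here is a small sketch of comparing that version string against 0.10; parse_version and is_upgraded are hypothetical helpers written for this comment, not part of scrapelib or this repo. Comparing numeric tuples matters because a plain string comparison would sort "0.10" before "0.9".

```python
import re


def parse_version(version):
    """Turn a string like '0.10.1' into a tuple of ints like (0, 10, 1).

    Parsing stops at the first component that does not start with a digit.
    """
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def is_upgraded(version, minimum=(0, 10)):
    """Return True if version is at least the given minimum (default 0.10)."""
    return parse_version(version) >= minimum


if __name__ == "__main__":
    # Example version strings only; substitute the output of `pip show scrapelib`.
    for candidate in ("0.9.1", "0.10.0"):
        print(candidate, is_upgraded(candidate))
```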
I am getting an error.