openstates / openstates-scrapers

source for Open States scrapers
https://openstates.org
GNU General Public License v3.0

MO failing since at least 2017-12-01 #1984

Closed openstates-bot closed 6 years ago

openstates-bot commented 6 years ago

MO has been failing since 2017-12-01

Based on automated runs it appears that MO has not run successfully in 2 days (2017-12-01).

00:00:17 INFO pupa: save post 150 as post_124bbf2e-d726-11e7-9762-0242ac110006.json
00:00:17 INFO pupa: save post 151 as post_124bc046-d726-11e7-9762-0242ac110006.json
00:00:17 INFO pupa: save post 152 as post_124bc19a-d726-11e7-9762-0242ac110006.json
00:00:17 INFO pupa: save post 153 as post_124bc2ee-d726-11e7-9762-0242ac110006.json
00:00:17 INFO pupa: save post 154 as post_124bc44c-d726-11e7-9762-0242ac110006.json
00:00:17 INFO pupa: save post 156 as post_124bc686-d726-11e7-9762-0242ac110006.json
00:00:17 INFO pupa: save post 155 as post_124bc56e-d726-11e7-9762-0242ac110006.json
00:00:17 INFO pupa: save post 157 as post_124bc82a-d726-11e7-9762-0242ac110006.json
00:00:17 INFO pupa: save post 158 as post_124bc942-d726-11e7-9762-0242ac110006.json
00:00:17 INFO pupa: save post 159 as post_124bca82-d726-11e7-9762-0242ac110006.json
00:00:17 INFO pupa: save post 160 as post_124bcbae-d726-11e7-9762-0242ac110006.json
00:00:17 INFO pupa: save post 161 as post_124bccc6-d726-11e7-9762-0242ac110006.json
00:00:17 INFO pupa: save post 162 as post_124bcde8-d726-11e7-9762-0242ac110006.json
00:00:17 INFO pupa: save post 163 as post_124bcf3c-d726-11e7-9762-0242ac110006.json
00:00:17 INFO pupa: save organization Republican as organization_1270a438-d726-11e7-9762-0242ac110006.json
00:00:17 INFO pupa: save organization Democratic as organization_1270e920-d726-11e7-9762-0242ac110006.json
00:00:17 INFO pupa: Collecting subject tags from upper house.
00:00:17 INFO scrapelib: GET - http://www.senate.mo.gov/18info/BTS_Web/Keywords.aspx?SessionType=%s
00:00:17 INFO pupa: Collecting subject tags from lower house.
00:00:17 INFO scrapelib: GET - http://house.mo.gov/subjectindexlist.aspx?year=2018
mo (scrape, import)
no pupa_settings on path, using defaults
  bills: {}
  people: {}
  votes: {}
  committees: {}
Traceback (most recent call last):
  File "/opt/openstates/venv-pupa//bin/pupa", line 11, in <module>
    load_entry_point('pupa', 'console_scripts', 'pupa')()
  File "/opt/openstates/venv-pupa/src/pupa/pupa/cli/__main__.py", line 67, in main
    subcommands[args.subcommand].handle(args, other)
  File "/opt/openstates/venv-pupa/src/pupa/pupa/cli/commands/update.py", line 260, in handle
    return self.do_handle(args, other, juris)
  File "/opt/openstates/venv-pupa/src/pupa/pupa/cli/commands/update.py", line 305, in do_handle
    report['scrape'] = self.do_scrape(juris, args, scrapers)
  File "/opt/openstates/venv-pupa/src/pupa/pupa/cli/commands/update.py", line 172, in do_scrape
    fastmode=args.fastmode)
  File "/opt/openstates/openstates/openstates/mo/bills.py", line 35, in __init__
    self._scrape_subjects(self.latest_session())
  File "/opt/openstates/openstates/openstates/mo/bills.py", line 103, in _scrape_subjects
    self._scrape_house_subjects(session)
  File "/opt/openstates/openstates/openstates/mo/bills.py", line 271, in _scrape_house_subjects
    subject_page = self.lxmlize(subject_list_url)
  File "/opt/openstates/openstates/openstates/utils/lxmlize.py", line 24, in lxmlize
    response = self.get(url)
  File "/opt/openstates/venv-pupa/lib/python3.5/site-packages/requests/sessions.py", line 521, in get
    return self.request('GET', url, **kwargs)
  File "/opt/openstates/venv-pupa/lib/python3.5/site-packages/scrapelib/__init__.py", line 292, in request
    raise HTTPError(resp)
scrapelib.HTTPError: 404 while retrieving http://house.mo.gov/subjectindexlist.aspx?year=2018

Visit http://bobsled.openstates.org for more info.

agelimson commented 6 years ago

Will send a fix for that tomorrow

agelimson commented 6 years ago

The 2018 session is still not there, but on my computer everything seems to be running ok. Any idea why?

estaub commented 6 years ago

@agelimson when I try to browse to http://house.mo.gov/subjectindexlist.aspx?year=2018 , the URL shown as failing, I also get a 404 error.
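For context, the run aborts because the subject index page for 2018 does not exist yet, and the 404 propagates out of the scraper's constructor. A minimal sketch of one way to tolerate a missing page, assuming a hypothetical `fetch` callable and a hypothetical `NotFound` exception (neither is the actual scrapelib or openstates API):

```python
class NotFound(Exception):
    """Hypothetical stand-in for an HTTP 404 raised by a fetcher."""


def house_subject_url(year):
    # URL pattern copied from the failing request in the log above.
    return "http://house.mo.gov/subjectindexlist.aspx?year={}".format(year)


def scrape_house_subjects(fetch, year):
    """Return the subject page body, or None if the page is not published yet.

    `fetch` is a hypothetical callable that takes a URL and returns the page
    body, raising NotFound on a 404.
    """
    url = house_subject_url(year)
    try:
        return fetch(url)
    except NotFound:
        # Skip subject tagging for this run instead of failing the whole scrape.
        return None


def fake_fetch(url):
    # Simulated site for the demo: only the 2017 index exists.
    if url.endswith("year=2017"):
        return "<html>subjects</html>"
    raise NotFound(url)


print(scrape_house_subjects(fake_fetch, 2017))
print(scrape_house_subjects(fake_fetch, 2018))
```

Whether skipping subjects is acceptable depends on how the rest of the bill scraper uses them; this only illustrates the fallback.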

agelimson commented 6 years ago

I’ve already fixed that in the latest version, though (which I think has already been incorporated).



estaub commented 6 years ago

@agelimson Got it! Go to http://bobsled.openstates.org and click on the latest MO fail, which gets you to http://bobsled.openstates.org/run-MO-2017-12-07.html. You'll see that it's now failing for a different reason: multiple people with same name "Mike Cierpiot" in Jurisdiction - must provide birth_date to disambiguate

agelimson commented 6 years ago

Gah :(. It's actually the same person. He used to be a Rep; now he's a Senator.
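One possible direction, sketched under assumptions: if scraped legislators were plain dicts with `name`, `chamber`, and `district` keys (a hypothetical layout, not pupa's actual Person model), records confirmed to be the same person could be merged into a single entry that carries both roles. The district values below are placeholders, not real data.

```python
def merge_same_person(records, same_person_names):
    """Collapse records for names known to be one person into a single entry.

    `records` is a list of dicts with hypothetical keys: name, chamber, district.
    `same_person_names` is a hand-maintained set of names confirmed to belong
    to one person who changed seats.
    """
    merged = {}
    result = []
    for rec in records:
        role = {"chamber": rec["chamber"], "district": rec["district"]}
        name = rec["name"]
        if name not in same_person_names:
            # Ordinary case: one record, one role.
            result.append({"name": name, "roles": [role]})
            continue
        if name not in merged:
            merged[name] = {"name": name, "roles": []}
            result.append(merged[name])
        merged[name]["roles"].append(role)
    return result


records = [
    {"name": "Mike Cierpiot", "chamber": "lower", "district": "placeholder-house"},
    {"name": "Mike Cierpiot", "chamber": "upper", "district": "placeholder-senate"},
    {"name": "Example Member", "chamber": "lower", "district": "placeholder"},
]
people = merge_same_person(records, {"Mike Cierpiot"})
print(len(people))
```

The explicit allow-list keeps two genuinely different people who happen to share a name from being merged by accident.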



estaub commented 6 years ago

@agelimson Making any headway? I'm not sure I can do any more/better, but if you want me to look at it, holler.

agelimson commented 6 years ago

Sorry, I unfortunately won't be able to get to it until the new year :(



estaub commented 6 years ago

Has anyone dealt (under pupa) with a representative becoming a senator within a session before? Here's an excerpt from bobsled:

import jurisdictions...
import organizations...
  bills: {}
Traceback (most recent call last):
  File "/opt/openstates/venv-pupa//bin/pupa", line 11, in <module>
    load_entry_point('pupa', 'console_scripts', 'pupa')()
  File "/opt/openstates/venv-pupa/src/pupa/pupa/cli/__main__.py", line 67, in main
    subcommands[args.subcommand].handle(args, other)
  File "/opt/openstates/venv-pupa/src/pupa/pupa/cli/commands/update.py", line 260, in handle
    return self.do_handle(args, other, juris)
  File "/opt/openstates/venv-pupa/src/pupa/pupa/cli/commands/update.py", line 307, in do_handle
    report['import'] = self.do_import(juris, args)
  File "/opt/openstates/venv-pupa/src/pupa/pupa/cli/commands/update.py", line 207, in do_import
    report.update(person_importer.import_directory(datadir))
  File "/opt/openstates/venv-pupa/src/pupa/pupa/importers/base.py", line 190, in import_directory
    return self.import_data(json_stream())
  File "/opt/openstates/venv-pupa/src/pupa/pupa/importers/base.py", line 226, in import_data
    for json_id, data in self._prepare_imports(data_items):
  File "/opt/openstates/venv-pupa/src/pupa/pupa/importers/people.py", line 33, in _prepare_imports
    raise SameNameError(name)
pupa.exceptions.SameNameError: multiple people with same name "Mike Cierpiot" in Jurisdiction - must provide birth_date to disambiguate

Also, given that this isn't the original problem, should it be split off into a separate issue and this one closed?
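Going by the error message, the import rejects a batch when two person records share a name and have no birth_date to tell them apart. A rough sketch of that kind of check, which could be run over scraped output before import (this is not pupa's actual code; `check_unique_names` and `DuplicateNameError` are hypothetical names, and the dict layout is assumed):

```python
from collections import defaultdict


class DuplicateNameError(Exception):
    """Hypothetical stand-in for the importer's same-name error."""


def check_unique_names(people):
    """Raise if two records share a name and any of them lacks a birth_date."""
    by_name = defaultdict(list)
    for person in people:
        by_name[person["name"]].append(person)
    for name, group in by_name.items():
        if len(group) > 1 and any(not p.get("birth_date") for p in group):
            raise DuplicateNameError(name)


people = [
    {"name": "Mike Cierpiot", "chamber": "lower", "birth_date": ""},
    {"name": "Mike Cierpiot", "chamber": "upper", "birth_date": ""},
]
try:
    check_unique_names(people)
except DuplicateNameError as err:
    print("duplicate:", err)
```

A check like this only surfaces the collision earlier; it does not decide whether the two records are one person or two.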