Right now the VA events scraper successfully scrapes events for both the House and Senate in Virginia. However, the implementation for the Senate (upper chamber, `scrape_upper()`) is not very robust.
The goal here is to improve the logic to add:
- Better handling of the `location` field. It looks like locations for events can generally be parsed out of the "Description" column on the primary Senate event source page. The location generally seems to be everything after the semicolon.
- Processing of additional pages, so that agenda items can be added to the event (`event.add_agenda_item()`) and bills can be added to each agenda item (`agenda.add_bill()`). Many of the items in the above-linked table include a link to "committee info" or "sub-committee info", and at least some of the pages at those URLs have links to an agenda or "docket" that describes which bills will be discussed by the committee on that day.
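
As a rough starting point, the two parsing steps could look something like the sketch below. Everything in it is an assumption for illustration: the helper names `split_description` and `find_bill_ids` are hypothetical, the fallback location string is a placeholder, the sample room name is made up, and the bill-identifier pattern (HB, SB, HJ, SJ, HR, SR followed by a number) is a guess at how bills appear in docket text, not something verified against the actual pages.

```python
import re


def split_description(description, default_location="See description"):
    """Split a "Description" cell into (event name, location).

    Assumes the location is everything after the first semicolon.
    Falls back to a placeholder when no semicolon (or no text after it) exists.
    """
    name, sep, location = description.partition(";")
    name = name.strip()
    location = location.strip()
    if not sep or not location:
        return name, default_location
    return name, location


# Assumed pattern for bill identifiers such as "SB 123" or "HJ45".
BILL_RE = re.compile(r"\b([HS][BJR])\s*(\d+)\b")


def find_bill_ids(docket_text):
    """Return the unique bill identifiers found in docket text, in order."""
    bill_ids = []
    for prefix, number in BILL_RE.findall(docket_text):
        bill_id = f"{prefix} {number}"
        if bill_id not in bill_ids:
            bill_ids.append(bill_id)
    return bill_ids


# Made-up sample inputs, only to show the expected shapes.
name, location = split_description("Finance and Appropriations; Senate Room A")
bills = find_bill_ids("Docket: SB 12, HB1001 and SB 12 (substitute)")
```

Inside `scrape_upper()`, the parsed location would presumably be used when the event is built, and each identifier from `find_bill_ids` would be passed to `agenda.add_bill()` on the item returned by `event.add_agenda_item()`.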
Here's an example: