City-Bureau / city-scrapers

Scrape, standardize and share public meetings from local government websites
https://cityscrapers.org
MIT License

Spiders in need of new descriptions (descriptions included) #275

Closed diaholliday closed 6 years ago

diaholliday commented 6 years ago

This would be a great first issue for non-coders—see Rebecca's video tutorial for instructions: https://www.youtube.com/watch?v=m_MjzgvVZ28

[Note: When creating your pull request with an updated description, please tag with the Label "Description" and reference Issue #275 in your pull request]

The following spiders need description edits. I've included new description text for each spider below:

Cook County Government - JAC Council Meeting (JAC): (@natashamathur is working) The Cook County Justice Advisory Council (JAC) is charged with the coordination and implementation of the Cook County President’s criminal and juvenile justice reform efforts and public safety policy development.

(Also change display name to Cook County Government - Justice Advisory Council (JAC))

Cook County Board of Commissioners: County Commissioners are elected officials who oversee county activities and work to ensure that citizen concerns are met, federal and state requirements are fulfilled, and county operations run smoothly. The Cook County Board of Commissioners is the governing board and legislative body of the county. It comprises 17 Commissioners, each elected from a single-member district to a four-year term. Each district represents approximately 300,000 residents.

Chicago City Council Committee on Housing and Real Estate: The City Council Committee on Housing and Real Estate hears proposed changes to the City of Chicago’s housing policies. Additionally, the committee hears real estate transactions between the City and other government entities, corporations and institutions.

Public Building Commission of Chicago: The Public Building Commission of Chicago oversees and helps ensure quality facilities. Since its inception, the PBC has aimed to enhance education, safety and recreation across Chicago by building or renovating hundreds of schools, city colleges, libraries, parks, fire houses, police stations and other facilities.

Chicago Transit Authority (Replace current text with the following): The governing arm of the CTA is the Chicago Transit Board. The Board consists of seven members, with four appointed by the Mayor of Chicago and three appointed by the Governor of Illinois. The Mayor's appointees are subject to the approval of the Governor and the Chicago City Council; the Governor's appointees are subject to the approval of the Mayor and the Illinois State Senate. The Board's separate committees include Human Resources, Strategic Planning, Capital Construction Oversight, and the Finance, Audit & Budget Committee.

Cook County Land Bank Authority: The CCLBA acquires, holds, and transfers interest in real estate properties throughout Cook County to promote redevelopment and reuse of vacant, abandoned, foreclosed or tax-delinquent properties and support targeted efforts to stabilize neighborhoods. It was formed by ordinance of Cook County in 2013 to address the large inventory of vacant residential, industrial and commercial property in Cook County. The CCLBA is the largest land bank by geography in the country and is governed by a Board of Directors appointed by the Cook County Board of Commissioners.

Chicago Public Library: The CPL Board of Directors oversees 80 library branches within the CPL system and upholds the CPL mission to "welcome and support all people in their enjoyment of reading and pursuit of lifelong learning" and to "strive to provide equal access to information, ideas and knowledge through books, programs and other resources." The CPL board does not meet in February, July and August.

Illinois Department of Public Health: The IDPH is one of the state's oldest agencies. It currently has headquarters in Springfield and Chicago, seven regional offices around the state, three laboratories and 1,100 employees. The stated mission of the IDPH is to protect the health and wellness of the people of Illinois through prevention, health promotion, regulation, and the control of disease and injury.

rebecca-burwei commented 6 years ago

:+1: @diaholliday just to clarify: do you want to replace the descriptions that are currently being scraped with these new ones, or just add this text to the current descriptions?

diaholliday commented 6 years ago

I'd like to replace all the current descriptions with the ones listed above.

rebecca-burwei commented 6 years ago

@ckwms63 did the description for the Cook County Health and Hospitals System :)

diaholliday commented 6 years ago

Thanks @ckwms63! Let me know if you're interested in taking on any more of the description edits as well.

ckwms63 commented 6 years ago

You're welcome. I am interested in taking on more description edits. I plan to do it this Friday.

Best,

rebecca-burwei commented 6 years ago

Hey @ckwms63, can you let us know which ones you're working on, so we avoid multiple people working on the same thing?

ckwms63 commented 6 years ago

Hi Rebecca,

Thank you for the email. I plan to work on the description edit tonight or tomorrow. I will let you know which one I will work on. Or better yet, you can recommend one for me to make sure that nobody else is working on it. I will also write my own notes summarizing your video so I can follow the process from beginning to end.

Best,

rebecca-burwei commented 6 years ago

Hey @ckwms63 , why don't you try Chicago Public Schools Board of Education :)

ckwms63 commented 6 years ago

Ok, will do.

o-stovicek commented 6 years ago

I edited the description for the Illinois Labor Relations Board! (Pending my pull request being accepted of course)

cande313 commented 6 years ago

I am going to try Chicago City Council.

wildisle commented 6 years ago

Just took care of Chicago Police!

novellac commented 6 years ago

I'll look at City Colleges of Chicago.

o-stovicek commented 6 years ago

The Regional Transportation Authority spider is actually pulling a description from the website, but it's cut off: "The RTA Board of Directors typically meets each month on a Thursday at 175 W. Jackson Blvd, Suite 1650 in Chicago. Board committee meetings typically begin at 8:30 a.m. Agendas are posted at least 48 hours prior to the meetings. All RTA Board meetings are audio taped. Recording of meetings starting December 2014 are available on the" If we were to use that description we'd want it to pull the whole thing, but maybe we could combine the static "What is the RTA?" description that Darryl provided with the pulled description, if people think it would be helpful to have both. I think it's somewhat useful, but it depends on what we want out of the descriptions. (Thanks Bonnie for the help figuring out what the spider was doing!)

bonfirefan commented 6 years ago

Tagging @novellac @wildisle @o-stovicek @ckwms63 @r-wei. Wanted to note that in order for @diaholliday to keep track of which spiders have had updated descriptions, we'll tag any issues or pull requests with the Label "Description" and reference Issue #275, as @novellac has already done.

diaholliday commented 6 years ago

@o-stovicek the RTA spider doesn't need the info that's currently in the description. The one I provided above would be best (without anything from the old description).

o-stovicek commented 6 years ago

It's also not clear to me which spider is the Cook County Government - JAC Council Meeting (JAC) one... there's a Cook County Board of Commissioners one and a general Cook County Government one that appears to scrape lots of different meetings. Would we need to do something fancier to specifically provide that description for the JAC meetings within the Cook County Gov meetings? (It doesn't seem like we would want to change the display name if those meetings are just a subset and I'm not missing a Cook-County-JAC-specific spider.)

o-stovicek commented 6 years ago

@diaholliday Okay, good to know! Can't do that right now but I can get back to it later.

diaholliday commented 6 years ago

@o-stovicek yep, you're right. The JAC meetings are likely part of the Cook County Government spider. I'm not sure if we can attach descriptions to particular meetings (and not others) within a spider. @r-wei @bonfirefan or others in the group might know.

Thanks for pointing this out!

bonfirefan commented 6 years ago

Is JAC the only one left of these descriptions, @diaholliday? We can write in a conditional to attach the JAC description to specific Cook County Government spider items.
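A minimal sketch of what such a conditional could look like, assuming each scraped item is a dict with `name` and `description` keys and that JAC meetings can be recognized from the meeting name. The function name, the item fields and the matching rule are all hypothetical, not the project's actual schema:

```python
# Hypothetical sketch: attach the JAC description only to JAC meetings
# scraped by the general Cook County Government spider.
JAC_DESCRIPTION = (
    "The Cook County Justice Advisory Council (JAC) is charged with the "
    "coordination and implementation of the Cook County President's criminal "
    "and juvenile justice reform efforts and public safety policy development."
)


def attach_jac_description(item):
    """Return a copy with the JAC description for JAC meetings;
    return any other item unchanged."""
    name = item.get("name", "")
    # Assumed matching rule: the meeting name mentions the council.
    is_jac = "justice advisory council" in name.lower() or "JAC" in name.split()
    if not is_jac:
        return item
    updated = dict(item)
    updated["description"] = JAC_DESCRIPTION
    return updated
```

Items from other Cook County Government meetings pass through untouched, so the display name of the spider as a whole would not need to change.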

natashamathur commented 6 years ago

I'm working on the JAC descriptions!

diaholliday commented 6 years ago

I added the Cook County Board of Commissioners to this list (with new description).

diaholliday commented 6 years ago

I added two new descriptions:

  1. Chicago City Council Committee on Housing and Real Estate
  2. Public Building Commission of Chicago

diaholliday commented 6 years ago

I added four new descriptions:

  1. Chicago Transit Authority
  2. Cook County Land Bank Authority
  3. Chicago Public Library
  4. Illinois Department of Public Health

bonfirefan commented 6 years ago

@o-stovicek see if you can take 2. Cook County Land Bank Authority

bonfirefan commented 6 years ago

@diaholliday, @pjsier and I were discussing the descriptions portion, and he noted that the CTA spider, for example, scrapes event-specific description information, as do the il_pubhealth spider and likely others.

For example:

Description: 3 day seminar from 8:00 am to 4:30 pm each day Crowne Plaza Hotels and Resorts 3000 North Dirksen Parkway, Springfield, IL Interested persons may contact the Division of Life Safety and Construction at 217-785-4247

Should we be careful about overwriting event descriptions with agency descriptions, and instead assign a new agency description field for each spider in the item pipeline? The information seems fairly important - when available, that is.

diaholliday commented 6 years ago

Yes, that's a good question, I've also gone back and forth on this. On one hand, the event-specific descriptions are valuable, and it would be ideal if we had a rewritten description for each specific event (when the original description isn't comprehensive enough). On the other hand, some of those smaller events within larger spiders aren't useful in the first place. Impactful decisions aren't going to be made at the "3 day seminar from 8:00 am to 4:30 pm", so we would never send a Documenter there.

Which brings me to the real question: should the aggregator collect everything possible from all of these orgs or should it collect things based on certain criteria, i.e. a committee/board meeting, a meeting where governmental decisions are made, meetings in which there will be public comment periods, etc. Those are the meetings that are most important to the most people, but they're often lumped in with agency meetings that are non-specific, niche or just not entirely useful for the broader public.

Any thoughts on this? I haven't come to a conclusion so I'd be happy to bounce the idea around a bit. Especially if there are tech needs or complications to consider.

bonfirefan commented 6 years ago

So actually this would be a good argument for keeping the event description, so it's possible to discern which of the events are less important, and for adding a separate field in the pipeline for agency descriptions. What we're thinking is actually adding in the agency description as a property variable, the same way we define the spider agency "long name." For example:

class ChiTransitSpider(Spider):
    name = 'chi_transit'
    long_name = 'Chicago Transit Authority'
    allowed_domains = ['www.transitchicago.com']
    base_url = 'http://www.transitchicago.com'
    start_urls = ['https://www.transitchicago.com/board/notices-agendas-minutes/']
    agency_description = "The governing arm of the CTA is the Chicago Transit Board. The Board consists of seven members, with four appointed by the Mayor of Chicago and three appointed by the Governor of Illinois. The Mayor's appointees are subject to the approval of the Governor and the Chicago City Council; the Governor's appointees are subject to the approval of the Mayor and the Illinois State Senate. The Board's separate committees include Human Resources, Strategic Planning, Capital Construction Oversight, and the Finance, Audit & Budget Committee."
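To illustrate how the separate field could reach each item, here is a rough sketch of an item pipeline that copies the spider's `agency_description` onto every item and leaves the scraped event description alone. `process_item(item, spider)` is the method Scrapy calls on item pipelines; the class name, the `agency_description` item key and the sample values are assumptions made for this example, not the project's code:

```python
from types import SimpleNamespace


class AgencyDescriptionPipeline:
    """Hypothetical item pipeline: copies the spider's agency description
    onto each item without touching the event-level description."""

    def process_item(self, item, spider):
        # Spiders that do not define agency_description get an empty string.
        item["agency_description"] = getattr(spider, "agency_description", "")
        return item


# Stand-in for a spider instance; it only carries the attribute the pipeline reads.
spider = SimpleNamespace(
    agency_description="The governing arm of the CTA is the Chicago Transit Board."
)
# Made-up item for illustration.
item = {"name": "Board Meeting", "description": "Regular monthly meeting"}
result = AgencyDescriptionPipeline().process_item(item, spider)
```

With this shape the event-specific text stays in `description`, so reviewers can still use it to judge which events matter.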

As to deciding whether an event should be kept, will the classifications #319 help the filtering process? It seems like we want to err on the side of keeping potentially useful events but make it easier in review to take out the non-essential meetings.

diaholliday commented 6 years ago

@bonfirefan this all sounds spot on. I trust your judgement if y’all think this is the best technical route. We do want to keep event-specific descriptions that may be useful while creating an agency-wide description for each spider but we definitely want to be able to remove non-essential meetings. And yes, I think the new classifications doc could help in this process, especially in terms of sorting out and displaying board/committee meetings, etc. We may need to add a classification that is something like “non-essential” or something similar to indicate that it doesn’t meet our display-worthy criteria.

diaholliday commented 6 years ago

Actually it looks like the “forum” classification will cover the majority of these non-governance meetings that are part of governance agencies, i.e. not display worthy but good to collect anyway.
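As a rough sketch of the "collect everything, display selectively" idea, assuming each item carries a `classification` string and that "forum" is the only classification kept out of the display. The field name, the function name and the sample values are hypothetical:

```python
# Hypothetical: classifications that are collected but not displayed.
NON_DISPLAY_CLASSIFICATIONS = {"forum"}


def split_for_display(items):
    """Split items into (displayed, collected_only) by classification."""
    displayed, collected_only = [], []
    for item in items:
        classification = (item.get("classification") or "").lower()
        if classification in NON_DISPLAY_CLASSIFICATIONS:
            collected_only.append(item)
        else:
            displayed.append(item)
    return displayed, collected_only


# Made-up items for illustration.
meetings = [
    {"name": "Chicago Transit Board", "classification": "Board"},
    {"name": "3 day seminar", "classification": "Forum"},
]
displayed, collected_only = split_for_display(meetings)
```

Nothing is dropped in this sketch: the non-displayed items remain available in `collected_only` for later review.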

rebecca-burwei commented 6 years ago

Closing this in favor of #380