City-Bureau / city-scrapers

Scrape, standardize and share public meetings from local government websites
https://cityscrapers.org
MIT License

Add wayne_land_bank spider #885

Closed radoslawkrolikowski closed 5 years ago

radoslawkrolikowski commented 5 years ago

This pull request adds the wayne_land_bank spider, its test file, and the target web page saved as an HTML file.

radoslawkrolikowski commented 5 years ago

All the changes you suggested have been made. My only question is about the location item. I used the default location name and address because they look clearer than the ones on the website. Now that the _validate_location method has been added, maybe it would be good to use the scraped location if the default isn't found:

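loc = re.findall(r'The Board of Directors holds meetings at (.*?)(?=\n)', self.response_data[0])[0]
# Split once on the standalone word "at" so an "at" inside a word is not treated as the separator
name, address = loc.split(' at ', 1)
print('Name: {}, Address: {}'.format(name.title(), address))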

Or is raising an error, as it does right now, the better option?
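
For illustration, the fallback idea above could be sketched as a standalone function. This is only a sketch under stated assumptions: `parse_location`, `DEFAULT_LOCATION`, and the example name and address are hypothetical placeholders, not values from the actual spider, and the page text is assumed to follow a "name at address" pattern.

```python
import re

# Hypothetical defaults; the real spider's name and address are not shown in this thread.
DEFAULT_LOCATION = {"name": "Example Hall", "address": "123 Example St, Detroit, MI"}

LOCATION_RE = re.compile(r"The Board of Directors holds meetings at (.*?)(?=\n)")


def parse_location(text, default=DEFAULT_LOCATION):
    """Return the default location if its address appears in the text,
    otherwise fall back to the location scraped from the text."""
    if default["address"] in text:
        return default
    match = LOCATION_RE.search(text)
    if match is None:
        raise ValueError("Meeting location not found")
    # Split once on the standalone word "at" to separate name from address
    name, sep, address = match.group(1).partition(" at ")
    if not sep:
        raise ValueError("Could not split location into name and address")
    return {"name": name.strip().title(), "address": address.strip()}
```

With this shape, a changed location is still returned as long as the sentence pattern holds, and an error is raised only when the text cannot be parsed at all.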

pjsier commented 5 years ago

@radoslawkrolikowski thanks for these changes! For now, since the markup is so irregular, I think it's safest to throw the error if the location isn't found. Reviewing now, but I think these are good to go.
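
A minimal sketch of the stricter behavior described above, where the spider fails loudly instead of guessing at a new location. The function name `validate_location` and the `EXPECTED_ADDRESS` value are hypothetical; the spider's actual `_validate_location` method and address are not shown in this thread.

```python
# Hypothetical expected address; the spider's actual value is not shown in this thread.
EXPECTED_ADDRESS = "123 Example St"


def validate_location(text, expected=EXPECTED_ADDRESS):
    """Raise if the expected address no longer appears in the page text."""
    if expected not in text:
        raise ValueError("Meeting location has changed")
```

Raising here means a change on the website surfaces as a failed scrape that a maintainer can review, rather than as meetings published with a possibly mis-parsed address.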