City-Bureau / city-scrapers-atl

City Scrapers for Atlanta
MIT License

dekalb county boc spider #139

Closed (rhetr closed this 2 years ago)

rhetr commented 2 years ago

Summary

Issue: #17

Checklist

rhetr commented 2 years ago

spider.parse yields requests rather than meetings, so the tests need a few extra steps before they can yield meetings. First, create a folder for the additional request fixtures:

mkdir tests/files/dekalb_county_boc_requests

Then print the URL of each request that spider.parse yields:

# meeting_requests holds the requests yielded by spider.parse
for request in meeting_requests:
    print(request.url)

Copy the printed URLs into a file named requests.txt (one URL per line), then run the following in bash:

while IFS= read -r url; do scrapy fetch --nolog "$url" > "tests/files/dekalb_county_boc_requests/$(basename "$url").html"; done < requests.txt
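Because the loop names each saved page after the last path segment of its URL, two URLs that end in the same segment would overwrite each other. A minimal sketch of a pre-flight check, assuming the URLs are already in a Python list; fixture_path and write_request_list are hypothetical helper names, not part of the spider or of Scrapy:

```python
from collections import Counter
from pathlib import Path

FIXTURE_DIR = Path("tests/files/dekalb_county_boc_requests")


def fixture_path(url, fixture_dir=FIXTURE_DIR):
    """Mirror the `$(basename "$url").html` naming used by the bash loop."""
    # Like shell basename: ignore trailing slashes, keep the last segment.
    name = url.rstrip("/").rsplit("/", 1)[-1]
    return Path(fixture_dir) / f"{name}.html"


def write_request_list(urls, list_path="requests.txt"):
    """Write one URL per line, refusing URLs whose fixture names collide."""
    counts = Counter(fixture_path(url).name for url in urls)
    clashes = sorted(name for name, seen in counts.items() if seen > 1)
    if clashes:
        raise ValueError(f"fixture names collide: {clashes}")
    Path(list_path).write_text("".join(f"{url}\n" for url in urls))
    # Return the paths the bash loop is expected to create.
    return [fixture_path(url) for url in urls]
```

Calling write_request_list([request.url for request in meeting_requests]) would replace the copy-and-paste step and return the paths the fetch loop should produce.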

Once the file_responses are created, each one can be tied back to its Request by setting response.request = request (see the code for details). I'm going to try the same technique for atl_boe soon.
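The association step might look roughly like the sketch below. It is not the code from this PR: attach_requests is a hypothetical helper, and make_response stands in for whatever function the tests use to build a response from a saved HTML file.

```python
def attach_requests(requests, make_response):
    """Build one response per request and point it back at that request.

    `make_response` is an assumed callable that takes a URL and returns a
    response object loaded from the matching saved file.
    """
    responses = []
    for request in requests:
        response = make_response(request.url)
        # The step described above: a fixture-backed response has no
        # originating request until one is assigned.
        response.request = request
        responses.append(response)
    return responses
```

With the requests attached, each response can be passed to the spider callback that parses a meeting page, and anything that callback reads from the request is available again.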