rhetr closed 2 years ago
spider.parse yields requests rather than meetings, so extra steps are needed before the tests can yield meetings. First, create the additional requests folder:
mkdir tests/files/dekalb_county_boc_requests
Then collect a list of the request URLs:
for request in meeting_requests:
    print(request.url)
Copy the output and save it to a file named requests.txt, then run the following in bash:
while read -r a; do scrapy fetch --nolog "$a" > "tests/files/dekalb_county_boc_requests/$(basename "$a").html"; done < requests.txt
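For the test side, a minimal sketch of how a saved file could be looked up again from a request URL. It mirrors the `basename` naming used by the loop above. The helper name `fixture_path` and the example URL are hypothetical and are not taken from the PR code.

```python
# Folder created by the mkdir step above.
FIXTURE_DIR = "tests/files/dekalb_county_boc_requests"


def fixture_path(url, folder=FIXTURE_DIR):
    """Return the fixture path the bash loop would have written for `url`.

    Mimics `basename`: trailing slashes are dropped and everything after
    the last remaining slash is kept (including any query string).
    """
    name = url.rstrip("/").rsplit("/", 1)[-1]
    return f"{folder}/{name}.html"


# Hypothetical URL, for illustration only.
example = fixture_path("https://example.com/meetings/1234")
```

Note that two URLs ending in the same last path segment would map to the same file, so this naming only works when those segments are unique.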
Once the file_responses are created, the relevant Request can be associated with each one by setting response.request = request (see code for details). I'm going to try the same technique for atl_boe soon.
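A rough sketch of the association step, assuming each saved response and its request share the same URL. The helper name `attach_requests` is hypothetical, and the objects below are plain stand-ins so the sketch runs without Scrapy. In the real tests they would be the Scrapy Request objects yielded by spider.parse and the responses built from the saved files.

```python
from types import SimpleNamespace


def attach_requests(file_responses, meeting_requests):
    """Set response.request on each response, pairing the two by URL."""
    by_url = {request.url: request for request in meeting_requests}
    for response in file_responses:
        # A KeyError here means a saved file has no matching request.
        response.request = by_url[response.url]
    return file_responses


# Stand-in objects and hypothetical URLs, for illustration only.
meeting_requests = [
    SimpleNamespace(url="https://example.com/meetings/1", meta={"title": "Board"}),
    SimpleNamespace(url="https://example.com/meetings/2", meta={"title": "Zoning"}),
]
file_responses = [
    SimpleNamespace(url="https://example.com/meetings/2", request=None),
    SimpleNamespace(url="https://example.com/meetings/1", request=None),
]

attach_requests(file_responses, meeting_requests)
```

The association matters because Scrapy's `response.meta` is read from `response.request.meta`, so a callback that relies on `meta` presumably needs the request attached before it is called on a saved response.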
Summary
Issue: #17
Checklist