nfmcclure closed this 1 year ago
Thanks for the submission @nfmcclure! I'm testing this now but stuck:
(venv) $ scrapy crawl sitemapspider
Scrapy 2.7.1 - no active project
Unknown command: crawl
Anything else I should do after setting up the venv?
Yup, sorry. Two changes needed: add scrapy to requirements.txt, and change into the sitemap directory before running scrapy crawl sitemapspider. Let me know if that helps.
Edit: OK, re-tested in a completely new venv and fixed some requirements.
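For anyone else hitting the "no active project" / "Unknown command: crawl" message: as far as I know, Scrapy only enables project commands such as crawl when it finds a scrapy.cfg in the current directory or one of its parents. Here is a minimal sketch to check that before running the spider. The helper name find_scrapy_project_root is made up for illustration and is not part of Scrapy or this PR.

```python
from pathlib import Path
from typing import Optional


def find_scrapy_project_root(start: Path) -> Optional[Path]:
    """Return the nearest directory at or above `start` that contains scrapy.cfg.

    Hypothetical helper: it mirrors the lookup Scrapy is understood to do,
    but it is not Scrapy's own code.
    """
    start = start.resolve()
    # Check the starting directory first, then each parent up to the filesystem root.
    for candidate in (start, *start.parents):
        if (candidate / "scrapy.cfg").is_file():
            return candidate
    return None


if __name__ == "__main__":
    root = find_scrapy_project_root(Path.cwd())
    if root is None:
        print("No scrapy.cfg found here or above; project commands like crawl will be unavailable.")
    else:
        print(f"Scrapy project root: {root}")
```

If this prints no project root, changing into the directory that holds scrapy.cfg and re-running the crawl command is the first thing to try.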
This is weird, but still not working! I can try more intensively later 🤷
That is strange! I haven't run into that before. But it seems that can happen, as tackled here:
@nfmcclure cool, from the venv, uninstalling and reinstalling xmltodict (which wasn't installed outside the venv) worked. I was able to get this to run! I'll wait for @thejqs to approve but LGTM
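In case it helps with debugging similar venv issues, here is a rough sketch for checking where a package would be imported from and whether that location is inside the active environment. It only uses the standard library. The function names module_location and is_inside_environment are illustrative assumptions, and the package names in the loop are just the ones mentioned in this thread.

```python
import importlib.util
import sys
from pathlib import Path
from typing import Optional


def module_location(name: str) -> Optional[Path]:
    """Return the file a top-level module would be imported from, or None."""
    spec = importlib.util.find_spec(name)
    # Built-in and frozen modules have no file location on disk.
    if spec is None or not spec.has_location or spec.origin is None:
        return None
    return Path(spec.origin)


def is_inside_environment(path: Path, prefix: str = sys.prefix) -> bool:
    """True if `path` lives under `prefix` (the active environment by default)."""
    try:
        path.resolve().relative_to(Path(prefix).resolve())
        return True
    except ValueError:
        return False


if __name__ == "__main__":
    for name in ("scrapy", "xmltodict"):
        location = module_location(name)
        if location is None:
            print(f"{name}: not importable from this interpreter")
        else:
            where = "inside" if is_inside_environment(location) else "outside"
            print(f"{name}: {location} ({where} {sys.prefix})")
```

Running this with the venv activated should show both packages resolving to paths under the venv; a missing package or a path outside it would suggest reinstalling from within the venv.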
I used this for some Southeast Arkansas agencies with mixed success:
attempt 1: sample_host_sites.txt 20230131_122648_output.csv
attempt 2: sample_host_sites.txt 20230131_123505_output.csv
Added files, readme, and sample URL list for sitemap scraper.