Closed by cldellow 1 year ago
// A set of domains whose sitemaps will be discovered via
// robots.txt, and used to discover URLs (optional)
"seed-sitemaps": ["news.ycombinator.com"]
Needs https://github.com/cldellow/datasette-scraper#get_seed_urlsscraper-config, https://github.com/cldellow/datasette-scraper#discover_urlsscraper-config-url-response
Returns an array of sitemap URLs, and also knows how to discover new URLs from those sitemaps
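
A rough sketch of the two pieces of logic this implies: pulling `Sitemap:` entries out of a robots.txt body, and pulling URLs out of a sitemap document. The helper names `sitemaps_from_robots` and `urls_from_sitemap` are hypothetical and are not part of datasette-scraper; how they would be wired into the `get_seed_urls` and `discover_urls` hooks is left out, since that depends on the hook signatures linked above. Fetching is also omitted, so both helpers take already-downloaded text.

```python
import xml.etree.ElementTree as ET

# Namespace used by the sitemaps.org 0.9 schema.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def sitemaps_from_robots(robots_txt):
    """Return the URLs named by `Sitemap:` lines in a robots.txt body.

    Hypothetical helper, not datasette-scraper's API.
    """
    found = []
    for line in robots_txt.splitlines():
        # Drop trailing comments, then split on the first colon only,
        # because the URL itself contains colons.
        line = line.split("#", 1)[0].strip()
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "sitemap" and value.strip():
            found.append(value.strip())
    return found


def urls_from_sitemap(xml_text):
    """Split a sitemap document into (page_urls, nested_sitemap_urls).

    A <urlset> lists pages; a <sitemapindex> lists further sitemaps
    that would need to be fetched and parsed in turn.
    """
    root = ET.fromstring(xml_text)
    locs = []
    for entry in root:
        loc = entry.find(SITEMAP_NS + "loc")
        if loc is not None and loc.text:
            locs.append(loc.text.strip())
    if root.tag == SITEMAP_NS + "sitemapindex":
        return [], locs
    return locs, []
```

Under this split, the seed step would return whatever `sitemaps_from_robots` finds for each configured domain, and the discovery step would run `urls_from_sitemap` on each fetched sitemap, queueing both the page URLs and any nested sitemaps.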