Basically: scrape the Phrack archive as an example and turn the articles into tagged entries.
Maybe create a script to parse the links and then dump each set into the archive? A rough sketch of that idea is below.
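A minimal sketch of that link-parsing script, assuming the usual requests + BeautifulSoup stack; the index URL and the idea of handing each link to the archiver are assumptions about the eventual integration, not an existing API:

```python
import requests
from bs4 import BeautifulSoup
from urllib.parse import urljoin

INDEX_URL = "http://phrack.org/"  # assumed entry point, adjust as needed

def collect_article_links(index_url):
    """Return absolute URLs for every link found on the index page."""
    html = requests.get(index_url, timeout=30).text
    soup = BeautifulSoup(html, "html.parser")
    return [urljoin(index_url, a["href"]) for a in soup.find_all("a", href=True)]

for link in collect_article_links(INDEX_URL):
    print(link)  # each set of links would get dumped into the archiver here
```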
Solution:
Crawl + archive all content under a given URL (with a configurable top-level URL, so you can scrape google.com vs. google.com/blog/; this should handle the Phrack use case).
Option to use a tool X to generate a sitemap (also with a configurable top-level URL), and then crawl + archive all content noted in the sitemap.
The crawl option is ezpz: apply a user-configurable depth limit of, say, X, treat all of the crawled text as one entry, and use separators/page names to mark the demarcation between pages. Tag the entry to indicate it's a collection of site X (sketched below).
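A minimal sketch of that depth-limited crawl, again assuming requests + BeautifulSoup; save_entry() is a hypothetical archiver hook and the demarcation format is illustrative:

```python
import requests
from bs4 import BeautifulSoup
from collections import deque
from urllib.parse import urljoin, urlparse

def crawl_site(top_url, max_depth=2):
    """Breadth-first crawl of pages under top_url, up to max_depth links deep."""
    seen, pages = set(), []
    frontier = deque([(top_url, 0)])
    while frontier:
        url, depth = frontier.popleft()
        if url in seen or depth > max_depth:
            continue
        seen.add(url)
        try:
            html = requests.get(url, timeout=30).text
        except requests.RequestException:
            continue
        soup = BeautifulSoup(html, "html.parser")
        # Keep the page name with the text so the demarcation between
        # pages survives concatenation into a single entry.
        title = soup.title.string.strip() if soup.title and soup.title.string else url
        pages.append(f"=== {title} ({url}) ===\n{soup.get_text()}")
        for a in soup.find_all("a", href=True):
            child = urljoin(url, a["href"])
            if child.startswith(top_url):  # stay under the top-level URL
                frontier.append((child, depth + 1))
    return "\n\n".join(pages)

site = "http://phrack.org/"  # configurable top-level URL
entry_text = crawl_site(site, max_depth=2)
tag = "collection:" + urlparse(site).netloc
# save_entry(entry_text, tags=[tag])  # hypothetical archiver hook
```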
The sitemap option is similar to the above, but with unique identification of individual articles: generate a sitemap and use it (see the sketch below).
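A sketch of the sitemap route, assuming a standard sitemaps.org-style sitemap.xml; the sitemap path and the save_entry()/fetch() hooks are assumptions:

```python
import requests
import xml.etree.ElementTree as ET

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

def urls_from_sitemap(sitemap_url):
    """Yield every <loc> URL listed in a sitemap.xml document."""
    xml = requests.get(sitemap_url, timeout=30).content
    root = ET.fromstring(xml)
    for loc in root.findall(".//sm:loc", NS):
        yield loc.text.strip()

for url in urls_from_sitemap("http://phrack.org/sitemap.xml"):  # assumed path
    # Each URL becomes its own uniquely identified entry, unlike the
    # single-entry crawl above:
    # save_entry(fetch(url), entry_id=url, tags=["phrack"])
    print(url)
```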
So, this will offer two new options:
Sitemap creation + archiving of all content under that sitemap (with a configurable top-level URL, so you can scrape google.com vs. google.com/blog/; this should handle the Phrack use case).
Option to use a tool X to generate the sitemap, and then crawl + archive all content noted in that sitemap.
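For reference, a hypothetical CLI surface for those two options; the flag names are illustrative assumptions, not an existing interface:

```python
import argparse

parser = argparse.ArgumentParser(description="Archive a site as tagged entries")
parser.add_argument("top_url", help="configurable top-level URL, e.g. google.com/blog/")
parser.add_argument("--max-depth", type=int, default=2,
                    help="crawl depth limit for the built-in sitemap creation")
parser.add_argument("--sitemap-tool", metavar="CMD",
                    help="external tool used to generate the sitemap instead")
args = parser.parse_args()

if args.sitemap_tool:
    print(f"would run {args.sitemap_tool!r} and archive the resulting sitemap")
else:
    print(f"would crawl {args.top_url} to depth {args.max_depth} and archive it")
```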