Use case
I have a site with a nice sitemap.xml file.
I can use goskyr to extract a wonderful JSON list of all the pages I want to crawl.
But now I cannot pass this information back to the tool.
I know that I could generate a config.yml containing all of the URLs, but that is not very friendly, and I suspect that creating thousands of scrapers would kill the system.
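To illustrate what I mean, a generated config.yml would presumably just repeat one scraper block per URL. The field names below are only a guess at the shape of a per-scraper entry; goskyr's actual schema may differ:

```yml
scrapers:
  - name: page-1
    url: https://example.com/page/1
  - name: page-2
    url: https://example.com/page/2
  # ... thousands more entries, one per sitemap URL
```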
As a separate feature, maybe it would be worth reading the YAML file with a pool of scrapers, so that executing a config with 5000 scrapers stays scalable.
I will experiment, but I'm no Go expert.