Closed JReming85 closed 4 years ago
Currently there is no way to add a delay to the HTML body fetch. I have hacked php-curl into FeedIron in the past by adding it to the function at line 271. That said, I'm not 100% sure you could get the desired result from curl.
The other idea I had been working on, but have put on hold for the moment, is the one I mentioned in #38: adding the ability to call PhantomJS or Selenium. But these are potentially complex and will require significant reworks of the codebase to integrate. I might revisit them when I can break configs in version 2.
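For illustration only, the "delay" idea could be sketched as a retry loop that re-fetches the page until it no longer looks like a loading screen. This is not FeedIron code: `fetch_when_ready`, `is_ready`, and all parameters are hypothetical names, and the sketch is in Python rather than the plugin's PHP. It only helps if the server eventually returns the finished HTML to a plain HTTP client; if the page is assembled by JavaScript in the browser, waiting alone will not be enough.

```python
import time
from typing import Callable, Optional


def fetch_when_ready(
    fetch: Callable[[str], str],
    url: str,
    is_ready: Callable[[str], bool],
    attempts: int = 5,
    delay_seconds: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Call fetch(url) until is_ready(html) is true or attempts run out.

    Returns the first HTML that passes is_ready, or None if none did.
    """
    for attempt in range(attempts):
        html = fetch(url)
        if is_ready(html):
            return html
        # Wait before the next try, but not after the final one.
        if attempt < attempts - 1:
            sleep(delay_seconds)
    return None
```

A caller would pass its own fetch function (for example a thin wrapper around `urllib.request.urlopen`) and a readiness check such as `lambda html: "article-wrapper" in html`. The `sleep` parameter is injectable so the loop can be exercised without real waiting.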
Expected Behavior
I am rewriting certain URLs to go to outline.com/https://website.com
However, outline.com takes a few moments to clean up the page and display the results. Is there any way to halt the scrape until it finishes loading / bypassing paywalls, etc.?
Current Behavior
Scrapes the loading page
Steps to Reproduce
URL - https://www.wsj.com/articles/the-nfls-best-players-are-getting-richer-than-ever-1536163544
{
  "type": "xpath",
  "xpath": [
    "div[@class='article-wrapper']"
  ],
  "reformat": [
    {
      "type": "regex",
      "pattern": "\/.+.com\/",
      "replace": "https:\/\/outline.com\/https:\/\/wsj.com"
    }
  ]
}
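To make the effect of the `reformat` rule concrete, here is a small Python sketch of what the regex does to the URL above. It assumes the rule is applied to the article URL with ordinary regex-replace semantics; the pattern is written without the surrounding `/` delimiters, and `rewrite` is a hypothetical helper name, not part of the plugin.

```python
import re

# Body of the config's "/.+.com/" pattern, without the "/" delimiters.
PATTERN = r".+.com"
REPLACE = "https://outline.com/https://wsj.com"


def rewrite(url: str) -> str:
    """Replace everything up to and including '.com' with the outline.com prefix."""
    return re.sub(PATTERN, REPLACE, url)


url = "https://www.wsj.com/articles/the-nfls-best-players-are-getting-richer-than-ever-1536163544"
print(rewrite(url))
# prints https://outline.com/https://wsj.com/articles/the-nfls-best-players-are-getting-richer-than-ever-1536163544
```

Note that the `.` before `com` is unescaped, so it matches any character, and `.+` is greedy, so the match runs to the last such occurrence in the URL. A stricter pattern would use `\.com`.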