crwlrsoft / crawler

Library for Rapid (Web) Crawler and Scraper Development
https://www.crwlr.software/packages/crawler
MIT License
325 stars 11 forks

JSON step improvement #139

Closed otsch closed 7 months ago

otsch commented 7 months ago

Allow getting the whole decoded JSON as an array with the new Json::all(). Also allow getting the whole decoded JSON inside a mapping when using Json::get(), by using either an empty string or * as the target. Example: Json::get(['all' => '*']). The * target only works when there is no key * in the decoded data.
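The target rule described above can be illustrated with a small standalone sketch. This is not the library's code: resolveTarget() is a hypothetical helper, it only looks at top-level keys, and it assumes PHP 8.0 or later for the mixed return type. It only shows how an empty string or * could select the whole decoded document, and why a literal * key in the data takes precedence.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper, not part of the library: resolves one mapping target
 * against already decoded JSON data (top-level keys only, for brevity).
 */
function resolveTarget(array $data, string $target): mixed
{
    // An empty string selects the whole decoded document.
    if ($target === '') {
        return $data;
    }

    // '*' selects the whole document, unless the data has a literal '*' key.
    if ($target === '*' && !array_key_exists('*', $data)) {
        return $data;
    }

    // Otherwise treat the target as a plain key; null when it is missing.
    return $data[$target] ?? null;
}

$decoded = json_decode('{"name": "crawler", "stars": 325}', true);

// Mirrors the shape of a mapping like ['all' => '*', 'name' => 'name'].
$mapped = [
    'all'  => resolveTarget($decoded, '*'),
    'name' => resolveTarget($decoded, 'name'),
];
```

If the decoded data contained a key named *, resolveTarget() in this sketch would return that key's value instead of the whole document, which matches the restriction stated above.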

Make the step work with responses loaded by a headless browser. If decoding the input string fails, the step now checks whether the input could be HTML. If so, it extracts the text content of the <body> and tries to decode that instead.
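The fallback described above might look roughly like the following sketch in plain PHP. It is an illustration under assumptions, not the step's actual implementation: decodeJsonOrHtmlBody() is a hypothetical name, the "could this be HTML" check is a deliberately crude substring test, and the sample input assumes the browser wraps the JSON in a <pre> element inside the <body>.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper, not part of the library: decode a JSON string and,
 * if that fails, retry with the text content of an HTML <body>.
 */
function decodeJsonOrHtmlBody(string $input): ?array
{
    $decoded = json_decode($input, true);

    if (is_array($decoded)) {
        return $decoded;
    }

    // Crude check (an assumption of this sketch): only continue if the
    // input looks like an HTML document.
    if (stripos($input, '<html') === false && stripos($input, '<body') === false) {
        return null;
    }

    // Parse the HTML while keeping libxml warnings out of the output.
    $previous = libxml_use_internal_errors(true);
    $document = new DOMDocument();
    $loaded = $document->loadHTML($input);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if (!$loaded) {
        return null;
    }

    $body = $document->getElementsByTagName('body')->item(0);

    if ($body === null) {
        return null;
    }

    // Second attempt: decode only the text inside <body>.
    $decoded = json_decode(trim($body->textContent), true);

    return is_array($decoded) ? $decoded : null;
}

// Sample input, assumed to resemble what a headless browser returns for a JSON response.
$html = '<html><head></head><body><pre>{"foo": "bar"}</pre></body></html>';
$result = decodeJsonOrHtmlBody($html);
```

One caveat of this sketch: DOMDocument::loadHTML() does not assume UTF-8 by default, so non-ASCII content would need explicit encoding handling.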