benibela / xidel

Command line tool to download and extract data from HTML/XML pages or JSON-APIs, using CSS, XPath 3.0, XQuery 3.0, JSONiq or pattern matching. It can also create new or transformed XML/HTML/JSON documents.
http://www.videlibri.de/xidel.html
GNU General Public License v3.0

Prepend baseurl before following #9

Closed Fuzzyma closed 8 years ago

Fuzzyma commented 8 years ago

Is there a way to prepend the base URL to every link before following it? In my case I have relative links in a JSON file which I want to follow, so I need to add the base URL to each link before it can be followed.

Is this even possible? That's my command so far:

xidel file.json -f '$json()["url"]' -e '//html'
Fuzzyma commented 8 years ago

Yes, it is possible with transform and concat:

xidel file.json -f 'transform($json()("url"), function($e){ concat("http://baseUrl.com", $e) })' -e ...
benibela commented 8 years ago

That's a surprise. I thought transform would only work with XML.

It is supposed to be done with the mapping operator (!):

xidel file.json -f '$json()("url") ! concat("http://baseUrl.com", .) ' -e ...

or

xidel file.json -f '$json()("url") ! resolve-uri(., "http://baseUrl.com") ' -e ...
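The two variants are not equivalent for every input: concat simply glues strings together, while resolve-uri resolves a relative reference against the base. A rough Python analogue as a sketch, not xidel's implementation: the list comprehension stands in for the ! operator, urljoin stands in for resolve-uri, and the link values are hypothetical examples rather than data from the original file.

```python
from urllib.parse import urljoin

BASE_URL = "http://baseUrl.com"  # placeholder base URL used in the thread

# Hypothetical values standing in for the result of $json()("url")
links = ["/page/1", "page/2", "http://other.example/page/3"]

# Analogue of: $json()("url") ! concat("http://baseUrl.com", .)
# Plain string concatenation, applied to each item in turn.
concatenated = [BASE_URL + link for link in links]

# Analogue of: $json()("url") ! resolve-uri(., "http://baseUrl.com")
# URI resolution: inserts the missing slash and leaves absolute URLs alone.
resolved = [urljoin(BASE_URL, link) for link in links]

for link, c, r in zip(links, concatenated, resolved):
    print(f"{link!r}: concat={c!r} resolve={r!r}")
```

With these inputs, concatenation only produces a usable URL for the link that starts with a slash, so resolve-uri is the safer choice when the links are a mix of relative and absolute forms.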
Fuzzyma commented 8 years ago

Well that solution looks way better. However: I am already done with my task :D. Thanks anyway! Is this documented somewhere?

Also: Is there a way to retry requests when they time out? Because I get errors like this: Error: -3 Internet connection reseted

benibela commented 8 years ago

> Is this documented somewhere?

It is just XPath 3: https://www.w3.org/TR/xquery-30/#id-map-operator

> Also: Is there a way to retry requests when they time out?

--error-handling=xx=retry

where xx is the error code to retry on. A literal xx matches all two-digit codes.

Fuzzyma commented 8 years ago

cool thanks!