benibela / xidel

Command line tool to download and extract data from HTML/XML pages or JSON-APIs, using CSS, XPath 3.0, XQuery 3.0, JSONiq or pattern matching. It can also create new or transformed XML/HTML/JSON documents.
http://www.videlibri.de/xidel.html
GNU General Public License v3.0

Doc issue on x-extract() #101

Open sputnick-dev opened 1 year ago

sputnick-dev commented 1 year ago

Since PCRE-style regexes can be used in x-extract() (very nice trick), the doc should be updated:

In https://www.benibela.de/documentation/internettools/xpath-functions.html#x-extract

s/grep -oE/grep -oP/

By the way, thanks for your tool. It's a must-have in the tool chest and I use it every day ;)

benibela commented 1 year ago

By "basically" I mean it is similar, but not the same.

It does not support full PCRE. The syntax is the one defined at https://www.w3.org/TR/xpath-functions-30/#regex-syntax

Also, the implementation does not do backtracking, so even some of those regexps cannot be evaluated.

sputnick-dev commented 1 year ago

Maybe add an explanation that it's a subset of Perl's regex(?) allowing \d|\w|\s...
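
For readers comparing the two grep modes mentioned in this thread, here is a minimal shell sketch of why the distinction matters. It assumes a grep that supports `-o` (GNU, BSD and BusyBox grep do). The helper name `extract_numbers` and the sample input are made up for illustration, and nothing here calls xidel itself. POSIX ERE (`grep -E`) has no portable `\d` escape, so the digit class has to be spelled `[0-9]`, whereas the XPath 3.0 syntax used by x-extract() does include `\d`, `\w` and `\s`.

```shell
# Hypothetical helper (not part of xidel or grep): print every run of
# digits found in the argument, one match per line, as `grep -o` does.
extract_numbers() {
  # POSIX ERE has no portable \d, so use the bracket expression [0-9].
  printf '%s\n' "$1" | grep -oE '[0-9]+'
}

# Prints 42 and 7 on separate lines.
extract_numbers 'id=42 and id=7'
```

With a GNU grep built with PCRE support, `grep -oP '\d+'` should return the same two matches, which is presumably why `-oP` looked like the closer analogy, even though x-extract() is neither full PCRE nor plain ERE.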