benibela / xidel

Command line tool to download and extract data from HTML/XML pages or JSON-APIs, using CSS, XPath 3.0, XQuery 3.0, JSONiq or pattern matching. It can also create new or transformed XML/HTML/JSON documents.
http://www.videlibri.de/xidel.html
GNU General Public License v3.0

Replacing (regex) the end ("$") and beginning ("^") of a string does not work without any additional character. #74

Closed Baltazar500 closed 3 years ago

Baltazar500 commented 3 years ago

Replacing (regex) the end ("$") and beginning ("^") of a string does not work without any additional character.

Works:

replace(./td[@class='beta'], '^.', 'S0')
replace(./span, '.$', '')

Does not work:

replace(./td[@class='beta'], '^', 'S0')
replace(./span, '$', '')

:(

benibela commented 3 years ago

You cannot use a regex that matches the empty string there.

The W3C forbade it.
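
For context: the W3C Functions and Operators specification has fn:replace raise err:FORX0003 when the pattern can match a zero-length string, which is why a bare "^" or "$" is rejected while "^." and ".$" are accepted. One possible workaround, sketched below, is to match one real character next to the anchor, capture it, and put it back with $1. The function names and the "!" suffix are illustrative only, and the sketch assumes xidel is on PATH and fills $raw from piped input, as in the other commands in this thread.

```shell
# Sketch only: add_prefix/add_suffix are hypothetical helper names.
# Assumes xidel is installed and reads the piped text into $raw.

# Prefix: match the first character instead of the empty string at "^",
# then re-emit it with $1 after the new text.
add_prefix() {
  text=$1
  printf '%s' "$text" | xidel -se 'replace($raw, "^(.)", "S0$1")'
}

# Suffix: same idea at the end, "(.)$" instead of a bare "$".
# The "!" is just a placeholder for whatever should be appended.
add_suffix() {
  text=$1
  printf '%s' "$text" | xidel -se 'replace($raw, "(.)$", "$1!")'
}
```

Under those assumptions, add_prefix 123 should print S0123 and add_suffix 123 should print 123!. Neither pattern matches an empty input, so an empty string comes back unchanged, and because "." does not match a newline by default, the suffix variant expects input without a trailing newline.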

Baltazar500 commented 3 years ago

But Scintilla-based viewers and POSIX/*nix utilities allow it.

$ echo 123|sed -r 's/(^|$)/\"/g'
"123"

Is there another way to do a (regex) text replacement?

benibela commented 3 years ago

But the W3C had their own ideas.

You can suggest it here for XPath 4.

Reino17 commented 3 years ago

@Baltazar500

$ printf 123 | xidel -se 'replace($raw,"(.+)","""$1""")'
"123"

I'm not a fan of RegEx, however, especially when it's not needed at all and all you want to do is add some characters.

$ printf 123 | xidel -se 'x:cps(34)||$raw||x:cps(34)'
$ printf 123 | xidel -se 'concat("""",$raw,"""")'
$ printf 123 | xidel -se 'x"""{$raw}"""'
"123"

Baltazar500 commented 3 years ago

@Reino17, Thanks for the examples.

With a large number of replace, matches, extract, etc. expressions, this will complicate things even more :(

Is it possible to use \1, \2 in "replace(...)", as in sed, when replacing via regex?

$ echo 123|sed -r 's/^(.)/aaa\1/g'
aaa123

Reino17 commented 3 years ago

$ printf 123 | xidel -se 'replace($raw,"(.+)","aaa$1")'
aaa123

In my previous post you could've already seen the $1. But why use RegEx when you could simply do this?

$ printf 123 | xidel -se '"aaa"||$raw'

You're not "replacing" anything. You're just adding a string.
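
To make the \1, \2 versus $1, $2 correspondence concrete, here is a small side-by-side sketch that swaps the first two characters of a string. The function names are hypothetical, and the xidel half assumes xidel is on PATH and reads piped input into $raw, as in the commands above.

```shell
# Sketch only: swap_sed/swap_xidel are illustrative names.

# sed: captured groups are referenced as \1, \2 in the replacement.
swap_sed() {
  text=$1
  printf '%s' "$text" | sed -E 's/^(.)(.)/\2\1/'
}

# XPath replace() via xidel: the same groups are written $1, $2.
# The single quotes stop the shell from expanding $raw, $1 and $2
# before xidel sees them.
swap_xidel() {
  text=$1
  printf '%s' "$text" | xidel -se 'replace($raw, "^(.)(.)", "$2$1")'
}
```

With the input 123, swap_sed prints 213, and swap_xidel should print the same. A literal dollar sign in an XPath replacement string has to be written as \$.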

Baltazar500 commented 3 years ago

@Reino17, Thank you. I'm sorry. I missed this example. Where were my eyes >_<