benibela / xidel

Command line tool to download and extract data from HTML/XML pages or JSON-APIs, using CSS, XPath 3.0, XQuery 3.0, JSONiq or pattern matching. It can also create new or transformed XML/HTML/JSON documents.
http://www.videlibri.de/xidel.html
GNU General Public License v3.0

[XPath] How to get both items? #106

Closed Shohreh closed 1 year ago

Shohreh commented 1 year ago

Hello,

I'm using Xidel for this, but it's actually an XPath question… that I don't know where to ask.

I need to grab "FOO" and "BAR" from this div:

<div class="panel-heading collapsed" role="tab">
  <div class="col-xs-1 h_class">
    <label class="label label-warning">FOO</label>
  </div>
  <div class="col-xs-9 name_station">
    BAR
  </div>
</div>

The following works to get FOO, but I haven't found how to get BAR. Does anybody know?

xidel.exe -se "//div[@class='panel-heading collapsed']/div/label/text()" test.html

Thank you.

PS: If you know of a good site/book to learn XPath, I'm interested

Reino17 commented 1 year ago

Create a sequence with the comma operator, so that both items are returned:

xidel -s test.html -e "//div[@class='panel-heading collapsed']/(div/label,div[2])"
FOO

    BAR

"BAR" stripped of white-space:

xidel -s test.html -e "//div[@class='panel-heading collapsed'] ! (div/label,normalize-space(div[2]))"
FOO
BAR

The same query split over multiple lines (cmd.exe syntax, with ^ as the line-continuation character):

xidel -s test.html -e ^"^
  //div[@class='panel-heading collapsed'] ! (^
    div/label,^
    normalize-space(div[2])^
  )^
"
FOO
BAR
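
For anyone who wants to check the logic without Xidel, here is a rough Python sketch of the same idea using only the standard library. It is an approximation, not what Xidel does internally: ElementTree supports only a subset of XPath 1.0, so the sequence and normalize-space() are emulated in plain Python, and the snippet is parsed as XML (which works here only because the sample markup is well-formed). The function name extract_pairs and the wrapping root element are made up for this illustration.

```python
import xml.etree.ElementTree as ET

# The sample markup from the question, used as-is.
HTML = """<div class="panel-heading collapsed" role="tab">
  <div class="col-xs-1 h_class">
    <label class="label label-warning">FOO</label>
  </div>
  <div class="col-xs-9 name_station">
    BAR
  </div>
</div>"""


def extract_pairs(markup):
    """Return, for each panel div, the label text and the whitespace-normalized second div."""
    # Wrap the snippet in a hypothetical <root> so that ".//div" can also
    # match the outermost div (findall never matches the start element itself).
    root = ET.fromstring(f"<root>{markup}</root>")
    results = []
    for panel in root.findall(".//div[@class='panel-heading collapsed']"):
        label = panel.find("div/label")   # first item of the sequence
        second = panel.find("div[2]")     # second item of the sequence
        # String value of the label, like div/label in the XPath above.
        results.append("".join(label.itertext()))
        # Emulate normalize-space(): trim and collapse runs of whitespace.
        results.append(" ".join("".join(second.itertext()).split()))
    return results


print(extract_pairs(HTML))  # ['FOO', 'BAR']
```

The loop over panels plays the role of the `!` operator: both expressions are evaluated once per matching div, so the results stay paired even if the page contains several panels.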

but it's actually an XPath question… that I don't know where to ask.

I think the forums, the mailing list, or StackOverflow would be a better place to ask.

PS: If you know of a good site/book to learn XPath, I'm interested

https://github.com/benibela/xidel/issues/67#issuecomment-770084663

And some general XPath/XQuery URLs:

Shohreh commented 1 year ago

Thanks!