benibela / xidel

Command line tool to download and extract data from HTML/XML pages or JSON-APIs, using CSS, XPath 3.0, XQuery 3.0, JSONiq or pattern matching. It can also create new or transformed XML/HTML/JSON documents.
http://www.videlibri.de/xidel.html
GNU General Public License v3.0

any more examples for xquery? #67

Open ralyodio opened 3 years ago

ralyodio commented 3 years ago

I can't find many examples online. I'm trying to parse HTML and convert it to JSON, with some massaging of the data along the way.

Reino17 commented 3 years ago

May I suggest you have a look at:

The last URL is mine. The comments are in Dutch, but maybe you'll find them useful nonetheless.

P.S. Since you already know Stack Overflow, a response would help.

ralyodio commented 3 years ago

For some reason I'm not getting the <ticker> tag in the output, just the raw values. Am I misunderstanding how this should work?

#!/usr/bin/env sh

xidel -s https://www.marketbeat.com/short-interest/ --xquery '
for $row in //tbody/tr
    let $ticker := $row//td[1]//div[@class="ticker-area"]
    return <ticker>{$ticker}</ticker>
'

Reino17 commented 3 years ago

You're probably looking for:

xidel -s https://www.marketbeat.com/short-interest --xquery '
  for $ticker in //tbody/tr/td[1]//div[@class="ticker-area"]/text() return
  <ticker>{$ticker}</ticker>
' --output-format=xml

Or perhaps:

xidel -s https://www.marketbeat.com/short-interest --xquery '
  serialize(
    <xml>{
      for $ticker in //tbody/tr/td[1]//div[@class="ticker-area"]/text() return
      <ticker>{$ticker}</ticker>
    }</xml>,
    {"indent":true(),"omit-xml-declaration":false()}
  )
'
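
Both commands above ask for the result to be written out as XML, either with --output-format=xml or by calling serialize(). The underlying distinction seems to be between a node's text value and its serialized markup. The sketch below illustrates that distinction with Python's standard library only; it is not how xidel works internally, and the ticker values are made-up placeholders rather than data from the page.

```python
import xml.etree.ElementTree as ET

# Hypothetical placeholder values; not taken from the real page.
tickers = ["AAA", "BBB"]

# Build <xml><ticker>...</ticker>...</xml>, mirroring the shape of the query result.
root = ET.Element("xml")
for value in tickers:
    ET.SubElement(root, "ticker").text = value

# Text value of each node: the tags are not part of it.
text_values = [el.text for el in root.iter("ticker")]

# Explicit serialization: the tags are written out along with the content.
serialized = ET.tostring(root, encoding="unicode")

print(text_values)
print(serialized)
```

Printing only the text values gives the "raw values" described in the question, while the serialized string keeps the <ticker> elements.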

ralyodio commented 3 years ago

Ahh OK, that works.