sputnick-dev / saxon-lint

XPath3/XQuery 3.0/XSLT 2.0 cross-platform command line tool

saxon-lint

This program queries XML/(X)HTML files from the command line, like XMLStarlet or xmllint, but with support for XPath 3.0, XQuery 3.0 and XSLT 2.0. Other command-line tools (apart from xidel and BaseX) are tied to libxml2 and therefore limited to XPath 1.0/XSLT 1.0.

It can be considered a simple wrapper around the Saxon-HE and TagSoup Java libraries.

As long as the prerequisites are installed, this project is cross-platform (Linux, Mac OS X/*BSD, Windows...).

By default, XPath output prints each result node on its own line, which makes it easy to split the results into an array in shell scripts, for example. This feature was missing from xmllint.
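As a rough illustration of the one-result-per-line output, the bash sketch below reads each line into an array element. The `query` function is a hypothetical stand-in that only mimics the tool's output so the snippet runs on its own. In real use it would be replaced by an actual `saxon-lint.pl --xpath ...` call. The expression and file name in the comment are assumptions too.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for something like:
#   saxon-lint.pl --xpath '//item/text()' file.xml
# It only mimics the default output format: one result per line.
query() { printf '%s\n' 'first item' 'second item' 'third item'; }

# Read the output line by line; each line becomes one array element,
# so results containing spaces stay intact.
results=()
while IFS= read -r line; do
    results+=("$line")
done < <(query)

echo "${#results[@]} results"
for r in "${results[@]}"; do
    echo "-> $r"
done
```

Reading with `IFS= read -r` rather than relying on word splitting keeps leading whitespace and backslashes in each result unchanged.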

Main features

Limitations

According to the Saxon Home Edition (HE) documentation, it supports XQuery 3.1 Minimal Conformance. It does not include the following:

This is not FOSS software.

For a FOSS tool that supports updates, see BaseX, linked earlier.

Install prerequisites

And Perl modules:

With one command on Debian and derivatives:

 apt-get update && apt-get install openjdk-11-jre perl libxml2 libxml2-dev \
    libxml-libxml-perl libwww-perl liblwp-protocol-https-perl

Install:

$ git clone https://github.com/sputnick-dev/saxon-lint.git
$ cd saxon-lint
$ ./saxon-lint.pl --help

Usage:

Usage:
    saxon-lint.pl <opts> <file(s)>
    Parse the XML files and output the result of the parsing
    --help -h,                  this help
    --xpath,                    XPath expression
    --xquery,                   Xquery expression or file
    --html,                     use the HTML parser
    --xslt,                     use XSL transformation
    --output-separator,         set output separator to character ("\n", ","...)
    --indent,                   indent the output
    --no-pi,                    remove Processing Instruction (<?xml ...>)
    --saxon-opt,                Saxon extra argument
    --verbose -v,               verbose mode
    --version,                  current version

Examples:

saxon-lint.pl --xpath '//key[text()="String"]/following-sibling::string[1]' file.xml
saxon-lint.pl --xquery 'for $r in 1 to count(/table/tr) return /title' file.xml
saxon-lint.pl --indent --xquery file.xquery
curl -Ls 'http://domain.tld/file.xml' | saxon-lint.pl --xpath '//key[1]' -
saxon-lint.pl --xslt file.xsl file.xml
saxon-lint.pl --xquery file.xquery --saxon-opt -t --saxon-opt '!indent=yes'
saxon-lint.pl --html --xpath 'string-join(//a/@href, "\r\n")' http://x.y/z.html

Get shortened URL via tinyurl:

saxon-lint --html --xpath '//div[@class="indent"][1]/b/text()' \
    'http://tinyurl.com/create.php?url=http://google.com'

To enter the string-join() separator as a literal line break (as in the string-join() example above) on Unix-like systems, press Ctrl+V and then Enter. On Windows, just type "\r\n".
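As an alternative to typing the control character by hand, a bash user could build the expression with ANSI-C quoting (`$'...'`), which turns `\n` into a real newline before the expression is passed on. This is only a sketch and assumes bash. The variable name `expr` is illustrative, and the commented invocation simply mirrors the example above and is not run here.

```shell
#!/usr/bin/env bash
# Inside $'...', bash expands \n to an actual newline character,
# so the XPath string literal ends up containing a real line break.
expr=$'string-join(//a/@href, "\n")'

# Print the expression; the separator now spans two lines.
printf '%s\n' "$expr"

# Assumed invocation, mirroring the example above (not executed here):
#   saxon-lint.pl --html --xpath "$expr" http://x.y/z.html
```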

Check the other examples.

For --saxon-opt, check the Saxon documentation.

Tricks:

To run the command without the leading dot-slash (./saxon-lint), add its directory to the PATH variable. On Windows, see http://www.computerhope.com/issues/ch000549.htm. On Unix-like systems, edit ~/.bashrc, find the PATH= line and add PATH=$PATH:/PATH/TO/saxon-lint_DIRECTORY, then run source ~/.bashrc.
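A minimal sketch of the kind of lines that could go into ~/.bashrc for this. The directory `$HOME/saxon-lint` is an assumed clone location, not something the project prescribes. The `case` guard avoids appending the same directory again each time the file is sourced.

```shell
# Hypothetical install location; replace with the directory
# that actually contains saxon-lint.pl.
SAXON_LINT_DIR="$HOME/saxon-lint"

# Append the directory to PATH only if it is not already listed.
case ":$PATH:" in
    *":$SAXON_LINT_DIR:"*) ;;                  # already present, nothing to do
    *) PATH="$PATH:$SAXON_LINT_DIR" ;;
esac
export PATH
```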

If you want to enable bash-completion, you have to install this program and move usr_share_bash-completion_completions_saxon-lint to /usr/share/bash-completion/completions/saxon-lint (or similar).

Troubleshooting:

Tested platforms:

Please report any bugs here.

Licensing:

This program is under the same licence as Saxon-HE.