XSLT vs XQuery - Githubissues

Doing the data transformation for our project's social network graph made me think of how easy it is for me to think in XSLT vs XQuery. For example, my first thought to write the transformation was to do it in XQuery because I could visualize the code that way first, but Zac pointed out that the code would probably be simpler to write in XSLT, and that's how I ended up doing the transformation. Which do you find easier to think in?

@JosephDRogers23 I’m more comfortable in XSLT than in XQuery, but I’ve been doing it longer, and I think comfort with XSLT grows as one becomes more familiar with the idioms. That’s true of any method, but perhaps more so with XSLT because the declarative model is unlike that of many other programming languages. On the other hand, XQuery is often easier for people who come from a database background, since it borrows some ideas from SQL.

Within this course, we’ve dithered over the years about whether to teach XSLT or XQuery first. In favor of teaching XQuery first, the XQuery idiom is closer than XSLT to XPath, so there’s sort of a natural progression from XPath to XQuery to XSLT. In favor of teaching XSLT first, XSLT is so odd that having extra time to get used to it may be important. After trying both orders, we settled on the one we use now: XPath, XSLT, and then XQuery.

XSLT and XQuery overlap to such an extent in their capabilities that some people use only one or only the other, and before there was XQuery, we all did everything in XSLT. Nowadays I generally use XQuery for ... well ... querying, and I use XSLT within XQuery for transformation. For example, I might pluck some information out of eXist with its original XML markup, using XQuery to select and arrange what I need, and then pipe it into an XSLT transformation to retag it as HTML before passing it along as output.
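A minimal sketch of that select-then-retag pipeline, assuming it runs inside eXist and uses eXist's transform module. The input data, the element names, and the inline stylesheet are hypothetical stand-ins, not anything from the course materials.

```xquery
xquery version "3.1";

(: eXist-specific extension module; not part of standard XQuery. :)
import module namespace transform = "http://exist-db.org/xquery/transform";

(: Hypothetical stand-in for whatever the XQuery selected and arranged. :)
declare variable $selected :=
    <results>
        <name>Ada</name>
        <name>Ben</name>
    </results>;

(: Hypothetical inline stylesheet that retags the selection as a list. :)
declare variable $stylesheet :=
    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
        <xsl:template match="results">
            <ul><xsl:apply-templates/></ul>
        </xsl:template>
        <xsl:template match="name">
            <li><xsl:apply-templates/></li>
        </xsl:template>
    </xsl:stylesheet>;

(: XQuery does the selecting; XSLT does the retagging. :)
transform:transform($selected, $stylesheet, ())
```

In a real project the stylesheet would more likely live in its own file in the database, and the selection would come from a query over stored documents rather than from an inline element.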

@JosephDRogers23 I find it easier to think in XQuery for things like extracting data for a network graph, and even for drawing SVG. There's something to be said for greater efficiency in writing code (fewer angle brackets, fewer template matches, and so on). But much is the same in both languages, for example when we declare global variables.

In the Greensburg pair of DH courses, I have one fall class that is entirely focused on XSLT (no XQuery), and a spring course designed to emphasize XQuery, with just a brief unit on XSLT because there are things it's well suited for that XQuery isn't. XSLT is best for things like constructing a reading view of your documents and, more broadly, for anything that we associate with "push" processing (where we don't define conditions for every single XPath location, but instead push elements to be processed wherever they are in the document). XQuery seems best for anything to do with data extraction (such as pulling a TSV file for use in a network graph). (I'm afraid I don't agree with @zme1 here that XSLT is easier for that, but one's perception of "ease" probably has to do with the idiom one is most accustomed to for thinking.)
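As an illustration of the data-extraction case, here is a minimal sketch of pulling a tab-separated edge list out of XML with XQuery. The input document and its element and attribute names are hypothetical; a real project would read from its own files.

```xquery
xquery version "3.1";

(: Hypothetical input: each letter records a sender and a recipient. :)
declare variable $letters :=
    <letters>
        <letter from="Ada" to="Ben"/>
        <letter from="Ada" to="Cy"/>
        <letter from="Ben" to="Cy"/>
    </letters>;

(: Pull processing: go straight to the nodes we want and build one
   tab-separated line per edge, ignoring everything else. :)
string-join(
    (
        string-join(("source", "target"), "&#9;"),
        for $letter in $letters/letter
        return string-join(($letter/@from, $letter/@to), "&#9;")
    ),
    "&#10;"
)
```

The query names exactly the nodes it needs and never visits the rest of the document, which is the "pull" style described above.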

I have long favored XSLT, which I was able to get a handle on after committing a day or two to experimentation to really understand what was going on. I struggled much more with XQuery, where I find I need more trial and error for any given task than I do in XSLT. This problem is, of course, getting worse and worse, because I almost always use XSLT for any given task and therefore have to refamiliarize myself with XQuery every time I try to do anything with it.

Over the years, I've felt that XQuery has been more popular than I expected among students in the course. It really boils down to personal preference and what ends up clicking for you.


@JosephDRogers23 As you might imagine, opinions and preferences often diverge more than the languages themselves. I just read two postings from XQuery partisans: https://en.wikibooks.org/wiki/XQuery/Typeswitch_Transformations and https://en.wikibooks.org/wiki/XQuery/Benefits, and the arguments, while accurate in their details, draw inappropriately general conclusions from individual experience. For example, XQuery syntax is certainly easier for an SQL programmer to learn than XSLT, but the fact that XSLT is written in XML syntax isn’t a problem for those who are comfortable with XML syntax. Additionally, we sometimes write XSLT to create XSLT, and since XSLT excels at outputting XML syntax, it turns out to be an advantage that XSLT is expressed in XML.

The typeswitch() expression in XQuery, which is the focus of the two links above, is a way of mimicking XSLT templates in XQuery, which is to say that it is not especially familiar to SQL programmers. And while switch() operations may be familiar to programmers in many languages, typeswitch() requires a different level of abstraction. Learning to use typeswitch() isn’t difficult, but it isn’t consistent to argue that XQuery is likely to be more familiar than XSLT to those who are new to both, which is true in many ways, while understating how unusual typeswitch() is, given its singular importance in using XQuery as an XSLT replacement. Ultimately, XSLT is declarative in its bones, while XQuery’s reliance on FLWOR as its principal method of flow control is fundamentally imperative. And just as one can write declarative XQuery with typeswitch(), one can write imperative XSLT with <xsl:for-each>. Both are legitimate uses of their respective languages, but neither is especially integral to the rest of the general language paradigm.
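To make the comparison concrete, here is a minimal sketch of typeswitch standing in for a set of XSLT templates. The function names and the element names (title, para, emph) are hypothetical and chosen only for illustration.

```xquery
xquery version "3.1";

(: Hypothetical dispatcher: one recursive function plays the role that a
   set of XSLT templates would play. Element names are illustrative. :)
declare function local:dispatch($node as node()) as item()* {
    typeswitch ($node)
        case text() return $node
        case element(title) return <h1>{ local:pass($node) }</h1>
        case element(para) return <p>{ local:pass($node) }</p>
        case element(emph) return <em>{ local:pass($node) }</em>
        (: Fallback: drop the wrapper but keep processing the children. :)
        default return local:pass($node)
};

(: Rough counterpart of <xsl:apply-templates/>: recurse into the children. :)
declare function local:pass($node as node()) as item()* {
    for $child in $node/node()
    return local:dispatch($child)
};

local:dispatch(
    <doc>
        <title>Sample</title>
        <para>Some <emph>marked-up</emph> text.</para>
    </doc>
)
```

Each case clause corresponds loosely to a template's match pattern, but the recursion that XSLT performs implicitly has to be written out by hand here.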

@ebeshero sums up the comparison well when she writes that “one’s perception of ‘ease’ probably has to do with the idiom one is most accustomed to for thinking.” My own experience of feeling more comfortable with XSLT is similar to @jachinn’s, but I often use small XQuery scripts during development when what I need to explore is too complex to fit in <oXygen/>’s XPath browser widget. When I use XQuery in production to create HTML, I almost always pipe the XQuery result into XSLT before returning it. But that won’t necessarily be the best approach for everyone. For example, if you’re the type of XSLT programmer who thinks every XSLT transformation problem is best solved with <xsl:for-each> (= nobody in our courses!), you’d probably be happier doing everything in XQuery, and that wouldn’t be a wrong choice.

One painful difference between XSLT and XQuery that I think works out in favor of XSLT is that XSLT allows you to set separate default element namespaces for the input and output, which reduces the need to set namespaces explicitly in your code, whether with namespace declarations or prefixes. XQuery supports only one default element namespace, and if you set it, it applies to both input and output. Even if your input is in no namespace and your output is, say, HTML, you’re really using two namespaces, because you need to distinguish “no namespace” explicitly from the HTML namespace. This is why @rmb165 and @Idi0teque weren’t getting output in their XQuery 3 code; they set the HTML namespace explicitly to govern the output, and therefore could not match elements in no namespace in the input without some awkward hoop-jumping. You can work around the XQuery limitation, but the XSLT facility is cleaner and easier to maintain.
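One possible workaround, sketched under the assumption of an XQuery 3.0 or later processor: keep the HTML default for the output and name the no-namespace input elements explicitly with the Q{} notation. The input data here is hypothetical.

```xquery
xquery version "3.1";

declare default element namespace "http://www.w3.org/1999/xhtml";

(: Hypothetical input, deliberately placed in no namespace. The xmlns=""
   is needed because the default declared above would otherwise apply
   to this constructor too. :)
declare variable $input :=
    <people xmlns="">
        <person>Ada</person>
        <person>Ben</person>
    </people>;

<ul>{
    (: $input/person would look for person in the HTML namespace and find
       nothing. Q{}person names the element in no namespace explicitly. :)
    for $p in $input/Q{}person
    return <li>{ string($p) }</li>
}</ul>
```

The wildcard form `*:person` would also match, at the cost of matching a person element in any namespace.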

@JosephDRogers23 @jachinn After taking @djbpitt's course in Spring 2013, I remember being struck by how quickly I could generate simple and useful outputs from XQuery and I wanted to explore it more--to my mind, it felt like skating while XSLT was always plodding. I commit far more typos in XSLT than I do in XQuery, for example, trying to figure out where to set those pesky single vs. double quotation marks. There is something to be said for pithy, precise efficiency in one's code.

That said, my experience in developing my own DH courses engaged me in an experiment of a few years running: I wondered whether, really, one could do everything just as well in either XSLT or XQuery. What I found at last was no, not quite. XSLT is far more efficient for "push" processing, and if you recall, that's exactly what we do when we start learning XSLT. What I mean by efficient is simply that you don't have to write as much code to produce an XSLT identity transformation, or an XSLT-to-HTML transformation that captures your XML to display and style just about all of it in a "reading view" as you would have to write in XQuery to perform the same tasks.
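For comparison, a minimal sketch of an identity transformation written in XQuery. The function name is hypothetical, and the sample input is only there to make the query self-contained.

```xquery
xquery version "3.1";

(: Hypothetical helper: rebuild every element by hand, copying its
   attributes and recursing into its children. :)
declare function local:copy($node as node()) as node() {
    if ($node instance of element()) then
        element { node-name($node) } {
            $node/@*,
            $node/node() ! local:copy(.)
        }
    else
        (: Text, comments, and processing instructions pass through as-is. :)
        $node
};

local:copy(<doc id="d1"><p>Hello <b>world</b></p></doc>)
```

On its own this is not much code, but every exception to the plain copy means another branch inside the same function, whereas in XSLT each exception is a separate template added alongside the identity rule.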

I investigated typeswitch() in XQuery just to see whether it made sense to eliminate XSLT in my spring course altogether. I found that no, I could not in good conscience do this, and that I would spend a lot more time explaining how the strange typeswitch expression works than I had to do to explain how XSLT works. XSLT as a declarative language has real strengths here.

As for the namespace issues, I think anything that makes us conscious of namespaces is good for us--we shouldn't be taking them for granted anyway, and XQuery just necessitates more namespace prefixes than XSLT does. It's weird but I wouldn't call it painful necessarily--if you've chosen it for your idiom. My advice for members of a project team who ran into the same problem described above (with XML in no namespace) was that they could create their own project namespace and then bind a meaningful prefix of their own to it for the input. All it takes to create a namespace is to write it into a namespace declaration (an xmlns attribute) on your XML root element.

In the end, as someone who has conducted a determined experiment of a few years running, I find that I love XQuery for what it does best, pull processing (and it's my preferred idiom by far when I make SVG graphs and when I teach SVG). And I will always put it aside and return to XSLT when I need to make a reading view or an identity transformation.

@JosephDRogers23 One more practical difference between XSLT and XQuery, courtesy of Michael Kay at https://stackoverflow.com/questions/21021770/run-saxon-xquery-over-batch-of-xml-files-and-produce-one-output-file-for-each-in:

The capability to produce multiple output files from a single query is not present in the XQuery language (only in XSLT), and the capability to process a batch of input files is not present in Saxon's XQuery command line (only in the XSLT command line).

eXist has facilities that will let you produce multiple output files from a single XQuery, but XQuery itself doesn’t. And you can process multiple input files in Saxon’s XQuery command line if you get your input not from the command line, but from a collection() expression inside the XQuery. But Saxon’s XSLT lets you batch transform files from the command line with:
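A minimal sketch of the collection() approach, assuming Saxon, which accepts a directory URI with a `?select=` pattern as the collection URI (a Saxon convention, not standard XQuery). The directory path is a hypothetical placeholder.

```xquery
xquery version "3.1";

(: Hypothetical directory URI. The ?select=*.xml query parameter is a
   Saxon convention for picking files out of a directory. :)
for $doc in collection("file:///path/to/directory_a?select=*.xml")
return
    (: One summary element per input file; all results still go to a
       single output, since XQuery itself cannot write several files. :)
    <file name="{ base-uri($doc) }" elements="{ count($doc//*) }"/>
```

This handles a batch of inputs in one query but still yields a single result, so it does not replace the one-output-per-input behavior of the XSLT command line.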

saxon -s:directory_a -xsl:stylesheet.xsl -o:directory_b

This transforms each file in directory_a individually using stylesheet.xsl and writes the output into directory_b. Saxon’s XQuery command-line interface doesn’t support specifying directories (rather than individual files) as the input and output values.

obdurodon / dh_course

XSLT vs XQuery #70