Open wshager opened 10 years ago
We would need to create a Resolver that can access the eXist database and configure Saxon to use it.
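A minimal sketch of what such a resolver could look like, using only the standard JAXP `URIResolver` interface. The `XmldbUriResolver` and `ResolverDemo` names are hypothetical, and the in-memory `Map` merely stands in for the eXist database; this is not eXist's actual API. The JDK's built-in XSLT 1.0 processor is used here instead of Saxon so the example runs without extra dependencies.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.Map;
import javax.xml.transform.Source;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.URIResolver;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Resolves xmldb: URIs against a stand-in document store (hypothetical, not eXist's API). */
class XmldbUriResolver implements URIResolver {
    private static final String SCHEME = "xmldb://";
    private final Map<String, String> store; // path -> serialized XML; stands in for the database

    XmldbUriResolver(Map<String, String> store) {
        this.store = store;
    }

    @Override
    public Source resolve(String href, String base) throws TransformerException {
        if (!href.startsWith(SCHEME)) {
            return null; // not ours: let the processor fall back to its default resolution
        }
        String path = href.substring(SCHEME.length()); // "xmldb:///db/a.xml" -> "/db/a.xml"
        String xml = store.get(path);
        if (xml == null) {
            throw new TransformerException("No such document: " + href);
        }
        return new StreamSource(new StringReader(xml), href);
    }
}

class ResolverDemo {
    /** Runs a tiny stylesheet that reads a document through the resolver. */
    static String run() {
        Map<String, String> store = Map.of("/db/test/a.xml", "<greeting>hello</greeting>");
        String xslt =
            "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
            + "<xsl:output omit-xml-declaration='yes'/>"
            + "<xsl:template match='/'><out>"
            + "<xsl:value-of select=\"document('xmldb:///db/test/a.xml')/greeting\"/>"
            + "</out></xsl:template></xsl:stylesheet>";
        try {
            URIResolver resolver = new XmldbUriResolver(store);
            TransformerFactory factory = TransformerFactory.newInstance();
            factory.setURIResolver(resolver); // consulted for xsl:import and xsl:include
            Transformer transformer =
                factory.newTransformer(new StreamSource(new StringReader(xslt)));
            transformer.setURIResolver(resolver); // consulted for document() at run time
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader("<in/>")),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Returning `null` for URIs outside the `xmldb:` scheme leaves other lookups to the processor's default behaviour.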
I have assigned this to eXist-3.0 as (I guess, without investigating) it could introduce backwards compatibility issues for those already using various URIs in their fn:doc and fn:collection statements in XSLT executed with Saxon in eXist.
I am pretty sure such a resolver already eXists; I coded it myself :-) I'll try to dig it up :-)
This should work. @wshager can you confirm that this actually does not work in the latest develop branch?
I'll check, but this issue was opened at @adamretter's request, so perhaps he had different intentions for the resolver, i.e. replacing the current one.
@wshager Do you have tests that show this doesn't work?
AFAIK doc() does work, but collection() doesn't. Here's the stack trace for <xsl:for-each select="collection('xmldb:///db/test/collection')"><xsl:copy-of select="."/></xsl:for-each>
2015-04-08 22:03:53,211 [eXistThread-233] WARN (Transform.java [fatalError]:816) - XSL transform reports fatal error: Error reported by XML parser ; Line#: -1; Column#: -1
net.sf.saxon.trans.XPathException: Error reported by XML parser
    at net.sf.saxon.lib.StandardErrorHandler.reportError(StandardErrorHandler.java:95)
    at net.sf.saxon.lib.StandardErrorHandler.fatalError(StandardErrorHandler.java:80)
    at org.apache.xerces.util.ErrorHandlerWrapper.fatalError(Unknown Source)
    at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
    at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
    at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
    at org.apache.xerces.impl.XMLVersionDetector.determineDocVersion(Unknown Source)
    at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
    at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
    at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
    at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
    at org.apache.xerces.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source)
    at net.sf.saxon.event.Sender.sendSAXSource(Sender.java:405)
    at net.sf.saxon.event.Sender.send(Sender.java:178)
    at net.sf.saxon.Configuration.buildDocument(Configuration.java:3516)
    at net.sf.saxon.lib.StandardCollectionURIResolver.catalogContents(StandardCollectionURIResolver.java:236)
    at net.sf.saxon.lib.StandardCollectionURIResolver.resolve(StandardCollectionURIResolver.java:122)
    at net.sf.saxon.functions.Collection.iterate(Collection.java:106)
    at net.sf.saxon.expr.instruct.ForEach.processLeavingTail(ForEach.java:414)
    at net.sf.saxon.expr.instruct.Template.applyLeavingTail(Template.java:212)
    at net.sf.saxon.trans.Mode.applyTemplates(Mode.java:1034)
    at net.sf.saxon.Controller.transformDocument(Controller.java:1959)
    at net.sf.saxon.TransformerHandlerImpl.endDocument(TransformerHandlerImpl.java:148)
    at org.exist.util.serializer.ReceiverToSAX.endDocument(ReceiverToSAX.java:85)
    at org.exist.storage.serializers.XIncludeFilter.endDocument(XIncludeFilter.java:165)
    at org.exist.storage.serializers.Serializer.toSAX(Serializer.java:931)
    at org.exist.xquery.functions.transform.Transform.eval(Transform.java:266)
    at org.exist.xquery.BasicFunction.eval(BasicFunction.java:70)
    at org.exist.xquery.InternalFunctionCall.eval(InternalFunctionCall.java:56)
    at org.exist.xquery.AbstractExpression.eval(AbstractExpression.java:71)
    at org.exist.xquery.PathExpr.eval(PathExpr.java:264)
    at org.exist.xquery.AbstractExpression.eval(AbstractExpression.java:71)
    at org.exist.xquery.XQuery.execute(XQuery.java:297)
    at org.exist.xquery.XQuery.execute(XQuery.java:217)
    at org.exist.http.servlets.XQueryServlet.process(XQueryServlet.java:491)
    at org.exist.http.servlets.XQueryServlet.doPost(XQueryServlet.java:197)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:755)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:848)
    at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:669)
    at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:457)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
    at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:575)
    at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
    at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)
    at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)
    at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
    at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
    at org.eclipse.jetty.server.Dispatcher.forward(Dispatcher.java:229)
    at org.eclipse.jetty.server.Dispatcher.forward(Dispatcher.java:103)
    at org.exist.http.urlrewrite.Forward.doRewrite(Forward.java:50)
    at org.exist.http.urlrewrite.XQueryURLRewrite.doRewrite(XQueryURLRewrite.java:556)
    at org.exist.http.urlrewrite.XQueryURLRewrite.service(XQueryURLRewrite.java:356)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:848)
    at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:669)
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1448)
    at de.betterform.agent.web.filter.XFormsFilter.doFilter(XFormsFilter.java:164)
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
    at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
    at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:533)
    at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
    at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)
    at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)
    at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
    at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
    at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
    at org.eclipse.jetty.server.Server.handle(Server.java:368)
    at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:488)
    at org.eclipse.jetty.server.AbstractHttpConnection.content(AbstractHttpConnection.java:943)
    at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.content(AbstractHttpConnection.java:1004)
    at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:861)
    at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:240)
    at org.eclipse.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:82)
    at org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:628)
    at org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:52)
    at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
    at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.xml.sax.SAXParseException: Premature end of file.
    at org.apache.xerces.util.ErrorHandlerWrapper.createSAXParseException(Unknown Source)
    ... 79 more
collection() will never work, right? We can stream out XML data, but we can't give Saxon access to the eXist-db internals. Different products, different domains. It is a fundamental thingy ...
@dizzzz I'd probably never need collection() ...
@dizzzz I don't see why we could not do collection(). It just resolves a URI to n nodes.
Just gave it another thought. It will require changes in Saxon.
Well, there you have it already :-) I did not see it before, but how would we deal with a collection like /db? That would load the whole content of the database into Saxon! That sounds very wrong to me; good luck to those who will have to answer the same problems over and over again on the mailing list.
No, a 'per document' approach sounds good to me; we should not simply support every wish from our users.
If we were to add collection() support, I'd like it to be configurable and switched off by default.
IMO these are two separate worlds, we should keep healthy distance.
Just for the record, here is a use case: http://stackoverflow.com/questions/26276705/access-to-filesystem-from-exist-xslt-find-html-myid-html-with-collecti
Second, my lay knowledge/reading stumbled upon a passage in the CollectionURIResolver interface description that Dimitri pointed to, which (perhaps) said that the resolver could return a sequence of URIs to documents and need not necessarily "load the whole content of the database" - but I might have gotten this wrong or it may not mean any simplification after all...
Ah, no: the resolver returns a sequence of Saxon Items, one document node per document, and each Item shall contain all the data of the referenced document.
@dizzzz I think you are wrong twice here. First, "The items returned by this iterator must be instances either of xs:anyURI, or of node() (specifically, NodeInfo)", so it can be a sequence of URLs. Second, if people do stupid things, it is not possible to stop them; even now there are plenty of ways to do that. But if someone understands all the effects and wants to use the feature to keep their code simple, why stand in that person's way?
eXist decided not to develop its own XSLT transformer and to ship and use Saxon instead; given that, the integration must be complete.
PS: The huge number of Java features that can be used wrongly drives me crazy -)
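The "sequence of URIs" idea can be sketched in plain Java as follows. Everything here is an assumption made for illustration: `CollectionUriLister` is a hypothetical class, not Saxon's `CollectionURIResolver`, and the `Map` stands in for the database. The point is only that listing a collection does not have to read any document content, and that the feature can be switched off by default as suggested earlier in the thread.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

/** Hypothetical stand-in for a collection resolver; NOT Saxon's CollectionURIResolver. */
class CollectionUriLister {
    private static final String SCHEME = "xmldb://";
    private final Map<String, String> store; // document path -> serialized XML
    private final boolean enabled;           // false unless explicitly switched on

    CollectionUriLister(Map<String, String> store, boolean enabled) {
        this.store = store;
        this.enabled = enabled;
    }

    /** Returns the URIs of the documents directly inside the collection, reading none of them. */
    List<String> resolve(String collectionUri) {
        if (!enabled) {
            throw new IllegalStateException("collection() support is disabled");
        }
        if (!collectionUri.startsWith(SCHEME)) {
            throw new IllegalArgumentException("Not an xmldb URI: " + collectionUri);
        }
        String path = collectionUri.substring(SCHEME.length());
        String prefix = path.endsWith("/") ? path : path + "/";
        return store.keySet().stream()
            // keep direct children only: nothing from sub-collections
            .filter(p -> p.startsWith(prefix) && p.indexOf('/', prefix.length()) < 0)
            .sorted()
            .map(p -> SCHEME + p)
            .collect(Collectors.toList());
    }
}

class CollectionDemo {
    static List<String> run() {
        Map<String, String> store = Map.of(
            "/db/test/collection/a.xml", "<a/>",
            "/db/test/collection/b.xml", "<b/>",
            "/db/other/c.xml", "<c/>");
        return new CollectionUriLister(store, true).resolve("xmldb:///db/test/collection");
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Each returned URI could then be handed to a per-document resolver, so a document is only fetched when the stylesheet actually touches it.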
@shabanovd Good to know about the URI sequence; that makes it a bit better.
It still does not convince me: it blurs the distinction between eXist-db and Saxon, and why enlarge the 'attack vector' for crashing eXist-db?
And... the first question on exist-open will be: why are Saxon collection() operations so slow? The first bug report will appear (since all subsequent operations are again done on XML byte streams).
With a little bit of work they don't need to be done on byte streams. We could implement NodeInfo with lazy evaluation.
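As a rough illustration of the lazy-evaluation idea only: the sketch below defers loading a document until its content is first requested, then caches it. `LazyDocument` is a hypothetical class and does not implement Saxon's `NodeInfo`; a real implementation would expose node axes over eXist's stored nodes rather than a string.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Hypothetical lazily loaded document handle; it does NOT implement Saxon's NodeInfo. */
final class LazyDocument {
    private final String uri;
    private final Supplier<String> loader; // stands in for a read from the database
    private String content;
    private boolean loaded;

    LazyDocument(String uri, Supplier<String> loader) {
        this.uri = uri;
        this.loader = loader;
    }

    /** Cheap: available without touching storage. */
    String uri() {
        return uri;
    }

    /** Loads the content on first access and caches it for later calls. */
    synchronized String content() {
        if (!loaded) {
            content = loader.get();
            loaded = true;
        }
        return content;
    }

    synchronized boolean isLoaded() {
        return loaded;
    }
}

class LazyDemo {
    /** Returns how many times the loader ran after two content() calls. */
    static int run() {
        AtomicInteger loads = new AtomicInteger();
        LazyDocument doc = new LazyDocument("xmldb:///db/test/a.xml", () -> {
            loads.incrementAndGet();
            return "<greeting>hello</greeting>";
        });
        doc.uri();     // no load happens here
        doc.content(); // first access triggers the load
        doc.content(); // second access is served from the cache
        return loads.get();
    }

    public static void main(String[] args) {
        System.out.println("loader invocations: " + run());
    }
}
```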
Adam Retter
Saxon or other XSLT engines could benefit from resolving docs and collections from eXist's storage directly.