clarin-eric / standards

work space for the Standards and Interoperability Committee
https://www.clarin.eu/content/standards

validation errors when editing the otherwise working source #177

Open bansp opened 1 year ago

bansp commented 1 year ago

I get parsing errors from Saxon 11 when opening the source files in oXygen (potentially not a biggie -- it's not written for Saxon, although perhaps there could be a way to placate the parser -- it complains e.g. about the relative placements of declaration and inheritance statements and about built-ins not being declared) but also when I try to validate the files directly with eXist (from within oXygen, using the DB connection). The latter are seemingly minor, like not being able to resolve module references but I can't see an immediate solution given that the eXist documentation suggests that proper path resolution should be taking place, and we see that the system works, so it is taking place. (Maybe the basis for resolution changes for in-eXist calls vs. editing the source, but I would expect the validator to handle such stuff!)

All in all, the entry barriers to our code are not as low as they should be, if we would like this to become a team effort (deriving documentation is another grain of salt, see #176 ). So perhaps, at some point, we can talk about this and potentially divide tasks that could be undertaken to improve on the current state of things. No big rush, but I'm making a note of that.

bansp commented 1 year ago

Eliza doesn't get these errors, so probably Piotr has some wrong settings. Add some screenshots for Eliza to see.

bansp commented 1 year ago

Here's one, where the validation engine is set to Saxon 11.4 HE: image

The last URL points at http://www.w3.org/TR/xpath20/#ERRXPST0003 -- so from an XQuery 3.0 document validation, we are sent to XPath 2.0, and there, it's an EBNF issue, except the EBNF pointed at seems to have little to do with XQuery syntax. I'm lost.

bansp commented 1 year ago

And the next screenshot is after I have reset the validation engine to eXist (6.2.0). I also capture info showing that the paths are correct, so ../modules/menu.xql is where it should be. Of course, we know about that also from the fact that the source is actually running with no problems.

image

I have searched for the error code (XQST0059), and came across a 7-yr old discussion, https://exist-open.narkive.com/MqdVC0Gt/strange-error-in-exist-err-xqst0059 where Joe Wicentowski mentions, among others:

I traced this error code to a series of conditional checks in XQueryContext.java. Here are the two conditions that could lead to this specific "does not refer to an XQuery" message:

  • if 'app.xql' is not a binary file - namely, if the database thinks it's an XML document
  • if 'app.xql' has a mime type other than 'application/xquery'

I checked with eXide that the mime type is correct. I haven't checked the binary status.

bansp commented 1 year ago

Further, if I open (originally) the same file after it has been ingested into eXist, via the DB connection in oXygen, I see no errors there. Which again points to my oXygen as the culprit. Is it a valid hypothesis that something happens upon ingest (mime-type adjustment? but, EBNF is not conditioned by mime-types...) that turns our XQuery from invalid to valid?

Incidentally, I have located the EBNF that Saxon should direct us to: https://www.w3.org/TR/xquery-30/#id-grammar

bansp commented 1 year ago

More on the EBNF, after a quick scan; look at the three bolded elements:

MainModule ::= Prolog QueryBody
Prolog ::= ((DefaultNamespaceDecl | Setter | NamespaceDecl | Import) Separator)* ((ContextItemDecl | AnnotatedDecl | OptionDecl) Separator)*

The following are only pasted for the sake of completeness, there is nothing wrong with them, I think:

NamespaceDecl ::= "declare" "namespace" NCName "=" URILiteral
Import ::= SchemaImport | ModuleImport
ModuleImport ::= "import" "module" ("namespace" NCName "=")? URILiteral ("at" URILiteral ("," URILiteral)*)?
OptionDecl ::= "declare" "option" EQName StringLiteral


We seem to be looking at a sequence of NamespaceDecl | Import followed by OptionDecl -- right?? The source has a different ordering. But when I order as the grammar would wish (if I interpret it correctly), I get the following error:

image

(I'm unable to find anything on "No Configuration available in StandardModuleResolver" on the Web.)

And if I reset the validation engine to eXist again, I get the same error code as before, but a different "String" value, whatever that is:

image

Now, would it be a valid hypothesis that the statement

import module namespace menu = "http://clarin.ids-mannheim.de/standards/menu" at "../modules/menu.xql";

is not fully kosher, but eXist applies some heuristic error-fixing upon ingest?

bansp commented 1 year ago

One final datum is that I have copied the file with the rearranged declarations into eXist, and (1) it validates there and (2) the recommendation page loads. Can't see any error log / console in eXide or else I'd try to say more on that.

And the final thought before I stop and wait for Eliza's reaction: could the culprit be

(: Local Base URL :)
declare variable $app:base as xs:string := "http://localhost:8889/exist/apps/clarin/";

in app.xql, somehow? By creating some kind of mismatch that the static setup in the working directory doesn't cope with but the dynamic setup of the database does? I may be saying complete rubbish here, of course.

bansp commented 1 year ago

I don't think my question above is relevant. However, let me paste two more pieces, from controller.xql, this time:

Saxon: image

eXist: image

One thing that this has in common with the previous case is that variables are declared as part of the AnnotatedDecl substitution, and its ordering in the grammar is after inherit declarations. So, possibly, we're looking at something similar.

After I have moved the variable declarations below the import statement, eXist still produces the same error, but Saxon says something a bit different, because it apparently resolves the inherit:

image

and, in a way, we're back to app.xql again.

margaretha commented 1 year ago

I have investigated the issue and can confirm the problem. It is strange because I have my default validation engine set to Saxon as shown in the picture below: image

and the automatic validation with the default engine is successful, but when the validation engine is set to Saxon in the custom validation scenario as shown below: image

the automatic validation has errors.

Typical errors are namespace declarations that should be declared by default in exist. Quoted from exist doc: https://exist-db.org/exist/apps/doc/xquery

The exist prefix is bound to the namespace http://exist.sourceforge.net/NS/exist. This is declared by default and need not be specified explicitly.

I also found that the files should probably be validated using eXist-db's XQuery engine according to the doc: https://exist-db.org/exist/apps/doc/oxygen. My validation with exist-db localhost has no errors, also for controller.xql.

margaretha commented 1 year ago

I also tried to fix the errors from Saxon by including the namespaces, but somehow it still cannot find a function that should be there.

image

bansp commented 1 year ago

I have in the meantime received some hints from @daliboris , which I'm sharing below (with thanks! :-)). My question was about how to placate the parser when it complains about namespaces for undeclared internal modules.

See here:

https://github.com/eeditiones/tei-publisher-app/blob/master/modules/custom-api.xql

1) relative paths to XQuery files

import module namespace config="http://www.tei-c.org/tei-simple/config" at "config.xqm";

import module namespace dapi="http://teipublisher.com/api/documents" at "lib/api/document.xql";

2) importing libraries that are installed in eXist-db as modules

import module namespace errors = "http://e-editiones.org/roaster/errors";
import module namespace rutil="http://e-editiones.org/roaster/util";

Hope this helps,

Boris

bansp commented 1 year ago

I would like to maintain the priority label here, because while it's not a blocker, it's probably a clear case of need-of-maintenance, to rearrange the top-level statements according to the standard grammar, even if eXist groks the ungrammatical ordering. Otherwise it's going to surface with the first person who clones the repo in order to help out.
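The ordering rule from the grammar can be checked mechanically before a file is opened in oXygen. Below is a minimal sketch in Python, not part of the repository: the function name `misplaced_declarations`, the regular expressions, and the sample query are assumptions made for illustration. It works line by line, so it is only a heuristic and will miss declarations that span several lines or are commented out. It reports every declaration of the first Prolog group (setters, namespace declarations, imports) that appears after a declaration of the second group (variables, functions, options).

```python
import re

# First Prolog group in the XQuery 3.0 grammar:
# DefaultNamespaceDecl | Setter | NamespaceDecl | Import
FIRST_GROUP = re.compile(
    r"^\s*(import\s+(module|schema)\b"
    r"|declare\s+namespace\b"
    r"|declare\s+default\s+(element|function)\s+namespace\b"
    r"|declare\s+default\s+(collation|order|decimal-format)\b"
    r"|declare\s+(boundary-space|base-uri|construction|ordering"
    r"|copy-namespaces|decimal-format)\b)"
)

# Second Prolog group: ContextItemDecl | AnnotatedDecl | OptionDecl
# ("%" covers annotated declarations such as "declare %private function").
SECOND_GROUP = re.compile(
    r"^\s*declare\s+(variable|function|option|context\s+item|%)"
)


def misplaced_declarations(source: str) -> list[tuple[int, str]]:
    """Return (line number, text) for first-group declarations that
    follow a second-group declaration. Line-based heuristic only."""
    seen_second = False
    problems = []
    for number, line in enumerate(source.splitlines(), start=1):
        if SECOND_GROUP.match(line):
            seen_second = True
        elif seen_second and FIRST_GROUP.match(line):
            problems.append((number, line.strip()))
    return problems


if __name__ == "__main__":
    # Hypothetical sample: a variable declared before an import.
    sample = (
        'xquery version "3.0";\n'
        'declare variable $app:base as xs:string := "http://localhost:8889/exist/apps/clarin/";\n'
        'import module namespace menu = "http://clarin.ids-mannheim.de/standards/menu" at "../modules/menu.xql";\n'
    )
    for number, text in misplaced_declarations(sample):
        print(f"line {number}: {text}")
```

Running something like this over the module files would give a rough list of the statements to move, without needing either validation engine.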

bansp commented 12 months ago

Let's work on this one in a separate branch.

margaretha commented 10 months ago

I have investigated the issue further and found that the problem is indeed caused by missing eXist libraries. I have tried to import them using the different methods below, but unfortunately none of them works.

  1. Following the suggestion from Boris doesn't work.
  2. Trying all possible import statements according to the eXist documentation https://exist-db.org/exist/apps/doc/xquery#module-system does not work because Saxon cannot locate the eXist libraries.

     import module namespace request="http://exist-db.org/xquery/request"
     at "http://exist-db.org/xquery/request";

     produces the following error because the URL doesn't exist:

     image

     Using the keyword java: or resource: doesn't work because they are not recognized by Saxon.

     import module namespace request="http://exist-db.org/xquery/request"
     at "java:org.exist.xquery.functions.request.RequestModule";

     produces the following error

     image

  3. Simply declaring the namespace, because the eXist modules we use are pre-loaded modules, e.g.

     declare namespace request="http://exist-db.org/xquery/request";

     produces the following error:

     Cannot find a 2-argument function named Q{http://exist-db.org/xquery/request}get-parameter()

  4. Including the eXist library in the validation scenario doesn't make any difference.
  5. Running Saxon manually in a terminal produces the same errors, e.g.

     java -cp saxon-he-11.6.jar:/opt/existdb/6.2.0/lib/exist-core-6.2.0.jar net.sf.saxon.Query -q:test.xq
     Static error on line 34 column 33 of test.xq:
       XPST0017  Cannot find a 2-argument function named
       Q{http://exist-db.org/xquery/request}get-parameter()
     Static error(s) in query
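
To see how many eXist extension functions Saxon fails to resolve across the code base, the command-line output can be summarised. The following is a rough sketch in Python and only an assumption about how one might triage the messages: the helper name `missing_functions` and the regular expression are made up for illustration and are matched against the XPST0017 wording quoted in this thread, not against any documented Saxon output format.

```python
import re
from collections import Counter

# Matches messages of the form quoted in this thread, e.g.
#   Cannot find a 2-argument function named
#   Q{http://exist-db.org/xquery/request}get-parameter()
# The message may be wrapped over two lines, hence the \s+ and \s*.
PATTERN = re.compile(
    r"Cannot find a (\d+)-argument function named\s+"
    r"Q\{([^}]*)\}\s*([\w.-]+)\(\)"
)


def missing_functions(saxon_output: str) -> Counter:
    """Count unresolved functions as (namespace URI, local name, arity)."""
    return Counter(
        (uri, name, int(arity))
        for arity, uri, name in PATTERN.findall(saxon_output)
    )


if __name__ == "__main__":
    # Output copied from the terminal run described in this thread.
    log = (
        "Static error on line 34 column 33 of test.xq:\n"
        "  XPST0017  Cannot find a 2-argument function named\n"
        "  Q{http://exist-db.org/xquery/request}get-parameter()\n"
        "Static error(s) in query\n"
    )
    for (uri, name, arity), count in missing_functions(log).items():
        print(f"{count} x {{{uri}}}{name}#{arity}")
```

Grouping the results by namespace URI would show which eXist modules are involved, which may help when deciding whether dual compatibility is realistic.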

daliboris commented 8 months ago

Maybe the load-xquery-module function could help you with this issue, but it requires Saxon Enterprise Edition. It is part of XQuery 3.1; it is also available in eXist-db but is missing in BaseX.

I just noticed that you use Saxon EE in your tests, so that might be the way to go.

Documentation and examples:

margaretha commented 7 months ago

@daliboris Thank you for your suggestion! I have tried using load-xquery-module; it works for the fn library but unfortunately not for the eXist libraries.

declare variable $menu:fn-module := load-xquery-module("http://www.w3.org/2005/xpath-functions");
declare variable $menu:request-module := load-xquery-module("http://exist-db.org/xquery/request");
let $current-date := $menu:fn-module?functions(xs:QName('fn:current-date'))?0()

This one works as expected.

let $server-name := $menu:request-module?functions?(xs:QName("request:get-server-name"))?0()

This one returns the following error

HTTP ERROR 500 javax.servlet.ServletException: javax.servlet.ServletException: An error occurred while processing request to /exist/apps/clarin/: An error occurred: Cannot invoke "org.exist.source.Source.pathOrShortIdentifier()" because the return value of "org.exist.xquery.UserDefinedFunction.getSource()" is null
URI:    /exist/apps/clarin/
STATUS: 500
MESSAGE:    javax.servlet.ServletException: javax.servlet.ServletException: An error occurred while processing request to /exist/apps/clarin/: An error occurred: Cannot invoke "org.exist.source.Source.pathOrShortIdentifier()" because the return value of "org.exist.xquery.UserDefinedFunction.getSource()" is null
SERVLET:    XQueryURLRewrite
CAUSED BY:  javax.servlet.ServletException: javax.servlet.ServletException: An error occurred while processing request to /exist/apps/clarin/: An error occurred: Cannot invoke "org.exist.source.Source.pathOrShortIdentifier()" because the return value of "org.exist.xquery.UserDefinedFunction.getSource()" is null
CAUSED BY:  javax.servlet.ServletException: An error occurred while processing request to /exist/apps/clarin/: An error occurred: Cannot invoke "org.exist.source.Source.pathOrShortIdentifier()" because the return value of "org.exist.xquery.UserDefinedFunction.getSource()" is null
CAUSED BY:  javax.servlet.ServletException: An error occurred: Cannot invoke "org.exist.source.Source.pathOrShortIdentifier()" because the return value of "org.exist.xquery.UserDefinedFunction.getSource()" is null
CAUSED BY:  java.lang.NullPointerException: Cannot invoke "org.exist.source.Source.pathOrShortIdentifier()" because the return value of "org.exist.xquery.UserDefinedFunction.getSource()" is null
Caused by:

javax.servlet.ServletException: javax.servlet.ServletException: An error occurred while processing request to /exist/apps/clarin/: An error occurred: Cannot invoke "org.exist.source.Source.pathOrShortIdentifier()" because the return value of "org.exist.xquery.UserDefinedFunction.getSource()" is null
    at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:162)
    at org.eclipse.jetty.server.handler.gzip.GzipHandler.handle(GzipHandler.java:772)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
    at org.eclipse.jetty.server.Server.handle(Server.java:516)
    at org.eclipse.jetty.server.HttpChannel.lambda$handle$1(HttpChannel.java:487)
    at org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:732)
    at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:479)
    at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:277)
    at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311)
    at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:105)
    at org.eclipse.jetty.io.ChannelEndPoint$1.run(ChannelEndPoint.java:104)
    at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:338)
    at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:315)
    at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:173)
    at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.produce(EatWhatYouKill.java:137)
    at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:883)
    at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1034)
    at java.base/java.lang.Thread.run(Thread.java:833)
Caused by: javax.servlet.ServletException: An error occurred while processing request to /exist/apps/clarin/: An error occurred: Cannot invoke "org.exist.source.Source.pathOrShortIdentifier()" because the return value of "org.exist.xquery.UserDefinedFunction.getSource()" is null
    at org.exist.http.urlrewrite.XQueryURLRewrite.service(XQueryURLRewrite.java:366)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
    at org.eclipse.jetty.servlet.ServletHolder$NotAsync.service(ServletHolder.java:1450)
    at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:799)
    at org.eclipse.jetty.servlet.ServletHandler$ChainEnd.doFilter(ServletHandler.java:1656)
    at org.eclipse.jetty.websocket.server.WebSocketUpgradeFilter.doFilter(WebSocketUpgradeFilter.java:292)
    at org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:193)
    at org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1626)
    at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:552)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
    at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:571)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
    at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:235)
    at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1624)
    at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:233)
    at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1440)
    at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:188)
    at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:505)
    at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1594)
    at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:186)
    at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1355)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
    at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:234)
    at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:146)
    ... 17 more
Caused by: javax.servlet.ServletException: An error occurred: Cannot invoke "org.exist.source.Source.pathOrShortIdentifier()" because the return value of "org.exist.xquery.UserDefinedFunction.getSource()" is null
    at org.exist.http.servlets.EXistServlet.doGet(EXistServlet.java:363)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:687)
    at org.exist.http.servlets.EXistServlet.service(EXistServlet.java:588)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
    at org.eclipse.jetty.servlet.ServletHolder$NotAsync.service(ServletHolder.java:1459)
    at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:799)
    at org.eclipse.jetty.servlet.ServletHandler$ChainEnd.doFilter(ServletHandler.java:1656)
    at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:552)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
    at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:618)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
    at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:235)
    at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1624)
    at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:233)
    at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1440)
    at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:188)
    at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:505)
    at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1594)
    at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:186)
    at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1355)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
    at org.eclipse.jetty.server.Dispatcher.forward(Dispatcher.java:167)
    at org.eclipse.jetty.server.Dispatcher.forward(Dispatcher.java:81)
    at org.exist.http.urlrewrite.Forward.doRewrite(Forward.java:50)
    at org.exist.http.urlrewrite.XQueryURLRewrite.doRewrite(XQueryURLRewrite.java:523)
    at org.exist.http.urlrewrite.XQueryURLRewrite.service(XQueryURLRewrite.java:340)
    ... 40 more
Caused by: java.lang.NullPointerException: Cannot invoke "org.exist.source.Source.pathOrShortIdentifier()" because the return value of "org.exist.xquery.UserDefinedFunction.getSource()" is null
    at org.exist.xquery.XPathException.addFunctionCall(XPathException.java:368)
    at org.exist.xquery.FunctionCall.evalFunction(FunctionCall.java:314)
    at org.exist.xquery.FunctionCall.eval(FunctionCall.java:207)
    at org.exist.xquery.value.FunctionReference.eval(FunctionReference.java:125)
    at org.exist.xquery.DynamicFunctionCall.eval(DynamicFunctionCall.java:116)
    at org.exist.xquery.LetExpr.eval(LetExpr.java:98)
    at org.exist.xquery.UserDefinedFunction.eval(UserDefinedFunction.java:161)
    at org.exist.xquery.FunctionCall.evalFunction(FunctionCall.java:289)
    at org.exist.xquery.FunctionCall.eval(FunctionCall.java:207)
    at org.exist.xquery.AbstractExpression.eval(AbstractExpression.java:71)
    at org.exist.xquery.PathExpr.eval(PathExpr.java:280)
    at org.exist.xquery.SequenceConstructor.eval(SequenceConstructor.java:82)
    at org.exist.xquery.UserDefinedFunction.eval(UserDefinedFunction.java:161)
    at org.exist.xquery.FunctionCall.evalFunction(FunctionCall.java:289)
    at org.exist.xquery.FunctionCall.eval(FunctionCall.java:207)
    at org.exist.xquery.AbstractExpression.eval(AbstractExpression.java:71)
    at org.exist.xquery.PathExpr.eval(PathExpr.java:280)
    at org.exist.xquery.EnclosedExpr.eval(EnclosedExpr.java:80)
    at org.exist.xquery.AbstractExpression.eval(AbstractExpression.java:71)
    at org.exist.xquery.PathExpr.eval(PathExpr.java:280)
    at org.exist.xquery.ElementConstructor.eval(ElementConstructor.java:330)
    at org.exist.xquery.AbstractExpression.eval(AbstractExpression.java:71)
    at org.exist.xquery.PathExpr.eval(PathExpr.java:280)
    at org.exist.xquery.ElementConstructor.eval(ElementConstructor.java:330)
    at org.exist.xquery.AbstractExpression.eval(AbstractExpression.java:71)
    at org.exist.xquery.PathExpr.eval(PathExpr.java:280)
    at org.exist.xquery.ElementConstructor.eval(ElementConstructor.java:330)
    at org.exist.xquery.AbstractExpression.eval(AbstractExpression.java:71)
    at org.exist.xquery.PathExpr.eval(PathExpr.java:280)
    at org.exist.xquery.AbstractExpression.eval(AbstractExpression.java:71)
    at org.exist.xquery.XQuery.execute(XQuery.java:445)
    at org.exist.xquery.XQuery.execute(XQuery.java:348)
    at org.exist.xquery.XQuery.execute(XQuery.java:335)
    at org.exist.http.RESTServer.executeXQuery(RESTServer.java:1570)
    at org.exist.http.RESTServer.doGet(RESTServer.java:528)
    at org.exist.http.servlets.EXistServlet.doGet(EXistServlet.java:324)
    ... 65 more

bansp commented 5 months ago

We are thinking of labelling this as a "wontfix", sadly.

bansp commented 4 months ago

An attempt at a save throw: let us first try to chat with @ljo about this, at some point when an opportunity arises.

daliboris commented 4 months ago

It seems that Leif-Jöran will be at XML Prague this year, and so will Jurij Leino. I'll try asking them about this issue.

bansp commented 4 months ago

Thank you, Boris, I appreciate that. I can imagine that both colleagues may not be able to allocate time to go through the whole of this by now extensive issue, but the most pressing points are summarised towards the end of it -- it is now a matter of whether it is doable to push for the code's compatibility with both the Saxon and the eXist implementations of XQuery, or whether we should abandon hope in this regard and consequently wontfix this issue.

bansp commented 3 months ago

Since this issue is real, as I can confirm every time I edit our XQuery files in oXygen (with Saxon as the validating engine), and we've spent a lot of work on testing its various nuances, I'd rather not close it. But, realistically, I should take it out of the upcoming milestone, so that's what I'm going to do.