rjrudin closed 3 years ago
Good catch. Here's a guess as to what's going on.
Starting in 10.0-5, the Java API converts the query to a cts.query once during initialization instead of on every request.
In the com.marklogic.client.datamovement.impl.QueryBatcherImpl#QueryBatcherImpl() constructor on line 99, the cts.query serialization is captured.
Somewhere, the conversion to the cts.query (possibly within the REST API internal endpoint) loses the namespace binding.
Based on investigation...
Initialization converts the Search API representation of a path range query to the cts representation, which is serialized to JSON before returning to the client.
A cts.pathRangeQuery() doesn't take namespace declarations, so it serializes to JSON without namespace declarations.
By contrast, a cts.pathReference() does take namespace declarations, which are serialized to JSON.
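To make the contrast concrete, here is a toy model in Java. It does not call the MarkLogic API. The class name, method names, namespace URI, and JSON shapes are all hypothetical and are not the server's actual serialization. It only illustrates the general point: a representation with no slot for namespace bindings cannot carry them through serialization, while one with such a slot can.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Toy model only: names and JSON shapes are hypothetical, not MarkLogic's wire format.
class NamespaceLossSketch {

    // Stand-in for a query form that carries only the path expression.
    // (No JSON escaping is done; the sample path contains no quotes.)
    static String serializePathOnly(String pathExpression) {
        return "{\"pathExpression\":\"" + pathExpression + "\"}";
    }

    // Stand-in for a reference form that also carries prefix-to-URI bindings.
    static String serializeWithBindings(String pathExpression, Map<String, String> namespaces) {
        String ns = namespaces.entrySet().stream()
            .map(e -> "\"" + e.getKey() + "\":\"" + e.getValue() + "\"")
            .collect(Collectors.joining(","));
        return "{\"pathExpression\":\"" + pathExpression + "\",\"namespaces\":{" + ns + "}}";
    }

    public static void main(String[] args) {
        Map<String, String> namespaces = new LinkedHashMap<>();
        namespaces.put("nst", "urn:example:nst"); // hypothetical namespace URI
        String path = "/root/nst:dateTime";

        // The prefix "nst" survives in the path, but its URI binding is gone.
        System.out.println(serializePathOnly(path));
        // Here the binding travels with the path.
        System.out.println(serializeWithBindings(path, namespaces));
    }
}
```

In the first form, whoever evaluates the serialized query sees the prefix `nst` with nothing to resolve it against, which is consistent with the "prefix not recognized" error reported in this issue.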
A cts.rangeQuery() takes a cts.pathReference(), so one way to fix the issue would be to modify the conversion from the Search API representation to the cts representation in this case. That approach, however, would risk introducing a backward incompatibility on a stable component.
Another way to solve the problem would be to serialize to XML if namespaces are used and to JSON otherwise. That approach, however, would add complexity to both the interface and implementation of the REST API.
An expedient solution is to use the original query for a structured query builder query with namespaces or for a raw query in XML format. The optimization will be skipped for such queries.
The fix also skips the optimization and uses the original query if the query refers to persisted options.
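The decision described in the last two comments might be sketched roughly as follows. This is not the actual QueryBatcherImpl code: the class, enum, and method names are hypothetical. It also assumes that raw XML queries always skip the optimization whether or not they declare namespaces, which is one reading of the description above.

```java
// Hypothetical sketch of the gating logic; not the actual Java Client API code.
class OptimizationGate {

    // Illustrative categories of query a batcher might receive.
    enum QueryKind { STRUCTURED_BUILDER, RAW_XML, RAW_JSON, OTHER }

    // Returns true when the cts.query captured at initialization may be reused,
    // false when the original query should be sent instead.
    static boolean canUsePreconvertedQuery(QueryKind kind,
                                           boolean usesNamespaces,
                                           boolean usesPersistedOptions) {
        if (usesPersistedOptions) {
            return false; // queries referring to persisted options keep the original query
        }
        if (kind == QueryKind.RAW_XML) {
            return false; // assumption: raw XML always skips the optimization
        }
        if (kind == QueryKind.STRUCTURED_BUILDER && usesNamespaces) {
            return false; // bindings would be lost in the JSON cts serialization
        }
        return true;
    }

    public static void main(String[] args) {
        // The case from this issue: a builder query with a namespaced path.
        System.out.println(canUsePreconvertedQuery(QueryKind.STRUCTURED_BUILDER, true, false));
        // A builder query without namespaces can still use the optimization.
        System.out.println(canUsePreconvertedQuery(QueryKind.STRUCTURED_BUILDER, false, false));
    }
}
```

The trade-off is that affected queries lose the performance benefit of the one-time conversion, but they return correct results without any change to the REST API or the Search-to-cts conversion.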
If the functional tests have a path range index with a namespace, a good functional test would use a query batcher to get some results.
So we can address your issue, please include the following:
Version of MarkLogic Java Client API
5.3.2
Version of MarkLogic Server
10.0-5
Java version
Java 8 and 11
OS and version
N/A
Input: Some code to illustrate the problem, preferably in a state that can be independently reproduced on our end
Below is a sample program to expose the bug. I have a path range index set up correctly on "/root/nst:dateTime" with "nst" declared as a path namespace in the database. And I have 6 documents that match the query (this is all from a marklogic-nifi test). Using queryManager.search, I get back the expected 6 documents. Using QueryBatcher, I get an error due to the namespace prefix not being recognized.
Actual output: What did you observe? What errors did you see? Can you attach the logs? (Java logs, MarkLogic logs)
Here's the output of the queryManager.search (just a snippet to verify I get data back):
And here's the error I got from using QueryBatcher:
Expected output: What specifically did you expect to happen?
I expected QueryBatcher to find the same 6 documents
Alternatives: What else have you tried, actual/expected?
No workaround that I can find.