Closed florent-andre closed 2 years ago
Hi! Thanks for reaching out and using our tools!
Unfortunately, this is a long-standing issue that we haven't been able to resolve properly. In the predecessor of the rmlmapper-java, we had 2 issues about this:
but without a proper resolution. If you have any feedback on how to resolve this properly (in the mapping rules, using a CLI parameter, etc.), feel free to comment below! We would love to have some feedback on this.
Hmm... maybe extract the source's xmlns declarations and reuse them in the XPath call?
This requires well-formed XML, but that's the minimum...
I don't know how configurable the XPath interpreter is, but passing the source's xmlns declarations should be doable.
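To make the idea concrete, here is a minimal sketch using only the JDK's built-in XML APIs. It assumes the namespace declarations sit on the root element; `SourceNamespaces` and its methods are hypothetical names for illustration and are not part of rmlmapper-java.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Collections;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;

// Hypothetical helper, not part of rmlmapper-java.
final class SourceNamespaces {

    // Collect prefix -> URI pairs from the xmlns:* attributes of the root element.
    // Assumption: all prefixes are declared on the root element.
    static Map<String, String> collect(Document document) {
        Map<String, String> prefixes = new HashMap<>();
        NamedNodeMap attributes = document.getDocumentElement().getAttributes();
        for (int i = 0; i < attributes.getLength(); i++) {
            Attr attribute = (Attr) attributes.item(i);
            String name = attribute.getName();
            if (name.startsWith("xmlns:")) {
                prefixes.put(name.substring("xmlns:".length()), attribute.getValue());
            }
        }
        return prefixes;
    }

    // Expose the collected prefixes to the XPath engine.
    static NamespaceContext asContext(Map<String, String> prefixes) {
        return new NamespaceContext() {
            @Override
            public String getNamespaceURI(String prefix) {
                return prefixes.getOrDefault(prefix, XMLConstants.NULL_NS_URI);
            }

            @Override
            public String getPrefix(String namespaceURI) {
                for (Map.Entry<String, String> entry : prefixes.entrySet()) {
                    if (entry.getValue().equals(namespaceURI)) {
                        return entry.getKey();
                    }
                }
                return null;
            }

            @Override
            public Iterator<String> getPrefixes(String namespaceURI) {
                String prefix = getPrefix(namespaceURI);
                return prefix == null
                        ? Collections.<String>emptyIterator()
                        : Collections.singleton(prefix).iterator();
            }
        };
    }

    // Parse the XML, reuse its own prefixes, and evaluate the XPath expression as a string.
    static String evaluate(String xml, String expression) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document document = factory.newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        XPath xPath = XPathFactory.newInstance().newXPath();
        xPath.setNamespaceContext(asContext(collect(document)));
        return xPath.evaluate(expression, document);
    }
}
```

With this, an expression such as `/xsd:schema/xsd:element/@name` can use the same `xsd` prefix that the source file declares, without repeating the declaration in the mapping.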
I think it gives a better "mapping man" experience than the declarative approach that the CARML implementation seems to use:
carml:declaresNamespace [
  carml:namespacePrefix "edxl-cap" ;
  carml:namespaceName "http://release.niem.gov/niem/adapters/edxl-cap/3.0/" ;
]
Linked to kg-construct/rml-fno-spec#9
Hmm... maybe extract the source's xmlns declarations and reuse them in the XPath call?
That might be a way to work around this problem. We always welcome PRs to help out!
In the meantime, I brought this to the attention of the W3C Community Group that works on RML and other mapping languages, with the goal of a standard like R2RML for transforming heterogeneous data into RDF; see kg-construct/rml-target-source-spec#4
Hi, I can try to have a look, but it has been a long time since I last wrote Java, so any guidance on the classes involved will be appreciated.
Hi @florent-andre
Sure! Happy to assist you :)
To extend the XPath extractor, you probably want to look at the getDocumentFromStream method of XMLRecordFactory.
There you can configure the DocumentBuilderFactory.
You can also read the InputStream argument there already to look for XML namespaces.
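For reference, here is a small sketch of what configuring the DocumentBuilderFactory changes. It is not the actual getDocumentFromStream code; `NamespaceAwareParsing` and `rootNamespace` are hypothetical names. The JDK parser is not namespace-aware by default, so the flag has to be set explicitly before the DOM reports namespace URIs.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical illustration, not the rmlmapper-java implementation.
final class NamespaceAwareParsing {

    // Parse a stream with namespace awareness switched on or off.
    static Document parse(InputStream stream, boolean namespaceAware) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(namespaceAware);
        return factory.newDocumentBuilder().parse(stream);
    }

    // Return the namespace URI the DOM reports for the root element (null if none).
    static String rootNamespace(String xml, boolean namespaceAware) throws Exception {
        InputStream stream = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
        return parse(stream, namespaceAware).getDocumentElement().getNamespaceURI();
    }
}
```

With `namespaceAware` set to `false`, the root element carries no namespace URI, so a prefixed XPath step has nothing to match against.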
@DylanVanAssche please find a PR for solving namespaced xpath.
Please note that the fix works for fully namespaced trees.
If the XML mixes namespaced and non-namespaced elements, this still needs to be explored. See this document for details: "even the default namespace is a namespace, and thus matching names have to be prefixed in XPath".
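The quoted point about the default namespace can be checked with a short sketch. The XML, the namespace URI `http://example.org/catalog`, the prefix `c`, and the class name are all made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;

// Hypothetical demo: elements in a default namespace need a prefix in XPath 1.0.
final class DefaultNamespaceDemo {
    static final String URI = "http://example.org/catalog";
    static final String XML = "<catalog xmlns=\"" + URI + "\"><item>book</item></catalog>";

    static String evaluate(String expression) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document document = factory.newDocumentBuilder()
                .parse(new ByteArrayInputStream(XML.getBytes(StandardCharsets.UTF_8)));

        XPath xPath = XPathFactory.newInstance().newXPath();
        // Bind the arbitrary prefix "c" to the document's default namespace.
        xPath.setNamespaceContext(new NamespaceContext() {
            @Override
            public String getNamespaceURI(String prefix) {
                return "c".equals(prefix) ? URI : XMLConstants.NULL_NS_URI;
            }

            @Override
            public String getPrefix(String namespaceURI) {
                return URI.equals(namespaceURI) ? "c" : null;
            }

            @Override
            public Iterator<String> getPrefixes(String namespaceURI) {
                return URI.equals(namespaceURI)
                        ? Collections.singleton("c").iterator()
                        : Collections.<String>emptyIterator();
            }
        });
        return xPath.evaluate(expression, document);
    }
}
```

Here `evaluate("/catalog/item")` yields an empty string, because unprefixed names in XPath 1.0 only match elements in no namespace, while `evaluate("/c:catalog/c:item")` yields `book`.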
Another remark:
What do you think about creating an xPathSingleton to provide the xPath object and not create multiple instances of it in XMLRecord and XMLRecordFactory:
XPath xPath = XPathFactory.newInstance().newXPath();
xPath.setNamespaceContext(new NamespaceResolver(document));
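The NamespaceResolver class itself is not shown in this thread. One plausible shape, given here only as an assumption and not as the code from the PR, delegates to the DOM's own namespace lookup; `NamespaceResolverDemo` is a hypothetical helper added to exercise it.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;

// Assumed sketch of a resolver backed by the parsed document; not the PR's code.
final class NamespaceResolver implements NamespaceContext {
    private final Document document;

    NamespaceResolver(Document document) {
        this.document = document;
    }

    @Override
    public String getNamespaceURI(String prefix) {
        if (prefix == null) {
            throw new IllegalArgumentException("prefix must not be null");
        }
        // The DOM uses a null prefix to mean the default namespace.
        String lookup = prefix.equals(XMLConstants.DEFAULT_NS_PREFIX) ? null : prefix;
        String uri = document.lookupNamespaceURI(lookup);
        return uri == null ? XMLConstants.NULL_NS_URI : uri;
    }

    @Override
    public String getPrefix(String namespaceURI) {
        return document.lookupPrefix(namespaceURI);
    }

    @Override
    public Iterator<String> getPrefixes(String namespaceURI) {
        String prefix = getPrefix(namespaceURI);
        return prefix == null
                ? Collections.<String>emptyIterator()
                : Collections.singleton(prefix).iterator();
    }
}

// Hypothetical helper that wires the resolver into an XPath evaluation.
final class NamespaceResolverDemo {
    static String evaluate(String xml, String expression) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // required for the DOM namespace lookup to work
        Document document = factory.newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        XPath xPath = XPathFactory.newInstance().newXPath();
        xPath.setNamespaceContext(new NamespaceResolver(document));
        return xPath.evaluate(expression, document);
    }
}
```

Because the lookup starts at the document element, this sketch only sees prefixes that are in scope on the root, which matches the "fully namespaced tree" limitation mentioned above.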
@florent-andre Thanks for the PR! I will have a look next week :)
If the XML mixes namespaced and non-namespaced elements, this still needs to be explored. See this document for details: "even the default namespace is a namespace, and thus matching names have to be prefixed in XPath".
I'm not that familiar with XML namespaces, but I think this PR is a good start in general; we can just add a TODO comment noting that this case has not been explored.
What do you think about creating an xPathSingleton to provide the xPath object and not create multiple instances of it in XMLRecord and XMLRecordFactory:
That would actually be better I think... Feel free to try it :) As long as the test cases still pass after this change it is fine.
Regarding avoiding creating multiple xPath objects, I would strongly advise against using the Singleton pattern, especially because it complicates testing.
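One way to share an xPath object without a Singleton is to pass it in through the constructor. The sketch below uses hypothetical stand-in classes (`RecordSketch`, `RecordFactorySketch`), not the real XMLRecord and XMLRecordFactory, whose constructors are not shown in this thread.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;

// Hypothetical stand-in for a record: it receives its XPath object instead of creating one.
final class RecordSketch {
    private final Document document;
    private final XPath xPath;

    RecordSketch(Document document, XPath xPath) {
        this.document = document;
        this.xPath = xPath;
    }

    String get(String expression) throws XPathExpressionException {
        return xPath.evaluate(expression, document);
    }
}

// Hypothetical stand-in for a record factory: it creates one XPath per document and hands it on.
final class RecordFactorySketch {
    private final XPathFactory xPathFactory;

    // The XPathFactory is injected, so a test can supply its own.
    RecordFactorySketch(XPathFactory xPathFactory) {
        this.xPathFactory = xPathFactory;
    }

    RecordSketch fromXml(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document document = factory.newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        XPath xPath = xPathFactory.newXPath();
        return new RecordSketch(document, xPath);
    }
}
```

Each record then reuses the object it was given, and no global state has to be reset between test cases.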
I get your point about the Singleton. The current PR doesn't implement a Singleton, and the "nondependants tests" pass.
As this PR was merged, I am closing this issue. Thanks, guys, for building and maintaining this lib!
Hello, First, thanks for this promising tool set. I hope I am sending this question through the right channel and to the right repository.
I am trying to map an XML file with namespaced nodes (an xsd type file). When I remove the namespaces from my source file, the test triples are generated. But when I restore the namespaces in the XML file and add the xsd: prefix to the XPath expressions, I get an empty set of triples. As I could not find any example of parsing "XML with namespaces", I am wondering how I can do that.
Here is the example I am trying to tackle; it can be added to Matey. Thanks for your help, regards