fadmaa / grefine-rdf-extension

An extension to Google Refine that enables graphical mapping of Google Refine project data to an RDF skeleton and then exporting it in RDF format
http://refine.deri.ie

Some vocabularies not imported #76

Open sparkica opened 11 years ago

sparkica commented 11 years ago

Vocabularies that do not use rdfs:Class (and rdf:Property) are not imported. For example, the DBpedia ontology (http://dbpedia.org/ontology/) is fetched from http://dbpedia.org/data3/.n3, which contains triples like these (no Class or Property):

    ...
    dbpedia-owl:sisterStation rdfs:isDefinedBy dbpedia-owl: .
    dbpedia-owl:runtime rdfs:isDefinedBy dbpedia-owl: .
    ...

If the ontology is downloaded from http://wiki.dbpedia.org/Downloads37 and uploaded from a file, classes and properties are imported correctly.
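To illustrate why triples like the ones reported above yield nothing, here is a minimal sketch. It assumes the importer only recognizes terms that carry an explicit `rdf:type` declaration, as the report suggests; it is not the extension's actual import code, and the `ex:` triples are hypothetical, added only for contrast.

```python
# Illustrative only: not the extension's actual import code.
RDF_TYPE = "rdf:type"
TERM_TYPES = {"rdfs:Class", "rdf:Property"}  # the only declarations this sketch recognizes


def declared_terms(triples):
    """Return the subjects explicitly typed as rdfs:Class or rdf:Property."""
    return sorted({s for s, p, o in triples if p == RDF_TYPE and o in TERM_TYPES})


# Shape of the data served at http://dbpedia.org/data3/.n3 (taken from the report).
dereferenced = [
    ("dbpedia-owl:sisterStation", "rdfs:isDefinedBy", "dbpedia-owl:"),
    ("dbpedia-owl:runtime", "rdfs:isDefinedBy", "dbpedia-owl:"),
]

# Hypothetical triples with explicit declarations, for contrast.
declared = [
    ("ex:Film", "rdf:type", "rdfs:Class"),
    ("ex:runtime", "rdf:type", "rdf:Property"),
]

print(declared_terms(dereferenced))  # []
print(declared_terms(declared))      # ['ex:Film', 'ex:runtime']
```

With only `rdfs:isDefinedBy` statements there is nothing to index, which would match the "no classes or properties" outcome described here.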

frodeseverin commented 11 years ago

I get the same result when importing the xsd entry in the file predefined_vocabularies.tsv.

The same is true for schema.org, whether imported from http://schema.org/docs/schemaorg.owl or http://schema.rdfs.org/all.rdf.

Importing using the GUI works fine.

sparkica commented 11 years ago

I'll look into it and try to find a solution using your examples. If you find more of them, let me know. We'll figure something out.

frodeseverin commented 11 years ago

Sorry, the xsd prefix does not seem to work in the GUI either. I placed the two auxiliary DTDs in the binary directory, to no avail.

The prefix schema works fine, though.

frodeseverin commented 11 years ago

Note that I do not need the xsd prefix for the time being. It is however present in the predefined vocabularies by default.

sparkica commented 11 years ago

Thanks! Well, the thing is that before I updated the extension, prefixes were added to the manager whether or not they were successfully imported and indexed. After the update, if no classes and properties are imported, an error message is shown and the vocabulary is not added to the manager, because nothing can be suggested. But now even some predefined vocabularies return an error on import and cannot be used.

I assume that the library responsible for dereferencing prefixes is outdated and has some old mappings, e.g. mapping http://dbpedia.org/ontology to http://dbpedia.org/data3/.n3, where no classes/properties can be found. I was thinking about adding an import option, something like a 'force add' prefix, that would add the prefix even if an error occurs during import, so that users can still use it in RDF export. What do you think?
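The 'force add' idea could be sketched roughly as follows. Everything here is hypothetical (`PrefixManager`, `add_prefix`, the `force` flag and the error class are illustration names, not the extension's API); it only shows the proposed behaviour of rejecting an empty import by default and registering the prefix anyway when forced.

```python
class VocabularyImportError(Exception):
    """Raised when a vocabulary yields no classes or properties."""


class PrefixManager:
    """Hypothetical stand-in for the extension's prefix manager."""

    def __init__(self):
        self.prefixes = {}    # prefix -> namespace URI
        self.searchable = {}  # prefix -> terms available for suggestions

    def add_prefix(self, prefix, uri, terms, force=False):
        # `terms` stands for whatever the import step managed to index.
        if not terms and not force:
            raise VocabularyImportError(
                f"No classes or properties have been indexed for {prefix}"
            )
        self.prefixes[prefix] = uri
        self.searchable[prefix] = list(terms)
        return bool(terms)  # True only if suggestions are available


manager = PrefixManager()
try:
    manager.add_prefix("dbpedia-owl", "http://dbpedia.org/ontology/", terms=[])
except VocabularyImportError as err:
    print("rejected:", err)

# With force=True the prefix is registered for export, just without suggestions.
manager.add_prefix("dbpedia-owl", "http://dbpedia.org/ontology/", terms=[], force=True)
print(manager.prefixes)
```

Returning a flag instead of raising lets a caller warn the user that the prefix was added without any searchable terms.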

frodeseverin commented 11 years ago

The automatic import seems to be managed by any23. Perhaps this library needs to be updated in the extension?

Manual import seems to be handled by org.openrdf.rio.

Manual import of the xsd prefix first complains about some missing DTDs. I downloaded these manually, but I still get errors. Full call trace included below.

I suppose these things should be handled separately. Do you know why there are two different libraries in use for importing RDF, i.e. any23 and org.openrdf.rio?

Anyway, I think we should try to figure this out. I can help with testing, but I am not tenured in Java code writing.

;)Frode

Call trace for adding xsd in the GUI by means of file upload, after providing the missing DTDs in the OpenRefine root folder.

09:27:15.371 [                   refine] POST /command/rdf-extension/upload-file-add-prefix (238787083ms)
09:27:15.393 [                  command] Error importing vocabulary from file: unqualified attribute 'targetNamespace' not allowed [line 68, column 251] (22ms)
09:27:15.442 [                  project] Loaded project 1838863295629 from disk in 0 sec(s) (49ms)
09:27:15.442 [                  command] Exception caught (0ms)
org.openrdf.rio.RDFParseException: unqualified attribute 'targetNamespace' not allowed [line 68, column 251]
    at org.openrdf.rio.helpers.RDFParserBase.reportError(RDFParserBase.java:464)
    at org.openrdf.rio.rdfxml.RDFXMLParser.reportError(RDFXMLParser.java:1036)
    at org.openrdf.rio.rdfxml.SAXFilter.checkAndCopyAttributes(SAXFilter.java:480)
    at org.openrdf.rio.rdfxml.SAXFilter.startElement(SAXFilter.java:268)
    at org.apache.xerces.parsers.AbstractSAXParser.startElement(Unknown Source)
    at org.apache.xerces.impl.dtd.XMLDTDValidator.startElement(Unknown Source)
    at org.apache.xerces.impl.XMLNSDocumentScannerImpl.scanStartElement(Unknown Source)
    at org.apache.xerces.impl.XMLNSDocumentScannerImpl$NSContentDispatcher.scanRootElementHook(Unknown Source)
    at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source)
    at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
    at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
    at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
    at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
    at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
    at org.apache.xerces.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source)
    at org.openrdf.rio.rdfxml.RDFXMLParser.parse(RDFXMLParser.java:260)
    at org.openrdf.rio.rdfxml.RDFXMLParser.parse(RDFXMLParser.java:209)
    at org.openrdf.repository.base.RepositoryConnectionBase.addInputStreamOrReader(RepositoryConnectionBase.java:388)
    at org.openrdf.repository.base.RepositoryConnectionBase.add(RepositoryConnectionBase.java:277)
    at org.deri.grefine.rdf.commands.AddPrefixFromFileCommand.doPost(AddPrefixFromFileCommand.java:77)
    at com.google.refine.RefineServlet.service(RefineServlet.java:179)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
    at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1166)
    at org.mortbay.servlet.UserAgentFilter.doFilter(UserAgentFilter.java:81)
    at org.mortbay.servlet.GzipFilter.doFilter(GzipFilter.java:132)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1157)
    at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:388)
    at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
    at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
    at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:765)
    at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:418)
    at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
    at org.mortbay.jetty.Server.handle(Server.java:326)
    at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
    at org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:938)
    at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:755)
    at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:218)
    at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
    at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:228)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
    at java.lang.Thread.run(Thread.java:722)
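One possible reading of the `targetNamespace` error above: `targetNamespace` is an attribute of the XML Schema `schema` element, so the uploaded file may be an XML Schema document rather than RDF/XML. That is an assumption about the file, not something the trace confirms. A minimal sketch of a pre-check that looks at the root element before handing a file to an RDF/XML parser (the helper names and sample documents are hypothetical):

```python
import xml.etree.ElementTree as ET

XSD_NS = "http://www.w3.org/2001/XMLSchema"
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def root_tag(xml_text):
    """Return the namespace-qualified tag of the root element."""
    return ET.fromstring(xml_text).tag


def looks_like_xml_schema(xml_text):
    """True if the root element is the XML Schema <schema> element."""
    return root_tag(xml_text) == f"{{{XSD_NS}}}schema"


# Hypothetical minimal documents, for illustration only.
xsd_doc = (
    '<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" '
    'targetNamespace="http://example.org/ns"/>'
)
rdf_doc = '<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"/>'

print(looks_like_xml_schema(xsd_doc))  # True
print(looks_like_xml_schema(rdf_doc))  # False
```

This is only a heuristic on small, self-contained inputs; a real schema file with a DOCTYPE and entity references may need a more tolerant parser.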

Call trace for automatic import of xsd, after providing the missing DTDs in the OpenRefine root folder.

======================= Configuration Properties =======================
any23.http.client.max.connections=5
any23.extraction.metadata.timesize=on
any23.rdfa.extractor.xslt=rdfa.xslt
any23.extraction.csv.comment=#
any23.extraction.head.meta=off
any23.extraction.csv.field=,
any23.microdata.strict=off
any23.http.client.timeout=10000
any23.extraction.metadata.nesting=on
any23.core.version=0.6.1
any23.http.user.agent.default=Any23-CLI
any23.extraction.metadata.domain.per.entity=on
any23.plugin.dirs=./plugins
any23.microdata.ns.default=http://rdf.data-vocabulary.org/
========================================================================
 (1402ms)
09:44:09.986 [..y23.rdf.PopularPrefixes] Loading prefixes from /org/deri/any23/prefixes/prefixes.properties (242ms)
09:44:10.379 [..ingleDocumentExtraction] Processing http://www.w3.org/1999/02/22-rdf-syntax-ns (393ms)
09:44:11.043 [..ingleDocumentExtraction] Processing http://www.w3.org/2000/01/rdf-schema (664ms)
09:44:11.559 [..ingleDocumentExtraction] Processing http://www.w3.org/2002/07/owl (516ms)
09:44:12.841 [..ingleDocumentExtraction] Processing http://xmlns.com/foaf/spec/index.rdf (1282ms)
09:44:13.419 [..ingleDocumentExtraction] Processing http://www.w3.org/2001/XMLSchema (578ms)
09:44:14.612 [..ined_vocabulary_manager] unable to add predefined vocabularies (1193ms)
org.deri.grefine.rdf.vocab.VocabularyImportException: Error importing vocabulary at provided URI.
    at org.deri.grefine.rdf.vocab.imp.VocabularySearcher.importAndIndexVocabulary(VocabularySearcher.java:90)
    at org.deri.grefine.rdf.vocab.imp.VocabularySearcher.importAndIndexVocabulary(VocabularySearcher.java:78)
    at org.deri.grefine.rdf.vocab.imp.PredefinedVocabularyManager.addPredefinedVocabularies(PredefinedVocabularyManager.java:71)
    at org.deri.grefine.rdf.vocab.imp.PredefinedVocabularyManager.<init>(PredefinedVocabularyManager.java:46)
    at org.deri.grefine.rdf.app.ApplicationContext.init(ApplicationContext.java:33)
    at org.deri.grefine.rdf.app.InitilizationCommand.initRdfExportApplicationContext(InitilizationCommand.java:38)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:601)
    at org.mozilla.javascript.MemberBox.invoke(MemberBox.java:161)
    at org.mozilla.javascript.NativeJavaMethod.call(NativeJavaMethod.java:247)
    at org.mozilla.javascript.optimizer.OptRuntime.call1(OptRuntime.java:66)
    at org.mozilla.javascript.gen.c2._c4(file:/home/frodesh/bin/LODOpenRefine/main/webapp/../../extensions/rdf-extension/module/MOD-INF/controller.js:91)
    at org.mozilla.javascript.gen.c2.call(file:/home/frodesh/bin/LODOpenRefine/main/webapp/../../extensions/rdf-extension/module/MOD-INF/controller.js)
    at org.mozilla.javascript.ContextFactory.doTopCall(ContextFactory.java:398)
    at org.mozilla.javascript.ScriptRuntime.doTopCall(ScriptRuntime.java:3065)
    at org.mozilla.javascript.gen.c2.call(file:/home/frodesh/bin/LODOpenRefine/main/webapp/../../extensions/rdf-extension/module/MOD-INF/controller.js)
    at edu.mit.simile.butterfly.ButterflyModuleImpl.scriptInit(ButterflyModuleImpl.java:636)
    at edu.mit.simile.butterfly.ButterflyModuleImpl.init(ButterflyModuleImpl.java:94)
    at edu.mit.simile.butterfly.Butterfly.initializeModule(Butterfly.java:476)
    at edu.mit.simile.butterfly.Butterfly.configure(Butterfly.java:451)
    at edu.mit.simile.butterfly.Butterfly.init(Butterfly.java:308)
    at org.mortbay.jetty.servlet.ServletHolder.initServlet(ServletHolder.java:440)
    at org.mortbay.jetty.servlet.ServletHolder.doStart(ServletHolder.java:263)
    at com.google.refine.RefineServer.configure(Refine.java:290)
    at com.google.refine.RefineServer.init(Refine.java:204)
    at com.google.refine.Refine.init(Refine.java:116)
    at com.google.refine.Refine.main(Refine.java:110)
Caused by: java.lang.Throwable: No classes or properties have been indexed.
    ... 29 more

MattGyverLee commented 10 years ago

I think I found another instance of this bug. When I try to add the "gold" prefix, it correctly gives the URI "http://purl.org/linguistics/gold/", but after appearing to load, the only entry that can be found in search is "gold:" itself. Adding other prefixes, such as "dc", works perfectly.

Please test this on your end. OpenRefine 2.5, rdf-extension 0.8.0, Windows 7.

If the problem is the format of the ontology file, I can talk to the developers of the ontology.

Also, how do I import the OWL file manually until this is fixed?