ogrisel / pignlproc

Apache Pig utilities to build training corpora for machine learning / NLP out of public Wikipedia and DBpedia dumps.

Could not instantiate 'pignlproc.storage.UriUriNTriplesLoader' #9

Closed by frankscholten 9 years ago

frankscholten commented 12 years ago

I am using Pig 0.8 and the latest pignlproc.

The 02_dbpedia_article_types.pig script calls a single-argument constructor of UriUriNTriplesLoader:

USING pignlproc.storage.UriUriNTriplesLoader('http://xmlns.com/foaf/0.1/primaryTopic')

but the constructor in the current version takes three arguments:

public UriUriNTriplesLoader(String propertyUri, String subjectNamespace, String objectNamespace)

so I get the error

could not instantiate 'pignlproc.storage.UriUriNTriplesLoader' with arguments '[http://xmlns.com/foaf/0.1/primaryTopic]'

ogrisel commented 12 years ago

Indeed. I don't plan to work on this in the short term though. I would still be glad to merge any pull request if you need it yourself :)

frankscholten commented 12 years ago

I would like to submit a pull request for this, but I am not that familiar with the RDF specifications. How do I determine the correct subject and object namespace values to pass to the constructor in this script?

ogrisel commented 12 years ago

They are just URI prefixes to truncate (e.g. "http://dbpedia.org/resource/" in "http://dbpedia.org/resource/Paris"). This avoids carrying redundant bytes around everywhere and makes intermediate data dumped in an interactive Pig session readable, rather than a URI soup.
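
As a rough illustration of the prefix truncation described above, here is a minimal Java sketch. It is not the actual pignlproc code: the class name `UriPrefixSketch` and the method `stripNamespace` are hypothetical, and leaving non-matching URIs unchanged is an assumption made for this example, not a statement about what the loader does.

```java
// Hypothetical helper, NOT part of pignlproc: it only illustrates
// what "a URI prefix to truncate" means for a namespace argument.
final class UriPrefixSketch {

    // Returns the URI with the namespace prefix removed. If the namespace
    // is null or the URI does not start with it, the URI is returned
    // unchanged (an assumption made for this sketch).
    static String stripNamespace(String uri, String namespace) {
        if (namespace != null && uri.startsWith(namespace)) {
            return uri.substring(namespace.length());
        }
        return uri;
    }

    public static void main(String[] args) {
        String namespace = "http://dbpedia.org/resource/";
        // prints "Paris"
        System.out.println(
            stripNamespace("http://dbpedia.org/resource/Paris", namespace));
    }
}
```

With the prefix stripped, each record carries only the short local name ("Paris") instead of the full URI, which is what keeps dumped intermediate data readable.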