BarryNorton / D2R-LinkedBrainz-Fork

A fork of D2R Server 0.7 for the LinkedBrainz project
http://linkedbrainz.c4dmpresents.org/
GNU General Public License v2.0

Information Service URL pattern #2

Closed zazi closed 13 years ago

zazi commented 13 years ago

The mapping from information service specific page URLs to resource URIs requires a stable URL pattern, i.e. URLs probably have to be cleaned up (e.g. www vs. non-www, country-specific top-level domains, etc.) before they can be transformed into a resource URI.

BarryNorton commented 13 years ago

Extending my comment in issues/1: I'd include all accepted forms as disjuncts and/or use more sophisticated regexes, e.g.: d2rq:condition "musicbrainz.url.url LIKE 'http://%myspace.com%'"

It's important that completely non-compliant URLs (like, sorry, "http://pissfork.net") are kicked out before the translator is called.
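To illustrate the "accepted forms as disjuncts" idea outside the mapping file, here is a minimal Java sketch of a pre-filter. It is not the D2RQ condition itself (that one is SQL). The class name `UrlPrefilter`, the method `isAccepted`, and the two-letter country subdomain alternative are assumptions made up for this example.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of D2R Server or the LinkedBrainz fork.
final class UrlPrefilter {
    private UrlPrefilter() {}

    // Accepted host forms are listed as alternatives (disjuncts):
    // no prefix, "www.", or an assumed two-letter country subdomain such as "de.".
    private static final Pattern ACCEPTED = Pattern.compile(
            "http://(?:www\\.|[a-z]{2}\\.)?myspace\\.com/.+",
            Pattern.CASE_INSENSITIVE);

    // True only if the whole URL matches one of the accepted forms.
    static boolean isAccepted(String url) {
        return url != null && ACCEPTED.matcher(url.trim()).matches();
    }
}
```

With such a check, `UrlPrefilter.isAccepted("http://pissfork.net")` is false, so that value would never reach the translator.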

zazi commented 13 years ago

I would prefer to start the clean-up from the information-service-specific Translator class, since such clean-up can be applied to every information-service-specific base URL (so we only need to call a cleanUp method with the base URL as a parameter). Every invalid URL that is left will be excluded from the URL (DB value) to URI transformation, since it won't fit the given transformation pattern (e.g. original base URI to linked data base URI).
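A rough sketch of what such a cleanUp method could look like, assuming it only has to lower-case the host and strip a leading "www.". The class name `UrlCleanup`, the second parameter `canonicalHost`, and the choice to drop query strings are assumptions for illustration; this is not the code from the linked Utils.java, and country-specific top-level domains would need an additional per-service table.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Locale;

// Hypothetical helper for illustration only.
final class UrlCleanup {
    private UrlCleanup() {}

    /**
     * Normalises rawUrl against canonicalHost (e.g. "myspace.com").
     * Returns the cleaned URL, or null if the URL cannot be parsed
     * or does not belong to the given information service.
     */
    static String cleanUp(String rawUrl, String canonicalHost) {
        if (rawUrl == null) {
            return null;
        }
        URI uri;
        try {
            uri = new URI(rawUrl.trim());
        } catch (URISyntaxException e) {
            return null; // not a parsable URL at all
        }
        String host = uri.getHost();
        if (host == null) {
            return null;
        }
        host = host.toLowerCase(Locale.ROOT);
        if (host.startsWith("www.")) {
            host = host.substring("www.".length()); // www vs. non-www
        }
        if (!host.equals(canonicalHost)) {
            return null; // some other site; nothing to transform
        }
        String path = uri.getRawPath();
        if (path == null || path.isEmpty()) {
            path = "/";
        }
        // Assumption: query string and fragment are not part of the stable pattern.
        return "http://" + canonicalHost + path;
    }
}
```

A caller would pass the database value and the service's base host, e.g. `UrlCleanup.cleanUp(dbValue, "myspace.com")`, and treat a null result as "does not fit the pattern".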

BarryNorton commented 13 years ago

But doesn't the Translator class have to produce a result?

Or is there an exception mechanism that will let the query as a whole proceed?

There's simply nothing that can be done with http://pissfork.net

zazi commented 13 years ago

Yes, I thought that I could exclude invalid URLs by returning 'null' from the translation method. However, this doesn't seem to work, so we have to define another validation condition for the mapping, as a URI regex.

zazi commented 13 years ago

It seems that I got it to work (by returning 'null' from the translation method). The previous error was caused by my clean-up method.

Please check, with a fresh D2RS-LinkedBrainz jar.
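For readers following along, a minimal sketch of a translator that returns null for values it cannot map. The `Translator` interface below is a local stand-in modeled on D2RQ's translator interface; its exact package and signatures should be checked against the D2R Server 0.7 source. `MySpaceTranslator` and the target base `http://example.org/myspace/` are hypothetical.

```java
// Local stand-in for D2RQ's translator interface (assumed shape).
interface Translator {
    String toRDFValue(String dbValue);
    String toDBValue(String rdfValue);
}

// Hypothetical translator for one information service.
final class MySpaceTranslator implements Translator {
    private static final String SOURCE_BASE = "http://myspace.com/";
    // Assumed linked data base URI, for illustration only.
    private static final String TARGET_BASE = "http://example.org/myspace/";

    public String toRDFValue(String dbValue) {
        if (dbValue == null) {
            return null;
        }
        String url = dbValue.trim();
        // Minimal clean-up: treat the www and non-www forms alike.
        if (url.startsWith("http://www.")) {
            url = "http://" + url.substring("http://www.".length());
        }
        // Anything that does not fit the pattern is excluded by returning null.
        if (!url.startsWith(SOURCE_BASE) || url.length() == SOURCE_BASE.length()) {
            return null;
        }
        return TARGET_BASE + url.substring(SOURCE_BASE.length());
    }

    public String toDBValue(String rdfValue) {
        if (rdfValue == null || !rdfValue.startsWith(TARGET_BASE)) {
            return null;
        }
        return SOURCE_BASE + rdfValue.substring(TARGET_BASE.length());
    }
}
```

The design choice here is that the translator itself decides what is valid, so the mapping file needs no separate regex condition per service.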

zazi commented 13 years ago

Rechecked? (I'm closing this issue for now.) The clean-up method for the base URIs can be found at https://github.com/zazi/D2RS-LinkedBrainz/blob/master/D2RS-LinkedBrainz/src/main/java/translators/util/Utils.java ;)