SmartDataAnalytics / SemWeb2NL

Semantic Web related concepts converted to Natural language
GNU General Public License v3.0

[AVATAR] Trying to get verbalization for Leipzig Botanical Garden throws #8

Closed yamalight closed 8 years ago

yamalight commented 8 years ago

Trying to get verbalization for http://dbpedia.org/resource/Leipzig_Botanical_Garden with DBpedia endpoint throws the following error:

java.util.NoSuchElementException
    at java.util.TreeMap.key(TreeMap.java:1323)
    at java.util.TreeMap.firstKey(TreeMap.java:290)
    at java.util.TreeSet.first(TreeSet.java:394)
    at org.aksw.avatar.Verbalizer.getMostSpecificType(Verbalizer.java:580)
    at org.aksw.avatar.Verbalizer.summarize(Verbalizer.java:504)
    at AppKt$main$1.handle(App.kt:18)
    at AppKt$main$1.handle(App.kt)
    at spark.RouteImpl$1.handle(RouteImpl.java:58)
    at spark.webserver.MatcherFilter.doFilter(MatcherFilter.java:162)
    at spark.webserver.JettyHandler.doHandle(JettyHandler.java:61)
    at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:189)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:119)
    at org.eclipse.jetty.server.Server.handle(Server.java:517)
    at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:302)
    at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:242)
    at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:245)
    at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)
    at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:75)
    at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceAndRun(ExecuteProduceConsume.java:246)
    at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:156)
    at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:654)
    at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:572)
    at java.lang.Thread.run(Thread.java:745)

Error seems to be 100% reproducible.

ngonga commented 8 years ago

Please use resources and not pages :)

On 11/04/16 16:59, Tim Ermilov wrote:

Trying to get verbalization for http://dbpedia.org/page/Leipzig_Botanical_Garden with DBpedia endpoint throws the following error: [stack trace as in the original report]

yamalight commented 8 years ago

Sorry, I copy-pasted the URI from the wrong place. I am using http://dbpedia.org/resource/Leipzig_Botanical_Garden

edit: updated the original ticket text

ngonga commented 8 years ago

I think the reason is simply that it is not a place according to the DBpedia triples [1]. Hence, the approach cannot determine the right properties to use for the generation of the verbalization. We'll fix that in the future by using an improved DBpedia internally. Thanks for the hint.

Cheers, Axel

[1] http://dbpedia.org/sparql?default-graph-uri=http%3A%2F%2Fdbpedia.org&query=select+distinct+%3FConcept+where+{%3Chttp%3A%2F%2Fdbpedia.org%2Fresource%2FLeipzig_Botanical_Garden%3E+a+%3FConcept}+LIMIT+100&format=text%2Fhtml&CXML_redir_for_subjs=121&CXML_redir_for_hrefs=&timeout=30000&debug=on

yamalight commented 8 years ago

I found another resource with the same error (I'll extend the list if I find more):

LorenzBuehmann commented 8 years ago

The problem is that those resources do not belong to a class from the DBpedia ontology but are only members of some YAGO classes:

http://dbpedia.org/class/yago/Arboretum102733075
http://dbpedia.org/class/yago/Artifact100021939
http://dbpedia.org/class/yago/BotanicalGardensInGermany
http://dbpedia.org/class/yago/Facility103315023
http://dbpedia.org/class/yago/Garden103417345
http://dbpedia.org/class/yago/Location100027167
http://dbpedia.org/class/yago/Object100002684
http://dbpedia.org/class/yago/Plot108674739
http://dbpedia.org/class/yago/Region108630985
http://dbpedia.org/class/yago/Tract108673395
http://dbpedia.org/class/yago/Whole100003553
http://dbpedia.org/class/yago/YagoGeoEntity
http://dbpedia.org/class/yago/YagoLegalActorGeo
http://www.w3.org/2003/01/geo/wgs84_pos#SpatialThing
http://dbpedia.org/class/yago/YagoPermanentlyLocatedEntity
http://dbpedia.org/class/yago/GardensInSaxony
http://dbpedia.org/class/yago/GeographicalArea108574314
http://dbpedia.org/class/yago/PhysicalEntity100001930

The idea was to use (one of) the most specific classes during verbalization. To do so, we query explicitly for OWL classes, i.e. there must exist a triple ?cls a owl:Class . in the knowledge base:

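PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
select distinct ?type where {
  <http://dbpedia.org/resource/Leipzig_Botanical_Garden> a ?type .
  ?type a owl:Class .
  # keep ?type only if the resource has no other asserted type that is a subclass of it
  filter not exists { ?subtype ^a <http://dbpedia.org/resource/Leipzig_Botanical_Garden> ; rdfs:subClassOf ?type . filter(?subtype != ?type) }
}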

This is obviously too strict, but it was done to improve performance on DBpedia, as the YAGO hierarchy is quite large. I'll remove the owl:Class restriction for now, as we probably cannot assume that such a triple exists in most RDF knowledge bases. The result would then be:

http://dbpedia.org/class/yago/BotanicalGardensInGermany
http://www.w3.org/2003/01/geo/wgs84_pos#SpatialThing
http://dbpedia.org/class/yago/GardensInSaxony

We still have to pick one of those classes, which is currently done more or less at random. Future work could use a relevance or prominence measure, if one exists.
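
For illustration, the "most specific type" filtering described above can be sketched in memory. This is only a rough sketch under assumptions, not the code in Verbalizer: the class name MostSpecificTypes, the method select, and the subClassOf map (class to its direct superclasses) are hypothetical, and the subclass relation used in main is assumed for the example rather than taken from DBpedia.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class MostSpecificTypes {
    /**
     * Keeps the asserted types that have no other asserted type declared as
     * their direct subclass. Hypothetical helper, not the actual Verbalizer code.
     *
     * @param assertedTypes all types asserted for the resource
     * @param subClassOf    maps a class to its direct superclasses (assumed input)
     */
    static TreeSet<String> select(Set<String> assertedTypes, Map<String, Set<String>> subClassOf) {
        TreeSet<String> result = new TreeSet<>(assertedTypes);
        for (String subtype : assertedTypes) {
            for (String supertype : subClassOf.getOrDefault(subtype, Set.of())) {
                if (!supertype.equals(subtype)) {
                    // A more specific asserted type exists, so drop the supertype.
                    result.remove(supertype);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String yago = "http://dbpedia.org/class/yago/";
        Set<String> types = Set.of(yago + "Garden103417345", yago + "GardensInSaxony");
        // Assumed for illustration only: GardensInSaxony is a subclass of Garden103417345.
        Map<String, Set<String>> subClassOf =
                Map.of(yago + "GardensInSaxony", Set.of(yago + "Garden103417345"));
        System.out.println(select(types, subClassOf));
    }
}
```

Returning a sorted set at least makes the final pick reproducible (for example, always the alphabetically first class), but that is only a placeholder for a real relevance measure.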

yamalight commented 8 years ago

@LorenzBuehmann I think just throwing an error that describes this might be enough. The current error does not really explain anything :|
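
A minimal sketch of what such a descriptive error could look like. The stack trace shows TreeSet.first() being called on an empty set, which throws a bare NoSuchElementException. The helper name firstTypeOrFail, the class TypeGuard, and the message wording are hypothetical and only illustrate the idea; this is not how Verbalizer is actually written.

```java
import java.util.SortedSet;
import java.util.TreeSet;

final class TypeGuard {
    /**
     * Returns the first type of the sorted set, or fails with a message that
     * names the resource. Hypothetical helper, not the actual Verbalizer code.
     */
    static String firstTypeOrFail(String resourceUri, SortedSet<String> types) {
        if (types.isEmpty()) {
            // Calling types.first() here would throw a bare NoSuchElementException.
            throw new IllegalArgumentException(
                    "No usable type found for <" + resourceUri
                            + ">; cannot choose a class for verbalization");
        }
        return types.first();
    }

    public static void main(String[] args) {
        try {
            firstTypeOrFail("http://dbpedia.org/resource/Leipzig_Botanical_Garden", new TreeSet<>());
        } catch (IllegalArgumentException e) {
            // The message now tells the caller which resource failed and why.
            System.out.println(e.getMessage());
        }
    }
}
```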

LorenzBuehmann commented 8 years ago

Right, but this wasn't an error that I expected, which is why no exception was thrown. Nevertheless, this problem is solved now.