SANSA-Stack / Archived-SANSA-Examples

Usage examples for the SANSA Stack
http://sansa-stack.net
Apache License 2.0

[SANSA-Examples-Spark] Exception in thread "main" java.lang.NoSuchMethodError #25

Closed by Wisenheim 6 years ago

Wisenheim commented 6 years ago

Hello all, I'm trying to run the Sparqlify example following the running instructions. I submit the job like this:

./spark-2.2.1-bin-hadoop2.7/bin/spark-submit --class net.sansa_stack.examples.spark.query.Sparqlify --master spark://spark-master:7077 SANSA-Examples/sansa-examples-spark/target/sansa-examples-spark_2.11-2017-12.1-SNAPSHOT.jar -i src/main/resources/rdf.nt

Exception in thread "main" java.lang.NoSuchMethodError: net.sansa_stack.rdf.spark.io.NTripleReader$.load$default$3()Lscala/Enumeration$Value;
  at net.sansa_stack.rdf.spark.io.package$RDFReader$$anonfun$ntriples$4.apply(package.scala:205)
  at net.sansa_stack.rdf.spark.io.package$RDFReader$$anonfun$ntriples$4.apply(package.scala:204)
  at net.sansa_stack.examples.spark.query.Sparqlify$.run(Sparqlify.scala:44)
  at net.sansa_stack.examples.spark.query.Sparqlify$.main(Sparqlify.scala:22)
  at net.sansa_stack.examples.spark.query.Sparqlify.main(Sparqlify.scala)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:498)
  at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:755)
  at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
  at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
  at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:119)
  at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

Thanks in advance

GezimSejdiu commented 6 years ago

Hi @Wisenheim, many thanks for trying out the Sparqlify example on SANSA. As far as I can judge, you are using the master branch to build the jar. Could you please try it with the develop branch by following these steps:

git clone https://github.com/SANSA-Stack/SANSA-Examples.git
cd SANSA-Examples
git checkout develop
mvn install
cd sansa-examples-spark
mvn package

and then, if you want to expose the SPARQL endpoint for running your SPARQL queries, you can just submit it via:

spark-submit --class net.sansa_stack.examples.spark.query.Sparqlify --master spark://gezim-Latitude-E5550:7077 target/sansa-examples-spark_2.11-2017-12.1-SNAPSHOT.jar -i src/main/resources/rdf.nt

or, to execute it directly on the CLI, you can run it this way:

spark-submit --class net.sansa_stack.examples.spark.query.Sparqlify --master spark://gezim-Latitude-E5550:7077 target/sansa-examples-spark_2.11-2017-12.1-SNAPSHOT.jar -i src/main/resources/rdf.nt -q "SELECT * WHERE {?s ?p ?o} LIMIT 10" -e false

which basically runs this part of the code.
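Roughly, that part of the example boils down to something like the sketch below. This is only a simplified illustration based on the reader and query implicits shown later in this thread, not the exact source of Sparqlify.scala, and the object name SparqlifySketch is made up:

import net.sansa_stack.rdf.spark.io._
import net.sansa_stack.query.spark.query._
import org.apache.jena.riot.Lang
import org.apache.spark.sql.SparkSession

object SparqlifySketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("Sparqlify example (sketch)")
      .getOrCreate()

    // values normally passed on the command line via -i and -q
    val input = "src/main/resources/rdf.nt"
    val query = "SELECT * WHERE { ?s ?p ?o } LIMIT 10"

    val triples = spark.rdf(Lang.NTRIPLES)(input) // read the N-Triples file into an RDD[Triple]
    val result  = triples.sparql(query)           // Sparqlify rewrites the SPARQL query to SQL
    result.show(false)

    spark.stop()
  }
}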

After running it, I get this result:

18/06/12 13:36:26 INFO CodeGenerator: Code generated in 19.244854 ms
[http://commons.dbpedia.org/property/artist,null,Jean Broc,http://commons.dbpedia.org/resource/File:The_Death_of_Hyacinthos.gif,en,null,null,null]
[http://commons.dbpedia.org/property/date,null,null,http://commons.dbpedia.org/resource/File:Buswachten.jpg,null,null,2004-07-22,null]
[http://commons.dbpedia.org/property/date,null,null,http://commons.dbpedia.org/resource/File:Groninger-museum.jpg,null,null,2004-08-26,null]
[http://commons.dbpedia.org/property/date,null,null,http://commons.dbpedia.org/resource/File:StationAssen3.jpg,null,null,2004-07-22,null]
[http://commons.dbpedia.org/property/date,null,null,http://commons.dbpedia.org/resource/File:De_Slegte,_Groningen.jpg,null,null,2004-08-26,null]
[http://commons.dbpedia.org/property/date,null,null,http://commons.dbpedia.org/resource/File:Paddestoel_003.jpg,null,null,2004-08-20,null]
[http://commons.dbpedia.org/property/date,null,null,http://commons.dbpedia.org/resource/File:BordUtrecht.jpg,null,null,2004-07-22,null]
[http://commons.dbpedia.org/property/date,null,null,http://commons.dbpedia.org/resource/File:Paddestoel_002.jpg,null,null,2004-08-20,null]
[http://commons.dbpedia.org/property/date,null,null,http://commons.dbpedia.org/resource/File:Groningen_003.jpg,null,null,2004-08-26,null]
[http://commons.dbpedia.org/property/date,null,null,http://commons.dbpedia.org/resource/File:StationAssen2.jpg,null,null,2004-07-22,null]

Please let me know if you need more help running this example; I will be more than happy to help you.

Best regards,

Wisenheim commented 6 years ago

Thank you very much @GezimSejdiu, that works; I was using a previous version of Spark. I have another question: is there an example with RDF/XML data? The data I'm working on is RDF/XML using the CIDOC-CRM ontology.

Thanks in advance, best regards

GezimSejdiu commented 6 years ago

Hi @Wisenheim, great news! Regarding your question about an RDF/XML reader example: yes, SANSA supports that RDF serialization format as well. Please use the snippet below to read such an ontology:

import net.sansa_stack.rdf.spark.io._
import org.apache.jena.riot.Lang

val input = "<yourpath>/Cidoc-crm.owl"

val lang = Lang.RDFXML
val triples = spark.rdf(lang)(input)   // read the RDF/XML file into an RDD[Triple]
triples.take(5).foreach(println(_))    // print the first five triples
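Once loaded, triples is an ordinary RDD[org.apache.jena.graph.Triple], so you can process it with the usual Spark and Jena APIs. For example (just an illustrative follow-up, not part of the example itself):

val total  = triples.count()  // number of parsed triples
val labels = triples.filter(_.getPredicate.getURI == "http://www.w3.org/2000/01/rdf-schema#label")
labels.take(5).foreach(println)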

Let me know if you need more help on this, or if there is any other issue we should look into.

Best regards,

Wisenheim commented 6 years ago

Thank you @GezimSejdiu for your kindness. I ran the code in the spark-shell but it did not work for me; is there some configuration I must do first? My Spark session does not have an rdf member.

error: value rdf is not a member of org.apache.spark.sql.SparkSession
       val triples = spark.rdf(lang)(input)

Do you have any full code example? Thanks again, best regards

GezimSejdiu commented 6 years ago

You're welcome :) That is strange! Did you also import the io._ package in the shell? If not, please import these packages as well:

import net.sansa_stack.rdf.spark.io._
import org.apache.jena.riot.Lang

and let me know if the problem still persists, so that I can try to reproduce the issue.

Best,

GezimSejdiu commented 6 years ago

Hi @Wisenheim, once again! I wanted to test it myself, just to be sure I was not missing anything from your comment. And yes, it works :) What you were missing in your shell was that you hadn't passed the external jars to Spark, which in this case means the bundled SANSA jar. Please have a look below to see how it works.

Start the shell with the jar included:

spark-shell --master spark://gezim-Latitude-E5550:7077 --jars $SPARK_HOME/SANSA-Examples/sansa-examples-spark/target/sansa-examples-spark_2.11-2017-12.1-SNAPSHOT.jar

Then, once you are in the Scala shell, just use the same snippet as above and it should work. See the session below as a reference:

Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.2.0
      /_/

Using Scala version 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_171)
Type in expressions to have them evaluated.
Type :help for more information.

scala> import net.sansa_stack.rdf.spark.io._
import net.sansa_stack.rdf.spark.io._

scala> import org.apache.jena.riot.Lang
import org.apache.jena.riot.Lang

scala> val input = "ecrm_170309.owl"
input: String = ecrm_170309.owl

scala> val lang = Lang.RDFXML
lang: org.apache.jena.riot.Lang = Lang:RDF/XML

scala> val triples = spark.rdf(lang)(input)
triples: org.apache.spark.rdd.RDD[org.apache.jena.graph.Triple] = MapPartitionsRDD[1] at map at package.scala:233

scala> triples.take(5).foreach(println(_))
http://erlangen-crm.org/170309/ @rdf:type owl:Ontology
http://erlangen-crm.org/170309/ @owl:versionInfo "ECRM 170309 / CIDOC-CRM 6.2.2"
http://erlangen-crm.org/170309/ @rdfs:comment "Changelog: https://github.com/erlangen-crm/ecrm/commits/master"@en
http://erlangen-crm.org/170309/ @rdfs:comment "Erlangen CRM / OWL - An OWL DL 1.0 implementation of the CIDOC Conceptual Reference Model, based on: Nick Crofts, Martin Doerr, Tony Gill, Stephen Stead, Matthew Stiff (eds.): Definition of the CIDOC Conceptual Reference Model (http://cidoc-crm.org/).
This implementation has been originally created by Bernhard Schiemann, Martin Oischinger and Günther Görz at the Friedrich-Alexander-University of Erlangen-Nuremberg, Department of Computer Science, Chair of Computer Science 8 (Artificial Intelligence) in cooperation with the Department of Museum Informatics of the Germanisches Nationalmuseum Nuremberg and the Department of Biodiversity Informatics of the Zoologisches Forschungsmuseum Alexander Koenig Bonn.
The Erlangen CRM / OWL implementation of the CIDOC Conceptual Reference Model is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License."@en
http://erlangen-crm.org/170309/ @rdfs:label "Erlangen CRM / OWL"@en

I hope this solves your issue! Feel free to ask for more help on setting up SANSA in your environment.

Best regards,

Wisenheim commented 6 years ago

Yes, I imported both packages, but I still cannot find the rdf member in the Spark session. Here are all the steps I take. I start spark-shell 2.3.1 with the SANSA RDF layer 0.3.0 bundle jar:

./spark-2.3.1-bin-hadoop2.7/bin/spark-shell --jars spark-2.3.1-bin-hadoop2.7/jars/sansa-rdf-spark-bundle_2.11-0.3.0-jar-with-dependencies.jar

Then I run the code lines below:

import net.sansa_stack.rdf.spark.io._
import org.apache.jena.riot.Lang

val input = "sitar_rdf.rdf"
val lang = Lang.RDFXML

val triples = spark.rdf(lang)(input)

and on the last code line I get the error. By the way, I'm using the default Spark session; should I edit the configuration? If yes, how?

Thank you for your patience

best regards

GezimSejdiu commented 6 years ago

Great! Instead of using the RDF layer directly as an external jar, please use sansa-examples, which is an uber-jar containing the dependencies for all the SANSA layers.

So instead of using

./spark-2.3.1-bin-hadoop2.7/bin/spark-shell --jars spark-2.3.1-bin-hadoop2.7/jars/sansa-rdf-spark-bundle_2.11-0.3.0-jar-with-dependencies.jar

you should use

./spark-2.3.1-bin-hadoop2.7/bin/spark-shell --jars spark-2.3.1-bin-hadoop2.7/jars/sansa-examples-spark_2.11-2017-12.1-SNAPSHOT.jar

And yes, these implicits are introduced in the upcoming SANSA 0.4 release, which we will announce in the release notes as well. SANSA v0.3.0 does not have such implicits, so you will not be able to call them, even if you are using the bundle jar that has all the dependencies.
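Just to illustrate the mechanism (this is not SANSA's actual source, only a generic sketch of how an implicit class adds such a method to SparkSession; the names below are made up): without the package that provides the implicit being on the classpath and imported, the compiler has no rdf member to resolve on SparkSession.

import org.apache.jena.graph.Triple
import org.apache.jena.riot.Lang
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.SparkSession

object RdfImplicitsSketch {
  // hypothetical name; the real implicits live in net.sansa_stack.rdf.spark.io
  implicit class RDFReaderOps(spark: SparkSession) {
    def rdf(lang: Lang)(path: String): RDD[Triple] =
      spark.sparkContext.emptyRDD[Triple] // placeholder body, for illustration only
  }
}

// after `import RdfImplicitsSketch._`, spark.rdf(Lang.RDFXML)("file.rdf") compiles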

Could you please try it using the sansa-examples-spark bundle jar and let me know?

GezimSejdiu commented 6 years ago

By the way, you can already use the pre-built jar: https://github.com/SANSA-Stack/SANSA-Examples/releases/tag/develop !

Wisenheim commented 6 years ago

Thank you @GezimSejdiu, it works perfectly with the sansa-examples jar. One last question, which is what I have been trying to do since we started this issue: how can I run SPARQL queries on the loaded RDF/XML data? Thanks again.

best regards

GezimSejdiu commented 6 years ago

Good to hear! Yes, once you are able to read the RDF/XML data into an RDD[Triple], our query engine can accept SPARQL queries on the console as well. Please use the code below to do so:

import net.sansa_stack.query.spark.query._

val sparqlQuery = """SELECT ?s ?p ?o
                    WHERE {?s ?p ?o }
                    LIMIT 10"""
val result = triples.sparql(sparqlQuery)  // this returns a DataFrame of the result set

PS: Please take into account that only a subset of SPARQL features is supported by this query engine (since it uses Sparqlify as a SPARQL-to-SQL rewriter).
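Assuming the query runs through, result is a plain Spark DataFrame, so you can inspect it with the usual DataFrame API, for example:

result.printSchema()              // column layout produced by the SPARQL-to-SQL rewriting
result.show(10, truncate = false) // first 10 result rows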

Best,

Wisenheim commented 6 years ago

Thank you @GezimSejdiu, I tried that, but I got an error back on the last code line:

Welcome to                                                                                                                                           
      ____              __                                                                                                                           
     / __/__  ___ _____/ /__                                                                                                                         
    _\ \/ _ \/ _ `/ __/  '_/                                                                                                                         
   /___/ .__/\_,_/_/ /_/\_\   version 2.3.1                                                                                                          
      /_/                                                                                                                                            

Using Scala version 2.11.8 (OpenJDK 64-Bit Server VM, Java 1.8.0_171)                                                                                
Type in expressions to have them evaluated.                                                                                                          
Type :help for more information.                                                                                                                     

scala> import net.sansa_stack.rdf.spark.io._
import net.sansa_stack.rdf.spark.io._

scala> import net.sansa_stack.query.spark.query._
import net.sansa_stack.query.spark.query._

scala> import org.apache.jena.riot.Lang
import org.apache.jena.riot.Lang

scala> val input = "sitar_rdf.rdf"
input: String = sitar_rdf.rdf

scala> val lang = Lang.RDFXML
lang: org.apache.jena.riot.Lang = Lang:RDF/XML

scala> val triples = spark.rdf(lang)(input)
triples: org.apache.spark.rdd.RDD[org.apache.jena.graph.Triple] = MapPartitionsRDD[3] at map at package.scala:233

scala> val sparqlQuery = """ SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10"""
sparqlQuery: String = " SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10"

scala> val result = triples.sparql(sparqlQuery)
java.lang.NoSuchMethodError: net.sansa_stack.rdf.spark.partition.core.RdfPartitionUtilsSpark$.partitionGraph$default$2()Lnet/sansa_stack/rdf/common/partition/core/RdfPartitionerDefault$;
  at net.sansa_stack.rdf.spark.partition.package$RDFPartition.partitionGraph(package.scala:37)
  at net.sansa_stack.query.spark.query.package$SparqlifyAsDefault.sparql(package.scala:34)
  ... 57 elided

Thanks again, best regards

GezimSejdiu commented 6 years ago

Hi @Wisenheim, strange! I just tested it and it works as it is supposed to (see the session below):

scala> import net.sansa_stack.query.spark.query._
import net.sansa_stack.query.spark.query._

scala> val sparqlQuery = """ SELECT * WHERE { ?s ?p ?o } LIMIT 10"""
sparqlQuery: String = " SELECT * WHERE { ?s ?p ?o } LIMIT 10"

scala> val result = triples.sparql(sparqlQuery)
18/06/18 13:58:12 WARN TypeSystemImpl: Skipping: date, date
18/06/18 13:58:12 WARN TypeSystemImpl: Skipping: integer, integer
18/06/18 13:58:12 WARN TypeSystemImpl: Skipping: float, float
18/06/18 13:58:12 WARN TypeSystemImpl: Skipping: geography, geography
18/06/18 13:58:12 WARN TypeSystemImpl: Skipping: geometry, geometry
18/06/18 13:58:12 WARN TypeSystemImpl: Skipping: timestamp, timestamp
18/06/18 13:58:13 WARN TypeSystemImpl: Skipping: date, date
18/06/18 13:58:13 WARN TypeSystemImpl: Skipping: integer, integer
18/06/18 13:58:13 WARN TypeSystemImpl: Skipping: float, float
18/06/18 13:58:13 WARN TypeSystemImpl: Skipping: geography, geography
18/06/18 13:58:13 WARN TypeSystemImpl: Skipping: geometry, geometry
18/06/18 13:58:13 WARN TypeSystemImpl: Skipping: timestamp, timestamp
18/06/18 13:58:13 WARN SchemaProvider: Using ugly hack for adding a limit
18/06/18 13:58:13 WARN SchemaProvider: Using ugly hack for adding a limit
18/06/18 13:58:13 WARN SchemaProvider: Using ugly hack for adding a limit
18/06/18 13:58:13 WARN SchemaProvider: Using ugly hack for adding a limit
18/06/18 13:58:13 WARN SchemaProvider: Using ugly hack for adding a limit
18/06/18 13:58:13 WARN SchemaProvider: Using ugly hack for adding a limit
18/06/18 13:58:13 WARN SchemaProvider: Using ugly hack for adding a limit
18/06/18 13:58:13 WARN SchemaProvider: Using ugly hack for adding a limit
18/06/18 13:58:13 WARN SchemaProvider: Using ugly hack for adding a limit
18/06/18 13:58:13 WARN SchemaProvider: Using ugly hack for adding a limit
18/06/18 13:58:13 WARN SchemaProvider: Using ugly hack for adding a limit
18/06/18 13:58:13 WARN SchemaProvider: Using ugly hack for adding a limit
18/06/18 13:58:13 WARN SchemaProvider: Using ugly hack for adding a limit
18/06/18 13:58:13 WARN SchemaProvider: Using ugly hack for adding a limit
18/06/18 13:58:13 WARN SchemaProvider: Using ugly hack for adding a limit
18/06/18 13:58:13 WARN SchemaProvider: Using ugly hack for adding a limit
18/06/18 13:58:13 WARN SchemaProvider: Using ugly hack for adding a limit
18/06/18 13:58:13 WARN SchemaProvider: Using ugly hack for adding a limit
18/06/18 13:58:13 WARN SchemaProvider: Using ugly hack for adding a limit
18/06/18 13:58:13 WARN CandidateViewSelectorBase: JENA'S ALGEBRA OPTIMIZATION DISABLED
TODO Get rid of reflection for replacement - its slow!
TODO Get rid of reflection for replacement - its slow!
TODO Get rid of reflection for replacement - its slow!
CAST TO string
CAST TO bigint
[... many further repeated "CAST TO string" / "CAST TO bigint" log lines omitted ...]
result: org.apache.spark.sql.DataFrame = [C_3: string, C_4: string ... 7 more fields]

scala> result.rdd.collect.foreach(println(_))
[http://www.w3.org/2002/07/owl#cardinality,null,a0ff58f2-6b05-464e-b8e2-1720611145c5,null,1,null,null,null,null]
[http://www.w3.org/2002/07/owl#cardinality,null,5e86a782-e757-4cd3-8446-c709d8c03767,null,1,null,null,null,null]
[http://www.w3.org/2002/07/owl#cardinality,null,5631c7f0-65c7-4465-820b-3083fe71c823,null,1,null,null,null,null]
[http://www.w3.org/2002/07/owl#cardinality,null,f934aa68-33f1-4256-9738-2e5939701759,null,1,null,null,null,null]
[http://www.w3.org/2002/07/owl#cardinality,null,ad136e4e-d138-4f04-89d2-d97667e713ac,null,1,null,null,null,null]
[http://www.w3.org/2002/07/owl#cardinality,null,57ed40c2-2595-42ed-bbb8-df733a51c363,null,1,null,null,null,null]
[http://www.w3.org/2002/07/owl#cardinality,null,6164e8cc-1086-4787-b086-81dd67ff64bd,null,1,null,null,null,null]
[http://www.w3.org/2002/07/owl#onProperty,null,84b560b2-d0f0-41cc-8adc-99a1f062328c,null,null,null,http://erlangen-crm.org/170309/P106_is_composed_of,null,null]
[http://www.w3.org/2002/07/owl#onProperty,null,a92a9bc6-ea20-4a60-8e8b-3ff9d1210319,null,null,null,http://erlangen-crm.org/170309/P135_created_type,null,null]
[http://www.w3.org/2002/07/owl#onProperty,null,4576424c-d74b-4cbc-b531-43ada72f367d,null,null,null,http://erlangen-crm.org/170309/P92_brought_into_existence,null,null]

Once more, please confirm: are you using the same jar we discussed in the previous comments?

Let me know if the partition.core error still persists. If so, you can just import the package by doing:

import net.sansa_stack.rdf.spark.partition._

Best regards,

Wisenheim commented 6 years ago

Hello @GezimSejdiu, yes, I'm using the same jar we discussed before. I also did the last import, but it is still the same error:


 ./spark-2.3.1-bin-hadoop2.7/bin/spark-shell --jars SANSA-Examples/sansa-examples-spark/target/sansa-examples-spark_2.11-2017-12.1-SNAPSHOT.jar 

Spark context Web UI available at http://xitan:4040
Spark context available as 'sc' (master = local[*], app id = local-1529356065343).
Spark session available as 'spark'.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.3.1
      /_/

Using Scala version 2.11.8 (OpenJDK 64-Bit Server VM, Java 1.8.0_171)
Type in expressions to have them evaluated.
Type :help for more information.

scala> import net.sansa_stack.rdf.spark.io._
import net.sansa_stack.rdf.spark.io._

scala> import net.sansa_stack.query.spark.query._
import net.sansa_stack.query.spark.query._

scala> import net.sansa_stack.rdf.partition._
import net.sansa_stack.rdf.partition._

scala> import org.apache.jena.riot.Lang
import org.apache.jena.riot.Lang

scala> val input = "sitar_rdf.rdf"
input: String = sitar_rdf.rdf

scala> val lang = Lang.RDFXML
lang: org.apache.jena.riot.Lang = Lang:RDF/XML

scala> val triples = spark.rdf(lang)(input)
triples: org.apache.spark.rdd.RDD[org.apache.jena.graph.Triple] = MapPartitionsRDD[1] at map at package.scala:233

scala> val sparqlQuery = """ SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10"""
sparqlQuery: String = " SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10"

scala> val result = triples.sparql(sparqlQuery)
java.lang.NoSuchMethodError: net.sansa_stack.rdf.spark.partition.core.RdfPartitionUtilsSpark$.partitionGraph$default$2()Lnet/sansa_stack/rdf/common/partition/core/RdfPartitionerDefault$;
  at net.sansa_stack.rdf.spark.partition.package$RDFPartition.partitionGraph(package.scala:37)
  at net.sansa_stack.query.spark.query.package$SparqlifyAsDefault.sparql(package.scala:34)
  ... 55 elided

It might be the structure of my data; here is a sample of it:


<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:gml="http://www.opengis.net/gml"
    xmlns:archaeo="http://www.ics.forth.gr/isl/CRMext/CRMarchaeo.rdfs/"
    xmlns:crm="http://www.cidoc-crm.org/cidoc-crm/"
    xmlns:sci="http://www.ics.forth.gr/isl/CRMext/CRMsci.rdfs/"
    xmlns:skos="http://www.w3.org/2004/02/skos/core#"
    xmlns:crmdig="http://www.ics.forth.gr/isl/CRMext/CRMdig.rdfs/"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
  <crm:E4_Period rdf:about="http://archaeositarproject.it/st_named_year_range_#59">
    <crm:P4_has_time-span>
      <crm:E52_Time-Span>
        <crm:P82b_end_of_the_end rdf:datatype="http://www.w3.org/2001/XMLSchema#gYear"
        >-961</crm:P82b_end_of_the_end>
        <crm:P81b_begin_of_the_end rdf:datatype="http://www.w3.org/2001/XMLSchema#gYear"
        >-961</crm:P81b_begin_of_the_end>
        <crm:P81a_end_of_the_begin rdf:datatype="http://www.w3.org/2001/XMLSchema#gYear"
        >-1200</crm:P81a_end_of_the_begin>
        <crm:P82a_begin_of_the_begin rdf:datatype="http://www.w3.org/2001/XMLSchema#gYear"
        >-1200</crm:P82a_begin_of_the_begin>
      </crm:E52_Time-Span>
    </crm:P4_has_time-span>
    <crm:P1_is_identified_by>
      <crm:E49_Time_Appellation>
        <rdfs:label xml:lang="ita">Età del Bronzo Finale</rdfs:label>
      </crm:E49_Time_Appellation>
    </crm:P1_is_identified_by>
  </crm:E4_Period>
  <crm:E4_Period rdf:about="http://archaeositarproject.it/st_named_year_range_#58">
    <crm:P4_has_time-span>
      <crm:E52_Time-Span>
        <crm:P82b_end_of_the_end rdf:datatype="http://www.w3.org/2001/XMLSchema#gYear"
        >-1201</crm:P82b_end_of_the_end>
        <crm:P81b_begin_of_the_end rdf:datatype="http://www.w3.org/2001/XMLSchema#gYear"
        >-1201</crm:P81b_begin_of_the_end>
        <crm:P81a_end_of_the_begin rdf:datatype="http://www.w3.org/2001/XMLSchema#gYear"
        >-1250</crm:P81a_end_of_the_begin>
        <crm:P82a_begin_of_the_begin rdf:datatype="http://www.w3.org/2001/XMLSchema#gYear"
        >-1250</crm:P82a_begin_of_the_begin>
      </crm:E52_Time-Span>
    </crm:P4_has_time-span>
    <crm:P1_is_identified_by>
      <crm:E49_Time_Appellation>
        <rdfs:label xml:lang="ita">Età del Bronzo Recente, fase II</rdfs:label>
      </crm:E49_Time_Appellation>
    </crm:P1_is_identified_by>
  </crm:E4_Period>
  <crm:E4_Period rdf:about="http://archaeositarproject.it/st_named_year_range_#57">
    <crm:P4_has_time-span>
      <crm:E52_Time-Span>
        <crm:P82b_end_of_the_end rdf:datatype="http://www.w3.org/2001/XMLSchema#gYear"
        >-1251</crm:P82b_end_of_the_end>
        <crm:P81b_begin_of_the_end rdf:datatype="http://www.w3.org/2001/XMLSchema#gYear"
        >-1251</crm:P81b_begin_of_the_end>
        <crm:P81a_end_of_the_begin rdf:datatype="http://www.w3.org/2001/XMLSchema#gYear"
        >-1350</crm:P81a_end_of_the_begin>
        <crm:P82a_begin_of_the_begin rdf:datatype="http://www.w3.org/2001/XMLSchema#gYear"
        >-1350</crm:P82a_begin_of_the_begin>
      </crm:E52_Time-Span>
    </crm:P4_has_time-span>
    <crm:P1_is_identified_by>
      <crm:E49_Time_Appellation>
        <rdfs:label xml:lang="ita">Età del Bronzo Recente, fase I</rdfs:label>
      </crm:E49_Time_Appellation>
    </crm:P1_is_identified_by>
  </crm:E4_Period>
...
</rdf:RDF>

Could it be some prefix issue?

Thanks a lot, best,

GezimSejdiu commented 6 years ago

I think it is not related to the prefixes or the data itself; it is more likely related to the packages inside the jar. Could you do one more thing and force an update of all dependencies during the install:

cd SANSA-Examples
mvn -U install
cd sansa-examples-spark
mvn package

and attach the jar to Spark again.

Let me know if this resolves the issue; otherwise, we will have to test it in a clean environment (without the ~/.m2 cache on my computer).
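If you also want to rule out the data itself, you can check that the file parses with plain Jena on the driver; a quick local sanity check, independent of SANSA:

import org.apache.jena.rdf.model.Model
import org.apache.jena.riot.RDFDataMgr

val model: Model = RDFDataMgr.loadModel("sitar_rdf.rdf") // throws a RiotException if the RDF/XML is malformed
println(s"parsed ${model.size()} statements")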

Best,

Wisenheim commented 6 years ago

Thank you @GezimSejdiu, but it's still not working. Best,

GezimSejdiu commented 6 years ago

Hi @Wisenheim, I'm not sure what is missing in your configuration, but I just used the same dataset you shared above, re-ran the same steps, and it works.

gezim@gezim-Latitude-E5550:~/spark/spark-2.2.0-bin-hadoop2.7/SANSA-Examples$ spark-shell --master spark://gezim-Latitude-E5550:7077 --jars $SPARK_HOME/SANSA-Examples/sansa-examples-spark/target/sansa-examples-spark_2.11-2017-12.1-SNAPSHOT.jar
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
18/06/22 12:57:04 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
18/06/22 12:57:04 WARN Utils: Your hostname, gezim-Latitude-E5550 resolves to a loopback address: 127.0.1.1; using 10.148.252.163 instead (on interface wlp2s0)
18/06/22 12:57:04 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
18/06/22 12:57:12 WARN ObjectStore: Failed to get database global_temp, returning NoSuchObjectException
Spark context Web UI available at http://10.148.252.163:4040
Spark context available as 'sc' (master = spark://gezim-Latitude-E5550:7077, app id = app-20180622125705-0001).
Spark session available as 'spark'.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.2.0
      /_/

Using Scala version 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_171)
Type in expressions to have them evaluated.
Type :help for more information.

scala> import net.sansa_stack.rdf.spark.io._
import net.sansa_stack.rdf.spark.io._

scala> import net.sansa_stack.query.spark.query._
import net.sansa_stack.query.spark.query._

scala> import org.apache.jena.riot.Lang
import org.apache.jena.riot.Lang

scala> val input = "sitar_rdf.rdf"
input: String = sitar_rdf.rdf

scala> val lang = Lang.RDFXML
lang: org.apache.jena.riot.Lang = Lang:RDF/XML

scala> val triples = spark.rdf(lang)(input)
triples: org.apache.spark.rdd.RDD[org.apache.jena.graph.Triple] = MapPartitionsRDD[1] at map at package.scala:233

scala> val sparqlQuery = """ SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10"""
sparqlQuery: String = " SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10"

scala>  val result = triples.sparql(sparqlQuery)
18/06/22 12:57:57 WARN TypeSystemImpl: Skipping: date, date
18/06/22 12:57:57 WARN TypeSystemImpl: Skipping: integer, integer
18/06/22 12:57:57 WARN TypeSystemImpl: Skipping: float, float
18/06/22 12:57:57 WARN TypeSystemImpl: Skipping: geography, geography
18/06/22 12:57:57 WARN TypeSystemImpl: Skipping: geometry, geometry
18/06/22 12:57:57 WARN TypeSystemImpl: Skipping: timestamp, timestamp
18/06/22 12:57:58 WARN TypeSystemImpl: Skipping: date, date
18/06/22 12:57:58 WARN TypeSystemImpl: Skipping: integer, integer
18/06/22 12:57:58 WARN TypeSystemImpl: Skipping: float, float
18/06/22 12:57:58 WARN TypeSystemImpl: Skipping: geography, geography
18/06/22 12:57:58 WARN TypeSystemImpl: Skipping: geometry, geometry
18/06/22 12:57:58 WARN TypeSystemImpl: Skipping: timestamp, timestamp
18/06/22 12:57:58 WARN SchemaProvider: Using ugly hack for adding a limit
18/06/22 12:57:59 WARN SchemaProvider: Using ugly hack for adding a limit
18/06/22 12:57:59 WARN SchemaProvider: Using ugly hack for adding a limit
18/06/22 12:57:59 WARN SchemaProvider: Using ugly hack for adding a limit
18/06/22 12:57:59 WARN SchemaProvider: Using ugly hack for adding a limit
18/06/22 12:57:59 WARN SchemaProvider: Using ugly hack for adding a limit
18/06/22 12:57:59 WARN SchemaProvider: Using ugly hack for adding a limit
18/06/22 12:57:59 WARN SchemaProvider: Using ugly hack for adding a limit
18/06/22 12:57:59 WARN SchemaProvider: Using ugly hack for adding a limit
18/06/22 12:57:59 WARN CandidateViewSelectorBase: JENA'S ALGEBRA OPTIMIZATION DISABLED
TODO Get rid of reflection for replacement - its slow!
TODO Get rid of reflection for replacement - its slow!
TODO Get rid of reflection for replacement - its slow!
CAST TO string
result: org.apache.spark.sql.DataFrame = [C_3: string, C_4: string ... 6 more fields]

scala> result.take(5).foreach(println)
[http://www.cidoc-crm.org/cidoc-crm/P4_has_time-span,null,null,http://archaeositarproject.it/st_named_year_range_#59,null,5e8199dd-fef4-4855-9c9b-74d5418ad9d4,null,null]
[http://www.cidoc-crm.org/cidoc-crm/P4_has_time-span,null,null,http://archaeositarproject.it/st_named_year_range_#58,null,6a7b44a6-0cc6-4c8a-a2e0-fc8704b1fdc8,null,null]
[http://www.cidoc-crm.org/cidoc-crm/P4_has_time-span,null,null,http://archaeositarproject.it/st_named_year_range_#57,null,11c4ee84-2d98-44f5-9d45-596504240a0d,null,null]
[http://www.cidoc-crm.org/cidoc-crm/P1_is_identified_by,null,null,http://archaeositarproject.it/st_named_year_range_#59,null,6503ea8b-5930-40d3-bbf1-5099278180d2,null,null]
[http://www.cidoc-crm.org/cidoc-crm/P1_is_identified_by,null,null,http://archaeositarproject.it/st_named_year_range_#58,null,93594ad8-795f-4d85-a8d5-20f85a25eaf1,null,null]

I'm going to close this issue for now, but feel free to comment, and maybe let us know the full steps you followed (starting from building the jar through submitting it to Spark).

Best regards,

Wisenheim commented 6 years ago

Thank you anyway @GezimSejdiu, it is still not working. I tried to run the instructions again from the beginning, but now it even gives a build error; here is the log:

[INFO] ------------------------------------------------------------------------
[INFO] Building SANSA Examples - Apache Spark 2017-12.1-SNAPSHOT
[INFO] ------------------------------------------------------------------------
[WARNING] The POM for commons-codec:commons-codec:jar:1.9-SNAPSHOT is missing, no dependency information available
[WARNING] The POM for commons-codec:commons-codec:jar:1.10-SNAPSHOT is missing, no dependency information available
[INFO] 
[INFO] --- maven-resources-plugin:2.6:resources (default-resources) @ sansa-examples-spark_2.11 ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] Copying 12 resources
[INFO] 
[INFO] --- maven-compiler-plugin:3.5.1:compile (default-compile) @ sansa-examples-spark_2.11 ---
[INFO] Nothing to compile - all classes are up to date
[INFO] 
[INFO] --- scala-maven-plugin:3.2.1:compile (default) @ sansa-examples-spark_2.11 ---
[WARNING]  Expected all dependencies to require Scala version: 2.11.11
[WARNING]  net.sansa-stack:sansa-examples-spark_2.11:2017-12.1-SNAPSHOT requires scala version: 2.11.11
[WARNING]  com.twitter:chill_2.11:0.8.4 requires scala version: 2.11.11
[WARNING]  org.apache.spark:spark-core_2.11:2.3.1 requires scala version: 2.11.11
[WARNING]  org.json4s:json4s-jackson_2.11:3.2.11 requires scala version: 2.11.11
[WARNING]  org.json4s:json4s-core_2.11:3.2.11 requires scala version: 2.11.11
[WARNING]  org.json4s:json4s-ast_2.11:3.2.11 requires scala version: 2.11.11
[WARNING]  org.json4s:json4s-core_2.11:3.2.11 requires scala version: 2.11.0
[WARNING] Multiple versions of scala libraries detected!
[INFO] Using incremental compilation
[INFO] Compiling 18 Scala sources to /home/xitan/Documenti/sparkin/SANSA-Examples/sansa-examples-spark/target/classes...
[INFO] 
[INFO] --- maven-resources-plugin:2.6:testResources (default-testResources) @ sansa-examples-spark_2.11 ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory /home/xitan/Documenti/sparkin/SANSA-Examples/sansa-examples-spark/src/test/resources
[INFO] 
[INFO] --- maven-compiler-plugin:3.5.1:testCompile (default-testCompile) @ sansa-examples-spark_2.11 ---
[INFO] Nothing to compile - all classes are up to date
[INFO] 
[INFO] --- scala-maven-plugin:3.2.1:testCompile (default) @ sansa-examples-spark_2.11 ---
[WARNING]  Expected all dependencies to require Scala version: 2.11.11
[WARNING]  net.sansa-stack:sansa-examples-spark_2.11:2017-12.1-SNAPSHOT requires scala version: 2.11.11
[WARNING]  com.twitter:chill_2.11:0.8.4 requires scala version: 2.11.11
[WARNING]  org.apache.spark:spark-core_2.11:2.3.1 requires scala version: 2.11.11
[WARNING]  org.json4s:json4s-jackson_2.11:3.2.11 requires scala version: 2.11.11
[WARNING]  org.json4s:json4s-core_2.11:3.2.11 requires scala version: 2.11.11
[WARNING]  org.json4s:json4s-ast_2.11:3.2.11 requires scala version: 2.11.11
[WARNING]  org.json4s:json4s-core_2.11:3.2.11 requires scala version: 2.11.0
[WARNING] Multiple versions of scala libraries detected!
[INFO] Using incremental compilation
[INFO] Compiling 1 Scala source to /home/xitan/Documenti/sparkin/SANSA-Examples/sansa-examples-spark/target/test-classes...
[INFO] 
[INFO] --- maven-surefire-plugin:2.19.1:test (default-test) @ sansa-examples-spark_2.11 ---
[INFO] Tests are skipped.
[INFO] 
[INFO] --- scalatest-maven-plugin:1.0:test (test) @ sansa-examples-spark_2.11 ---
Discovery starting.
Discovery completed in 3 seconds, 90 milliseconds.
Run starting. Expected test count is: 1
SparqlServerTests:
log4j:WARN No appenders could be found for logger (Jena).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
giu 23, 2018 9:45:01 PM org.glassfish.jersey.server.spring.SpringComponentProvider bind
GRAVE: None or multiple beans found in Spring context for type class org.aksw.jena_sparql_api.web.servlets.ServletSparqlServiceImpl, skipping the type.
- starting the default SPARQL server should succeed *** FAILED ***
  org.apache.jena.sparql.engine.http.QueryExceptionHTTP: HTTP 503 error making the query: Service Unavailable
  at org.apache.jena.sparql.engine.http.HttpQuery.rewrap(HttpQuery.java:371)
  at org.apache.jena.sparql.engine.http.HttpQuery.execGet(HttpQuery.java:337)
  at org.apache.jena.sparql.engine.http.HttpQuery.exec(HttpQuery.java:288)
  at org.apache.jena.sparql.engine.http.QueryEngineHTTP.execConstructWorker(QueryEngineHTTP.java:465)
  at org.apache.jena.sparql.engine.http.QueryEngineHTTP.execModel(QueryEngineHTTP.java:428)
  at org.apache.jena.sparql.engine.http.QueryEngineHTTP.execConstruct(QueryEngineHTTP.java:389)
  at org.apache.jena.sparql.engine.http.QueryEngineHTTP.execConstruct(QueryEngineHTTP.java:384)
  at org.apache.jena.rdfconnection.RDFConnection.lambda$queryConstruct$2(RDFConnection.java:144)
  at org.apache.jena.system.Txn.calculateRead(Txn.java:56)
  at org.apache.jena.rdfconnection.RDFConnection.queryConstruct(RDFConnection.java:142)
  ...
Run completed in 32 seconds, 649 milliseconds.
Total number of tests run: 1
Suites: completed 2, aborted 0
Tests: succeeded 0, failed 1, canceled 0, ignored 0, pending 0
*** 1 TEST FAILED ***
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO] 
[INFO] SANSA-Examples - Parent ............................ SUCCESS [  1.501 s]
[INFO] SANSA Examples - Apache Flink ...................... SUCCESS [02:25 min]
[INFO] SANSA Examples - Apache Spark ...................... FAILURE [02:59 min]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 05:27 min
[INFO] Finished at: 2018-06-23T21:45:10+02:00
[INFO] Final Memory: 54M/423M
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.scalatest:scalatest-maven-plugin:1.0:test (test) on project sansa-examples-spark_2.11: There are test failures -> [Help 1]
[ERROR] 
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR] 
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
[ERROR] 
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR]   mvn <goals> -rf :sansa-examples-spark_2.11

Before, at least the project built. Best regards,

GezimSejdiu commented 6 years ago

Hmm, many thanks for letting us know that there is a build error! It is caused by the test class added by Claus (@Aklakan), which was related to issue #24. I guess he will remove this class, right @Aklakan? Anyway, we do not need any test classes in the examples.

Please use this as a workaround until we remove that class or fix issue #24:

mvn install -DskipTests
cd sansa-examples-spark
mvn package -DskipTests

This will skip the tests and the build will succeed.

Best regards,