hartig / BlazegraphBasedTPFServer

Triple Pattern Fragment server that uses Blazegraph as backend
Apache License 2.0

Compiling correctly with Sesame RIO dependencies to parse NTriple files as BlazegraphDataSource #3

Open laurensdv opened 8 years ago

laurensdv commented 8 years ago

I tried the server with an in-memory Blazegraph configuration (a modified copy of the example config, loading a subset of DBpedia in N-Triples format), but something went wrong. I get the error below after running:

java -server -Xmx4g -jar target/BlazegraphBasedTPFServer.jar config.json

HTTP ERROR: 503

Problem accessing /. Reason:

org.eclipse.jetty.servlet.ServletHolder$1: java.lang.NullPointerException

Is there any way to enable logging, or to figure out what triggered this NullPointerException?

hartig commented 8 years ago

To enable logging, have you tried putting a log4j.properties file in the main directory of the project (i.e., the directory from which you are starting the server)?

The log4j.properties file that I am using looks as follows:

log4j.rootLogger=INFO, stdlog

log4j.appender.stdlog=org.apache.log4j.ConsoleAppender
log4j.appender.stdlog.layout=org.apache.log4j.PatternLayout
log4j.appender.stdlog.layout.ConversionPattern=%d{HH:mm:ss} %-5p %-20c{1} :: %m%n

If this does not give you enough information, you may want to add log4j.logger.com.bigdata=INFO to this file.

Let me know how it goes.

laurensdv commented 8 years ago

Thanks, this works.

Now I can see the actual error. It looks like the parser cannot be loaded:

org.eclipse.jetty.servlet.ServletHolder$1: org.linkeddatafragments.exceptions.DataSourceCreationException: org.linkeddatafragments.exceptions.DataSourceCreationException: org.openrdf.rio.UnsupportedRDFormatException: No parser factory available for RDF format N-Triples (mimeTypes=text/plain; ext=nt)

hartig commented 8 years ago

Hi Laurens,

It is strange that you are getting this error message. I once had the same error and I fixed it. Can you please verify that the <dependencies> section of the pom.xml file that you use for compiling contains the following:


        <dependency>
            <groupId>org.openrdf.sesame</groupId>
            <artifactId>sesame-rio-api</artifactId>
            <version>2.7.12</version>
        </dependency>
        <dependency>
            <groupId>org.openrdf.sesame</groupId>
            <artifactId>sesame-rio-turtle</artifactId>
            <version>2.7.12</version>
        </dependency>
        <dependency>
            <groupId>org.openrdf.sesame</groupId>
            <artifactId>sesame-rio-ntriples</artifactId>
            <version>2.7.12</version>
        </dependency>
        <dependency>
            <groupId>org.openrdf.sesame</groupId>
            <artifactId>sesame-rio-rdfxml</artifactId>
            <version>2.7.12</version>
        </dependency>
        <dependency>
            <groupId>org.openrdf.sesame</groupId>
            <artifactId>sesame-rio-binary</artifactId>
            <version>2.7.12</version>
        </dependency>
        <dependency>
            <groupId>org.openrdf.sesame</groupId>
            <artifactId>sesame-rio-rdfjson</artifactId>
            <version>2.7.12</version>
        </dependency>

If the pom.xml contains this, can you please run mvn clean and then mvn package -U, and send me the full console output of the latter?

(Sorry for replying late, I am currently on a research visit in Chile)

laurensdv commented 8 years ago

Hi,

Didn't work so far, but I attached the console output of the mvn package command.

mvn_output.txt

laurensdv commented 8 years ago

Is there a way to ignore parsing errors via the config.json settings? For N-Triples files it should be possible to ignore invalid triples, no? For now I was able to work around it by adding the following line in the BlazegraphDataSource loading code, right after the connection cxn is created:

cxn.getParserConfig().addNonFatalError(BasicParserSettings.VERIFY_DATATYPE_VALUES);

This avoids the following exception:

org.eclipse.jetty.servlet.ServletHolder$1: org.linkeddatafragments.exceptions.DataSourceCreationException: org.linkeddatafragments.exceptions.DataSourceCreationException: org.openrdf.rio.RDFParseException: '1992-01-01T00:00:00+02:00' is not a valid value for datatype http://www.w3.org/2001/XMLSchema#gYear [line 1772587]
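For context on the exception above: the literal is written in the xsd:dateTime lexical form, while xsd:gYear only allows a year with an optional timezone. The sketch below uses the JDK's javax.xml.datatype classes, not Sesame's own validator, so it only approximates the check that Rio performs. The class name GYearCheck and its helper method are made up for illustration.

```java
import javax.xml.datatype.DatatypeConfigurationException;
import javax.xml.datatype.DatatypeConstants;
import javax.xml.datatype.DatatypeFactory;
import javax.xml.datatype.XMLGregorianCalendar;

// Hypothetical helper, not part of the TPF server or of Sesame.
final class GYearCheck {

    /**
     * Returns true if the JDK parses the lexical form and classifies it
     * as xsd:gYear (only the year field, optionally a timezone, is set).
     */
    static boolean isGYear(String lexical) {
        try {
            XMLGregorianCalendar cal =
                    DatatypeFactory.newInstance().newXMLGregorianCalendar(lexical);
            return DatatypeConstants.GYEAR.equals(cal.getXMLSchemaType());
        } catch (DatatypeConfigurationException | RuntimeException e) {
            // Unparseable input or a field combination without a schema type.
            return false;
        }
    }

    public static void main(String[] args) {
        // A plain year is a gYear lexical form.
        System.out.println(isGYear("1992"));
        // The value from the exception is a dateTime lexical form.
        System.out.println(isGYear("1992-01-01T00:00:00+02:00"));
    }
}
```

Registering VERIFY_DATATYPE_VALUES as a non-fatal error, as in the line above, should let loading continue past such literals instead of aborting; it does not repair the data itself.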

I got to this point by changing the packaging setting in the pom.xml from 'war' to 'jar' and adding the following plugin configuration below the existing plugins:

<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-shade-plugin</artifactId>
    <version>2.3</version>
    <executions>
      <execution>
        <phase>package</phase>
        <goals>
          <goal>shade</goal>
        </goals>
        <configuration>
          <finalName>${project.artifactId}-${project.version}-SHADED</finalName>
          <transformers>
            <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
              <mainClass>org.linkeddatafragments.standalone.JettyServer</mainClass>
            </transformer>
            <transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
          </transformers>
          <filters>
            <filter>
              <artifact>*:*</artifact>
              <excludes>
                <exclude>META-INF/*.SF</exclude>
                <exclude>META-INF/*.DSA</exclude>
                <exclude>META-INF/*.RSA</exclude>
              </excludes>
            </filter>
          </filters>
        </configuration>
      </execution>
    </executions>
</plugin>

Now the Rio parser classes are included in the SHADED jar.
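A possible explanation for why the ServicesResourceTransformer matters here: as far as I know, Rio looks up its parser factories through META-INF/services descriptor files, and every sesame-rio-* jar ships a descriptor with the same path. The thread does not confirm that this was the cause of the original "No parser factory available" error, so treat the following as a sketch under that assumption. It simulates, with plain collections, what happens when same-named descriptors are overwritten versus merged. The class ServiceDescriptorMerge and the sample jar contents are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical simulation, not the shade plugin's actual implementation.
final class ServiceDescriptorMerge {

    // Descriptor path assumed to be shared by the Rio parser jars.
    static final String SERVICE_FILE =
            "META-INF/services/org.openrdf.rio.RDFParserFactory";

    /** Illustrative jar contents: descriptor path -> provider class names. */
    static List<Map<String, List<String>>> sampleJars() {
        Map<String, List<String>> ntriplesJar = new LinkedHashMap<>();
        ntriplesJar.put(SERVICE_FILE,
                Arrays.asList("org.openrdf.rio.ntriples.NTriplesParserFactory"));
        Map<String, List<String>> turtleJar = new LinkedHashMap<>();
        turtleJar.put(SERVICE_FILE,
                Arrays.asList("org.openrdf.rio.turtle.TurtleParserFactory"));
        List<Map<String, List<String>>> jars = new ArrayList<>();
        jars.add(ntriplesJar);
        jars.add(turtleJar);
        return jars;
    }

    /** Naive fat jar: a later file with the same path replaces the earlier one. */
    static Map<String, Set<String>> overwrite(List<Map<String, List<String>>> jars) {
        Map<String, Set<String>> out = new LinkedHashMap<>();
        for (Map<String, List<String>> jar : jars) {
            for (Map.Entry<String, List<String>> e : jar.entrySet()) {
                out.put(e.getKey(), new LinkedHashSet<>(e.getValue()));
            }
        }
        return out;
    }

    /** Merging: provider lines of all same-named descriptors are combined. */
    static Map<String, Set<String>> merge(List<Map<String, List<String>>> jars) {
        Map<String, Set<String>> out = new LinkedHashMap<>();
        for (Map<String, List<String>> jar : jars) {
            for (Map.Entry<String, List<String>> e : jar.entrySet()) {
                out.computeIfAbsent(e.getKey(), k -> new LinkedHashSet<>())
                   .addAll(e.getValue());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println("overwrite: " + overwrite(sampleJars()).get(SERVICE_FILE));
        System.out.println("merge:     " + merge(sampleJars()).get(SERVICE_FILE));
    }
}
```

With overwriting, only the last jar's provider survives, so the N-Triples factory would be missing from the descriptor even though its class is in the jar. With merging, both factories stay registered.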

hartig commented 8 years ago

I do not think that it should necessarily be possible to ignore syntax issues in N-Triples files.

hartig commented 8 years ago

Regarding the pom.xml changes with which you got it to work, can you please create a PR for them?