blazegraph / database

Blazegraph High Performance Graph Database
GNU General Public License v2.0

Blazegraph 2.1.4 debian package installed on Ubuntu 16.04 - fails to restart #52

Open dazza-codes opened 7 years ago

dazza-codes commented 7 years ago

Followed the instructions in the blazegraph-deb section to build and install 2.1.4 and it worked at first. But after trying to load a lot of data the system froze. On restart, blazegraph will not restart using sudo service blazegraph restart (although a following status indicates it is running OK). The log shows the failure to start is a com.bigdata.util.ChecksumError -- is this a bug or a feature?

INFO: com.bigdata.util.config.LogUtil: Configure and watch: /etc/blazegraph/log4j.properties

BlazeGraph(TM) Graph Engine

                   Flexible
                   Reliable
                  Affordable
      Web-Scale Computing for the Enterprise

Copyright SYSTAP, LLC DBA Blazegraph 2006-2016.  All rights reserved.

sul-dlweber-ubuntu
Fri Feb 17 20:59:16 PST 2017
Linux/4.4.0-62-generic amd64
Intel(R) Core(TM) i7-3720QM CPU @ 2.60GHz Family 6 Model 58 Stepping 9, GenuineIntel #CPU=2
Oracle Corporation 1.7.0_80
freeMemory=123735856
buildVersion=2.1.4
gitCommit=738d05f08cffd319233a4bfbb0ec2a858e260f9c

Dependency         License                                                                 
ICU                http://source.icu-project.org/repos/icu/icu/trunk/license.html          
bigdata-ganglia    http://www.apache.org/licenses/LICENSE-2.0.html                         
blueprints-core    https://github.com/tinkerpop/blueprints/blob/master/LICENSE.txt         
colt               http://acs.lbl.gov/software/colt/license.html                           
commons-codec      http://www.apache.org/licenses/LICENSE-2.0.html                         
commons-fileupload http://www.apache.org/licenses/LICENSE-2.0.html                         
commons-io         http://www.apache.org/licenses/LICENSE-2.0.html                         
commons-logging    http://www.apache.org/licenses/LICENSE-2.0.html                         
dsiutils           http://www.gnu.org/licenses/lgpl-2.1.html                               
fastutil           http://www.apache.org/licenses/LICENSE-2.0.html                         
flot               http://www.opensource.org/licenses/mit-license.php                      
high-scale-lib     http://creativecommons.org/licenses/publicdomain                        
httpclient         http://www.apache.org/licenses/LICENSE-2.0.html                         
httpclient-cache   http://www.apache.org/licenses/LICENSE-2.0.html                         
httpcore           http://www.apache.org/licenses/LICENSE-2.0.html                         
httpmime           http://www.apache.org/licenses/LICENSE-2.0.html                         
jackson-core       http://www.apache.org/licenses/LICENSE-2.0.html                         
jetty              http://www.apache.org/licenses/LICENSE-2.0.html                         
jquery             https://github.com/jquery/jquery/blob/master/MIT-LICENSE.txt            
jsonld             https://raw.githubusercontent.com/jsonld-java/jsonld-java/master/LICENCE
log4j              http://www.apache.org/licenses/LICENSE-2.0.html                         
lucene             http://www.apache.org/licenses/LICENSE-2.0.html                         
nanohttp           http://elonen.iki.fi/code/nanohttpd/#license                            
rexster-core       https://github.com/tinkerpop/rexster/blob/master/LICENSE.txt            
river              http://www.apache.org/licenses/LICENSE-2.0.html                         
semargl            https://github.com/levkhomich/semargl/blob/master/LICENSE               
servlet-api        http://www.apache.org/licenses/LICENSE-2.0.html                         
sesame             http://www.openrdf.org/download.jsp                                     
slf4j              http://www.slf4j.org/license.html                                       
zookeeper          http://www.apache.org/licenses/LICENSE-2.0.html                         

WARN : NanoSparqlServer.java:517: Starting NSS
WARN : WebAppContext.java:506: Failed startup of context o.e.j.w.WebAppContext@2db238ce{/blazegraph,file:/usr/share/blazegraph-2.1.4/war/,STARTING}{/usr/share/blazegraph/war/}
java.lang.RuntimeException: java.lang.RuntimeException: addr=-374049 : cause=com.bigdata.util.ChecksumError: offset=225590272,nbytes=426,expected=0,actual=27656193
    at com.bigdata.rdf.sail.webapp.BigdataRDFServletContextListener.openIndexManager(BigdataRDFServletContextListener.java:805)
    at com.bigdata.rdf.sail.webapp.BigdataRDFServletContextListener.contextInitialized(BigdataRDFServletContextListener.java:277)
    at org.eclipse.jetty.server.handler.ContextHandler.callContextInitialized(ContextHandler.java:798)
    at org.eclipse.jetty.servlet.ServletContextHandler.callContextInitialized(ServletContextHandler.java:444)
    at org.eclipse.jetty.server.handler.ContextHandler.startContext(ContextHandler.java:789)
    at org.eclipse.jetty.servlet.ServletContextHandler.startContext(ServletContextHandler.java:294)
    at org.eclipse.jetty.webapp.WebAppContext.startWebapp(WebAppContext.java:1341)
    at org.eclipse.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1334)
    at org.eclipse.jetty.server.handler.ContextHandler.doStart(ContextHandler.java:741)
    at org.eclipse.jetty.webapp.WebAppContext.doStart(WebAppContext.java:497)
    at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68)
    at org.eclipse.jetty.util.component.ContainerLifeCycle.start(ContainerLifeCycle.java:132)
    at org.eclipse.jetty.util.component.ContainerLifeCycle.doStart(ContainerLifeCycle.java:114)
    at org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:61)
    at org.eclipse.jetty.server.handler.ContextHandlerCollection.doStart(ContextHandlerCollection.java:163)
    at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68)
    at org.eclipse.jetty.util.component.ContainerLifeCycle.start(ContainerLifeCycle.java:132)
    at org.eclipse.jetty.util.component.ContainerLifeCycle.doStart(ContainerLifeCycle.java:114)
    at org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:61)
    at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68)
    at org.eclipse.jetty.util.component.ContainerLifeCycle.start(ContainerLifeCycle.java:132)
    at org.eclipse.jetty.server.Server.start(Server.java:387)
    at org.eclipse.jetty.util.component.ContainerLifeCycle.doStart(ContainerLifeCycle.java:114)
    at org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:61)
    at org.eclipse.jetty.server.Server.doStart(Server.java:354)
    at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68)
    at com.bigdata.rdf.sail.webapp.NanoSparqlServer.awaitServerStart(NanoSparqlServer.java:518)
    at com.bigdata.rdf.sail.webapp.NanoSparqlServer.main(NanoSparqlServer.java:482)
Caused by: java.lang.RuntimeException: addr=-374049 : cause=com.bigdata.util.ChecksumError: offset=225590272,nbytes=426,expected=0,actual=27656193
    at com.bigdata.rwstore.RWStore.getData(RWStore.java:2097)
    at com.bigdata.journal.RWStrategy.readFromLocalStore(RWStrategy.java:732)
    at com.bigdata.journal.RWStrategy.read(RWStrategy.java:155)
    at com.bigdata.journal.AbstractJournal._getCommitRecord(AbstractJournal.java:4601)
    at com.bigdata.journal.AbstractJournal.<init>(AbstractJournal.java:1328)
    at com.bigdata.journal.Journal.<init>(Journal.java:276)
    at com.bigdata.journal.Journal.<init>(Journal.java:269)
    at com.bigdata.rdf.sail.webapp.BigdataRDFServletContextListener.openIndexManager(BigdataRDFServletContextListener.java:799)
    ... 27 more
Caused by: com.bigdata.util.ChecksumError: offset=225590272,nbytes=426,expected=0,actual=27656193
    at com.bigdata.io.writecache.WriteCacheService._readFromLocalDiskIntoNewHeapByteBuffer(WriteCacheService.java:3783)
    at com.bigdata.io.writecache.WriteCacheService._getRecord(WriteCacheService.java:3598)
    at com.bigdata.io.writecache.WriteCacheService.access$2500(WriteCacheService.java:200)
    at com.bigdata.io.writecache.WriteCacheService$1.compute(WriteCacheService.java:3435)
    at com.bigdata.io.writecache.WriteCacheService$1.compute(WriteCacheService.java:3419)
    at com.bigdata.util.concurrent.Memoizer$1.call(Memoizer.java:77)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at com.bigdata.util.concurrent.Memoizer.compute(Memoizer.java:92)
    at com.bigdata.io.writecache.WriteCacheService.loadRecord(WriteCacheService.java:3540)
    at com.bigdata.io.writecache.WriteCacheService.read(WriteCacheService.java:3259)
    at com.bigdata.rwstore.RWStore.getData(RWStore.java:2088)
    ... 34 more
serviceURL: http://172.17.0.1:9999

thompsonbry commented 7 years ago

Checksum errors indicate bad data. All writes have checksums. It is trying to read the current checkpoint record.

You might be able to open the store with the alternate rootblock. See com.bigdata.journal.Options for how to do this.
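
A minimal sketch of the alternate rootblock suggestion, for illustration only. The option is described in the com.bigdata.journal.Options Javadoc; the property key used below (com.bigdata.journal.AbstractJournal.alternateRootBlock) is an assumption based on that Javadoc and should be confirmed there before use. The helper name and the example paths are hypothetical. It writes a modified copy so the packaged properties file is left untouched.

```shell
#!/bin/sh
# Hypothetical helper: write a copy of a Blazegraph properties file with the
# alternate root block option switched on. The property key is an assumption
# taken from the com.bigdata.journal.Options Javadoc; confirm it there.
with_alternate_root_block() {
    src="$1"    # e.g. /etc/blazegraph/RWStore.properties
    dst="$2"    # where to write the modified copy
    key='com.bigdata.journal.AbstractJournal.alternateRootBlock'
    # Drop any existing setting for the key, then append the new one.
    grep -v "^${key}=" "$src" > "$dst"
    printf '%s=true\n' "$key" >> "$dst"
}
```

If the option behaves as the Javadoc appears to describe, the store opens at the previous commit point, and a later commit would overwrite the newer root block. That makes it better suited to getting data out of a damaged journal than to resuming normal operation.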

However, this very likely indicates a physical problem with the disk leading to a bad read back of the written data. This has been the root cause in previous reports of a checksum error. And that is precisely what the checksum is designed to detect.

Thanks, Bryan

dazza-codes commented 7 years ago

[Updated ... ]

Thanks, Bryan.

This installation is running on Ubuntu 16.04 Linux as a VirtualBox guest on a MacBook Pro (circa 2013).

Apologies for a couple of noob questions:

$ cat /etc/default/blazegraph
NAME=blazegraph
BLZG_HOME=/usr/share/${NAME}
BLZG_CONF=/etc/blazegraph
BLZG_LOG=/var/log/${NAME}
BLZG_DATA=/var/lib/${NAME}
JOURNAL_FILE=blazegraph.jnl
JOURNAL="${BLZG_DATA}"/"${JOURNAL_FILE}"
# Run Blazegraph as this user ID and group ID
BLZG_USER=blzg
BLZG_GROUP=blzg
JETTY_XML="${BLZG_CONF}"/jetty.xml
JETTY_RESOURCE_BASE="${BLZG_HOME}"/war/
JETTY_PORT=9999
LOGGING_CONFIG="${BLZG_CONF}"/logging.properties
LOG4J_CONFIG="${BLZG_CONF}"/log4j.properties
NSS="com.bigdata.rdf.sail.webapp.NanoSparqlServer"
NSS_NAMESPACE="kb"
NSS_PROPERTIES="${BLZG_CONF}"/RWStore.properties
JVM_OPTS="-Djava.awt.headless=true -server -Xmx8g -XX:MaxDirectMemorySize=3000m -XX:+UseG1GC"
#Used for testing on EC2 micro instances
#JVM_OPTS="-Djava.awt.headless=true -server -Xmx256m -XX:MaxDirectMemorySize=100m -XX:+UseG1GC"

Thanks, Darren

PS: As a blazegraph noob, I started reading the wiki site, but soon came to the conclusion that a lot of that information is difficult because it is terse or assumes too much background knowledge, or it is out of date; often last updated in 2015.

beebs-systap commented 7 years ago

Darren,

The configuration will be in /etc/blazegraph/. The data by default is in /var/lib/blazegraph/, though this may be configured by editing the /etc/default/blazegraph file.

See also https://github.com/blazegraph/database/blob/master/blazegraph-deb/README.md.

Can you post some more details on how you were loading the files? You may want to consider trying the REST Bulk Load.

In general, the Wiki is a living document and updated as new features are added, etc.

thompsonbry commented 7 years ago

https://blazegraph.github.io/database/apidocs/com/bigdata/journal/Options.html

On Sat, Feb 18, 2017 at 9:17 AM, Darren L. Weber, Ph.D. wrote:

Apologies for a couple of noob questions:

  • Where do I find current information about com.bigdata.journal.Options?

    • A quick search of this repo using github search didn't pinpoint it.
    • ack-grep 'journal.Options' /usr/share/blazegraph-2.1.4/ -- nothing.
  • Where is the KB store on the file system (Ubuntu linux) and how is that configured?

dazza-codes commented 7 years ago

I used dpkg -L blazegraph to find more installation details, including the /usr/bin/loadRestAPI.sh script. I read about that on the wiki and was able to get it working on a previous installation, but could not find it on the debian installation (until just now). (BTW, also just found the example deployment code in src/resources/deployment and that is interesting, although we want puppet recipes.) When I tried to use a similar REST API script, I did not entirely know where to find the configs and property files but seemed to find relevant files, but it was failing to load data, so I switched to a SPARQL Update approach that seemed to be working OK for a while (it's not entirely a surprise that SPARQL Update may not be an optimal way to load about 32,000 small RDF files without disabling some features).

thompsonbry commented 7 years ago

Look at https://wiki.blazegraph.com/wiki/index.php/REST_API#Bulk_Data_Load for loading many small files (you can point it at a directory).
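
A rough sketch of what a bulk load request could look like, assuming the loader accepts a small properties file as the request body. The keys below follow the packaged loadRestAPI.sh script as far as I recall it, so treat every key, the endpoint path, and the helper name as assumptions to check against the wiki page above. The helper only writes the request body; the curl call is left as a comment.

```shell
#!/bin/sh
# Hypothetical helper: write a request body for the REST bulk loader.
# The keys are assumptions modelled on the packaged loadRestAPI.sh script;
# compare them with the wiki page before relying on them.
write_dataloader_request() {
    data_dir="$1"      # file or directory of RDF files to load
    namespace="$2"     # target namespace, e.g. kb
    props_file="$3"    # e.g. /etc/blazegraph/RWStore.properties
    out="$4"           # where to write the request body
    cat > "$out" <<EOF
quiet=false
verbose=0
closure=false
durableQueues=true
namespace=${namespace}
fileOrDirs=${data_dir}
propertyFile=${props_file}
EOF
}

# Usage (not run here; the endpoint path is an assumption):
# write_dataloader_request /data/rdf kb /etc/blazegraph/RWStore.properties /tmp/load.properties
# curl -X POST --data-binary @/tmp/load.properties \
#      --header 'Content-Type: text/plain' \
#      http://localhost:9999/blazegraph/dataloader
```

Pointing fileOrDirs at a directory is what makes this attractive for many small files: a single request replaces thousands of individual SPARQL Update calls.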

Bryan

dazza-codes commented 7 years ago

I may now understand these values from the /etc/default/blazegraph configs:

NAME=blazegraph
BLZG_DATA=/var/lib/${NAME}
JOURNAL_FILE=blazegraph.jnl
JOURNAL="${BLZG_DATA}"/"${JOURNAL_FILE}"

They seem to result in this file:

$ ls -l /var/lib/blazegraph/
total 212M
-rw-r--r-- 1 blzg blzg 295M Feb 17 16:59 blazegraph.jnl

(BTW, coming from prior experience with 4store, I'm surprised that all the KBs are in one big journal file. But I guess that's how it is with Blazegraph. When I create a new namespace, it still goes into this same journal file.) Now I want to experiment with moving that (corrupt?) journal file and restarting blazegraph, in the hope that it will recreate a new one when it starts up. I don't care about trashing any data that I've already loaded, but I expect this drastic move will prompt blazegraph to 'reset' and I'll lose namespaces etc.

beebs-systap commented 7 years ago

In that case, try stopping the service, deleting the blazegraph.jnl, and then restarting. It will create a new journal with your specified options when the service starts.

dazza-codes commented 7 years ago

WOW, so many options in https://blazegraph.github.io/database/apidocs/com/bigdata/journal/Options.html -- a bit over my head when it comes to my knowledge of file system optimizations. Are there any experiences or recommendations for these options when running blazegraph on a Linux VirtualBox guest on a Mac OS X host? (The VM has 8 GB of RAM and I might need to bump that up, because I see the configs set the Java heap at about 8g by default.)

dazza-codes commented 7 years ago

BTW, and this comment is specific to this initial startup problem, the service blazegraph status was entirely ignorant of the checksum failure - despite the failure to start, that service status reported that blazegraph was up and running.
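
Since the status command cannot be trusted here, one workaround is to inspect the log directly. The sketch below is only an illustration: the function name is hypothetical, and the two patterns are simply the messages that appear in the log excerpt at the top of this thread. It looks at the last 200 lines so that errors from earlier restarts are less likely to trigger it.

```shell
#!/bin/sh
# Hypothetical check: succeed only if the tail of the log shows no sign of
# the startup failure reported in this thread.
log_shows_clean_start() {
    log="$1"    # e.g. /var/log/blazegraph/blazegraph.log
    if tail -n 200 "$log" | grep -q -e 'ChecksumError' -e 'Failed startup of context'; then
        return 1
    fi
    return 0
}

# Usage:
# log_shows_clean_start /var/log/blazegraph/blazegraph.log || echo "startup failed"
```

A 200-line window is arbitrary; a noisy log may need a larger one, and a log that is never rotated may still contain old errors within the window.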

dazza-codes commented 7 years ago

FYI, the following got the service back up:

sudo -i
ls -lh /var/lib/blazegraph/
service blazegraph status  # check it is stopped or use `stop`
mv /var/lib/blazegraph/blazegraph.jnl  /var/lib/blazegraph/blazegraph.bak
ls -lh /var/lib/blazegraph/
service blazegraph start
service blazegraph status # it is up, but don't trust this, check the log
ls -lh /var/lib/blazegraph/  # it recreated the journal file
tail -n200 /var/log/blazegraph/blazegraph.log # log looks OK, no check sum errors

beebs-systap commented 7 years ago

Great. You likely want to make sure that you allow plenty of memory for the VM to cache the disk access. If you have 8GB in total, try running with 2G for the JVM heap. If you're going to do any work at scale, you'll likely want to increase the RAM for your VM and run the Blazegraph process with 4G or 8G. It's possible that your first load failed due to an OutOfMemoryError (OOME): the JVM may have used all of the VM's memory, causing the corruption.
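
To make the sizing suggestion concrete, here is a small sketch that derives a heap flag from the VM's total RAM. The one-quarter ratio is only an extrapolation from the "8GB total, 2G heap" example above, not a Blazegraph rule, and the function name is hypothetical. The resulting flag would replace the -Xmx8g value in JVM_OPTS in /etc/default/blazegraph.

```shell
#!/bin/sh
# Hypothetical helper: print a -Xmx flag set to a quarter of the given RAM
# (in MB). The ratio is an assumption extrapolated from the 8GB/2G example.
suggest_heap_flag() {
    total_mb="$1"
    heap_mb=$((total_mb / 4))
    printf -- '-Xmx%sm\n' "$heap_mb"
}

suggest_heap_flag 8192    # prints -Xmx2048m
```

Leaving the remaining memory to the guest OS is the point of the exercise: the journal is read through the page cache, so a heap that consumes all of the VM's RAM leaves nothing for disk caching.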

thompsonbry commented 7 years ago

I would check your disk for errors. There are probably bad sectors if you got a checksum error. Bryan

dazza-codes commented 7 years ago

Tried to change the disk location for the KB store by setting

# /etc/default/blazegraph
NAME=blazegraph  # default
BLZG_DATA=/data/${NAME}  # changed only this
JOURNAL_FILE=blazegraph.jnl  # default
JOURNAL="${BLZG_DATA}"/"${JOURNAL_FILE}"  # default

This process looked like this

sudo -i
service blazegraph stop
# edit /etc/default/blazegraph as above
mkdir /data/blazegraph
chown blzg:blzg /data/blazegraph
service blazegraph start
ls -l /data/blazegraph/  # huh?  no journal file, what's up?  The logs look OK, geez.  Clueless.
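
One thing that may be worth checking here, offered as a guess rather than a diagnosis: the journal path might also be set in the properties file that NSS_PROPERTIES points to, in which case changing BLZG_DATA alone would not move it. The sketch below assumes the path is configured with the com.bigdata.journal.AbstractJournal.file key; the function name is hypothetical.

```shell
#!/bin/sh
# Hypothetical check: print the journal path configured in a properties file.
# Assumes the path is set with the com.bigdata.journal.AbstractJournal.file
# key; if the key is absent, nothing is printed.
configured_journal() {
    props="$1"    # e.g. /etc/blazegraph/RWStore.properties
    sed -n 's/^com\.bigdata\.journal\.AbstractJournal\.file=//p' "$props"
}

# Usage:
# configured_journal /etc/blazegraph/RWStore.properties
```

If this prints a path under /var/lib/blazegraph, that would explain why no journal appeared in /data/blazegraph.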

dazza-codes commented 7 years ago

Going to try using touch /forcefsck and reboot the system to find/fix corruption on this virtualbox vm

dazza-codes commented 7 years ago

Got the system running again, so we can close this issue. If you want, you might create a separate issue so that service blazegraph status reports a failure when a checksum error prevents startup. (The log only WARNs, but this seems like a failure to start the service.)

khteh commented 5 years ago

sudo systemctl enable blazegraph
sudo systemctl start blazegraph