thinkaurelius / titan

Distributed Graph Database
http://titandb.io
Apache License 2.0

Investigate some bulk load issues on TP3 + titan09 snapshots #849

Closed dalaro closed 9 years ago

dalaro commented 10 years ago

Concurrent with #848, I tested a Cassandra bulk load of grateful-dead-vertices.gio into Titan using BulkLoaderVertexProgram on 37b3a45a2fe440c1aa2c845ae1f0fb49c5ce0ef8 and tinkerpop/tinkerpop3@7a79fc1b531d2871dffcaf241de712bbfb116efe (still manually defining schema like we did in previous tests). It mostly worked. After the load, I connected with TitanFactory.open and the vertex and edge counts matched expectation.

I hedged with "mostly" because of three complications:

  1. I had to manually copy slf4j jars to get the ZK service forked by Giraph to stop dying immediately due to ClassNotFoundException on Logger. I had HADOOP_GREMLIN_LIBS=ext/titan-core:ext/hadoop-gremlin. slf4j is in neither by default, but it is in lib. Maybe I need to add lib? Not sure yet.
  2. After the Giraph job saved vertices, the process oddly went into 200% CPU (i.e. 2 cores fully utilized) and sat there for about 10 minutes. I attached a JVMTI profiler and the VM promptly crashed.
  3. (TitanGraph).V().count() and the same for edges returned the appropriate values, but (TitanGraph).V().map() threw NPE. Same for edges.

I'm not sure which of these -- if any -- warrant TP3 issues yet (they could just be configuration problems). I'm putting them here as a reminder to gather more info.
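For anyone wanting to try the idea floated in item 1, a minimal and untested sketch follows. It assumes `HADOOP_GREMLIN_LIBS` is a colon-separated list of directories relative to the install root, as in the value quoted above, and that the slf4j jars live in `lib`. Whether this actually stops the forked ZK service from dying was not verified in this thread.

```shell
# Untested sketch: append lib (where slf4j reportedly lives) to the
# directories already listed in item 1. The first two entries come from
# the report above; adding lib is only the guess floated there.
export HADOOP_GREMLIN_LIBS=ext/titan-core:ext/hadoop-gremlin:lib
echo "HADOOP_GREMLIN_LIBS=$HADOOP_GREMLIN_LIBS"
```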

okram commented 10 years ago

Dan -- (TitanGraph).V().map() is bad Gremlin3. That will definitely throw an NPE. You really want g.V().valueMap(). Map is a function from S->E, and if you don't provide a closure, then it's a null pointer.

dalaro commented 10 years ago

Oops. Thanks for pointing that out! Old habits die hard...

dalaro commented 9 years ago

These items are covered or moot:

  1. Still a pain when Titan is installed as a plugin on a preexisting TP3 install, but resolved within Titan's all-inclusive distribution zipfile.
  2. This turned out to be an insane classpath ordering problem. In the version of TP3's gremlin.sh on which I reported this item, gremlin.sh used filesystem dirent ordering for classpath elements. If titan-all appeared before Hadoop artifacts, this would happen. I hacked around this by forcing alpha order. The underlying potential for classpath conflicts remains, but is also partially mitigated within Titan's own distribution, where we can use Maven's dependency convergence rule (this is not practical with the plugin model).
  3. This was my own operator error, as Marko pointed out.
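The "force alpha order" workaround from item 2 can be sketched roughly as below. This is not the actual gremlin.sh change. The function name `build_classpath` and the jar names in the usage note are hypothetical, and it assumes all jars sit directly in one directory. The point is only that `find` returns entries in directory order, so an explicit sort is needed to make the classpath reproducible.

```shell
# Hypothetical helper, not taken from gremlin.sh: build a colon-separated
# classpath from the jars in one directory, in a fixed byte-wise order.
build_classpath() {
    lib_dir="$1"
    # find emits entries in filesystem (dirent) order, which varies between
    # machines; LC_ALL=C sort pins the order, and paste joins the sorted
    # paths with ':'.
    find "$lib_dir" -maxdepth 1 -name '*.jar' | LC_ALL=C sort | paste -s -d: -
}
```

With a directory holding, say, `titan-all.jar` and `hadoop-core.jar`, `build_classpath` always lists `hadoop-core.jar` first, whatever order the filesystem returns them in. As noted in item 2, this only makes the ordering deterministic; it does not remove the underlying potential for conflicting classes.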