JanusGraph / janusgraph

JanusGraph: an open-source, distributed graph database
https://janusgraph.org

Add testing of Bigtable support via Bigtable emulator #415

Open mbrukman opened 7 years ago

mbrukman commented 7 years ago

PR https://github.com/JanusGraph/janusgraph/issues/103 added support for Google Cloud Bigtable via the HBase backend, but the tests are only being run using HBase. We should add tests to run against the Bigtable emulator.

@lindsayismith, I believe you ran the HBase test suite manually with the Bigtable emulator when you made the changes in PR #103; would you mind adding an automated way to run it, so that the tests run on every Travis CI build and we can be sure that everything continues to work properly with the Bigtable backend?

It would also be good to verify that any new changes, as well as new client library versions, continue to work well with JanusGraph + Bigtable, and this would be the way to do it.

burukuru commented 7 years ago

If I were to set up our own tests against JanusGraph/Bigtable for Grakn, would I just need to set the following settings with the Bigtable emulator running on localhost?

storage.backend=hbase
storage.hbase.ext.hbase.client.connection.impl=com.google.cloud.bigtable.hbase1_0.BigtableConnection
storage.hostname=localhost

Anything else I might be missing?

mbrukman commented 7 years ago

@burukuru — you need to run the Bigtable emulator and set the appropriate environment variables, as per the instructions:

> gcloud beta emulators bigtable start
> $(gcloud beta emulators bigtable env-init)

These environment variables will be picked up automatically by the Bigtable client library when it runs in the context of JanusGraph. The client library itself is specified via the storage.hbase.ext.hbase.client.connection.impl parameter.

I don't think you should set the storage.hostname as it should be picked up from the environment variables (it also needs the port).
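
For anyone scripting this, here is a minimal Java sketch of a pre-flight check that the emulator environment is in place before JanusGraph starts. It assumes that env-init exports a variable named BIGTABLE_EMULATOR_HOST holding a host:port value; the class and method names are hypothetical and are not part of JanusGraph or the Bigtable client.

```java
import java.util.Optional;

final class EmulatorEnvCheck {
    // Assumption: this is the variable exported by
    // `gcloud beta emulators bigtable env-init`.
    static final String ENV_VAR = "BIGTABLE_EMULATOR_HOST";

    /** Splits a "host:port" value into {host, port}; empty if it is malformed. */
    static Optional<String[]> parseHostPort(String value) {
        if (value == null || value.trim().isEmpty()) {
            return Optional.empty();
        }
        String trimmed = value.trim();
        int idx = trimmed.lastIndexOf(':');
        // Reject values with no colon, an empty host, or an empty port.
        if (idx <= 0 || idx == trimmed.length() - 1) {
            return Optional.empty();
        }
        String host = trimmed.substring(0, idx);
        String port = trimmed.substring(idx + 1);
        try {
            Integer.parseInt(port);
        } catch (NumberFormatException e) {
            return Optional.empty();
        }
        return Optional.of(new String[] {host, port});
    }

    public static void main(String[] args) {
        Optional<String[]> parsed = parseHostPort(System.getenv(ENV_VAR));
        if (parsed.isPresent()) {
            System.out.println("Emulator expected at " + parsed.get()[0] + ", port " + parsed.get()[1]);
        } else {
            System.out.println(ENV_VAR + " is not set to host:port; run env-init first.");
        }
    }
}
```

Failing fast like this makes a missing env-init step visible immediately instead of surfacing later as a connection error inside the storage backend.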

burukuru commented 7 years ago

Thanks @mbrukman. It's working with those commands and without storage.hostname defined, but it seems to be asking for project.id and instance.id.

Caused by: java.lang.IllegalArgumentException: Project ID must be supplied via google.bigtable.project.id
        at com.google.bigtable.repackaged.com.google.common.base.Preconditions.checkArgument(Preconditions.java:122)
        at com.google.cloud.bigtable.hbase.BigtableOptionsFactory.getValue(BigtableOptionsFactory.java:283)
        at com.google.cloud.bigtable.hbase.BigtableOptionsFactory.fromConfiguration(BigtableOptionsFactory.java:246)
        at org.apache.hadoop.hbase.client.AbstractBigtableConnection.<init>(AbstractBigtableConnection.java:129)
        at com.google.cloud.bigtable.hbase1_0.BigtableConnection.<init>(BigtableConnection.java:56)

I just set some dummy values and Grakn is accessing the Bigtable emulator now.
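
As a rough illustration of what such dummy values might look like, the Java sketch below assembles the settings discussed in this thread into a map. The project ID key comes from the exception message above, prefixed with storage.hbase.ext. in the same way as the connection.impl setting. The instance ID key is an assumption made by analogy, and the class name and the dummy values are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class BigtableEmulatorConfig {
    /** Builds JanusGraph settings for the HBase backend pointed at the Bigtable client. */
    static Map<String, String> build(String projectId, String instanceId) {
        Map<String, String> cfg = new LinkedHashMap<>();
        cfg.put("storage.backend", "hbase");
        cfg.put("storage.hbase.ext.hbase.client.connection.impl",
                "com.google.cloud.bigtable.hbase1_0.BigtableConnection");
        // Key name taken from the IllegalArgumentException message.
        cfg.put("storage.hbase.ext.google.bigtable.project.id", projectId);
        // Assumption: the instance ID key follows the same naming pattern.
        cfg.put("storage.hbase.ext.google.bigtable.instance.id", instanceId);
        // storage.hostname is deliberately left unset; the emulator address
        // is expected to come from the environment variables.
        return cfg;
    }

    public static void main(String[] args) {
        // The emulator should not need real IDs, so placeholders are used here.
        build("dummy-project", "dummy-instance")
                .forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

The printed key=value lines can be pasted into a JanusGraph properties file for a local test run.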

ghost commented 4 years ago

I'm getting the following error with JanusGraph 0.3 Docker and the Bigtable emulator:

4328 [main] INFO  org.janusgraph.diskstorage.util.BackendOperation  - Temporary exception during backend operation [setConfiguration]. Attempting backoff retry.
org.janusgraph.diskstorage.TemporaryBackendException: Temporary failure in storage backend
    at org.janusgraph.diskstorage.hbase.HBaseStoreManager.mutateMany(HBaseStoreManager.java:459)
    at org.janusgraph.diskstorage.hbase.HBaseKeyColumnValueStore.mutateMany(HBaseKeyColumnValueStore.java:209)
    at org.janusgraph.diskstorage.hbase.HBaseKeyColumnValueStore.mutate(HBaseKeyColumnValueStore.java:104)
    at org.janusgraph.diskstorage.configuration.backend.KCVSConfiguration$2.call(KCVSConfiguration.java:152)
    at org.janusgraph.diskstorage.configuration.backend.KCVSConfiguration$2.call(KCVSConfiguration.java:147)
    at org.janusgraph.diskstorage.util.BackendOperation.execute(BackendOperation.java:147)
    at org.janusgraph.diskstorage.util.BackendOperation$1.call(BackendOperation.java:161)
    at org.janusgraph.diskstorage.util.BackendOperation.executeDirect(BackendOperation.java:68)
    at org.janusgraph.diskstorage.util.BackendOperation.execute(BackendOperation.java:54)
    at org.janusgraph.diskstorage.util.BackendOperation.execute(BackendOperation.java:158)
    at org.janusgraph.diskstorage.configuration.backend.KCVSConfiguration.set(KCVSConfiguration.java:147)
    at org.janusgraph.diskstorage.configuration.backend.KCVSConfiguration.set(KCVSConfiguration.java:124)
    at org.janusgraph.diskstorage.configuration.ModifiableConfiguration.set(ModifiableConfiguration.java:43)
    at org.janusgraph.diskstorage.configuration.ModifiableConfiguration.setAll(ModifiableConfiguration.java:50)
    at org.janusgraph.diskstorage.configuration.builder.ReadConfigurationBuilder.buildGlobalConfiguration(ReadConfigurationBuilder.java:72)
    at org.janusgraph.graphdb.configuration.builder.GraphDatabaseConfigurationBuilder.build(GraphDatabaseConfigurationBuilder.java:53)
    at org.janusgraph.core.JanusGraphFactory.open(JanusGraphFactory.java:161)
    at org.janusgraph.core.JanusGraphFactory.open(JanusGraphFactory.java:132)
    at org.janusgraph.core.JanusGraphFactory.open(JanusGraphFactory.java:112)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.tinkerpop.gremlin.structure.util.GraphFactory.open(GraphFactory.java:78)
    at org.apache.tinkerpop.gremlin.structure.util.GraphFactory.open(GraphFactory.java:70)
    at org.apache.tinkerpop.gremlin.structure.util.GraphFactory.open(GraphFactory.java:104)
    at org.apache.tinkerpop.gremlin.server.util.DefaultGraphManager.lambda$new$0(DefaultGraphManager.java:57)
    at java.util.LinkedHashMap$LinkedEntrySet.forEach(LinkedHashMap.java:671)
    at org.apache.tinkerpop.gremlin.server.util.DefaultGraphManager.<init>(DefaultGraphManager.java:55)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at org.apache.tinkerpop.gremlin.server.util.ServerGremlinExecutor.<init>(ServerGremlinExecutor.java:80)
    at org.apache.tinkerpop.gremlin.server.GremlinServer.<init>(GremlinServer.java:120)
    at org.apache.tinkerpop.gremlin.server.GremlinServer.<init>(GremlinServer.java:84)
    at org.apache.tinkerpop.gremlin.server.GremlinServer.main(GremlinServer.java:343)
Caused by: org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 1 action: StatusRuntimeException: 1 time,
    at com.google.cloud.bigtable.hbase.BatchExecutor.batchCallback(BatchExecutor.java:315)
    at com.google.cloud.bigtable.hbase.BatchExecutor.batch(BatchExecutor.java:240)
    at com.google.cloud.bigtable.hbase.AbstractBigtableTable.batch(AbstractBigtableTable.java:187)
    at org.janusgraph.diskstorage.hbase.HTable1_0.batch(HTable1_0.java:51)
    at org.janusgraph.diskstorage.hbase.HBaseStoreManager.mutateMany(HBaseStoreManager.java:454)
    ... 36 more

siddharthTyagi commented 3 years ago

Is there a chance of this being a part of any future milestone?

FlorianHockmann commented 1 year ago

The problem here is simply that we lack contributors who have knowledge of Bigtable. Anyone willing to contribute is more than welcome to tackle this issue. It would be great to have automatic testing in place for Bigtable. That would not only give us more confidence that changes we make to the HBase storage adapter do not accidentally break the Bigtable storage adapter (which uses the same code), but it would also allow us to update to newer client library versions of Bigtable.

tfcace commented 9 months ago

@FlorianHockmann I'm willing to give this a shot. I'm not a Java developer, so I would need some help with the ecosystem and with understanding how to set this up. However, internally we have a service that communicates with JanusGraph over gremlinpython, and our test suite runs against the Bigtable emulator.

li-boxuan commented 9 months ago

@tfcace That sounds awesome! If you run into any problems, feel free to join the JanusGraph Discord channel and we can discuss them there. HbaseGraphTest.java is probably a good place to start. Since HBase has a test container available, running the HBase integration tests is fairly straightforward: just start an HBaseContainer and override the getConfiguration() method to provide HBase's endpoints. There seems to be a Bigtable emulator testcontainer available (https://github.com/datalbry/testcontainer-bigtable), but I am not sure whether it would work out of the box.