yugabyte / yugabyte-db

YugabyteDB - the cloud native distributed SQL database for mission-critical applications.
https://www.yugabyte.com

JanusGraph to Support Json objects on YugabyteDB #8036

Closed vpratheek007 closed 1 year ago

vpratheek007 commented 3 years ago

Jira Link: DB-1626

Hi Team,

I am using JanusGraph to interact with YCQL, but our database contains JSON (jsonb) columns, and these do not seem to be supported.

gremlin> graph = JanusGraphFactory.open('conf/janusgraph-cql.properties')
11:51:02 ERROR com.datastax.driver.core.SchemaParser - Error parsing schema for table yugapoc.jsonemployee: Cluster.getMetadata().getKeyspace("yugapoc").getTable("jsonemployee") will be missing or incomplete
com.datastax.driver.core.exceptions.UnresolvedUserTypeException: Cannot resolve user type yugapoc.jsonb

Regards,
Pratheek

Here is a summary of the solution to this issue:

Currently there are 3 jars which need to be deployed manually:

-rw-r--r--. 1 tedyu ybdev 1180221 Apr 19 17:53 cassandra-driver-core-3.8.0-yb-7.jar
-rw-r--r--. 1 tedyu ybdev 127772 Apr 19 17:52 janusgraph-cassandra-0.5.4-SNAPSHOT.jar
-rw-r--r--. 1 tedyu ybdev 52531 Apr 19 17:51 janusgraph-cql-0.5.4-SNAPSHOT.jar

janusgraph-cassandra-0.5.4-SNAPSHOT.jar replaces janusgraph-cassandra-0.5.3.jar
janusgraph-cql-0.5.4-SNAPSHOT.jar replaces janusgraph-cql-0.5.3.jar

tedyu commented 3 years ago

In 4.6.0-yb-x branch of cassandra-java-driver, I don't seem to find UnresolvedUserTypeException

It seems you're using 3.8.0-yb-x release.

Can you attach janusgraph-cql.properties (and describe schema involving the jsonb column) ?

Thanks

vpratheek007 commented 3 years ago

Please find the requested details

cat janusgraph-cql.properties
# Copyright 2019 JanusGraph Authors
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#      http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# JanusGraph configuration sample: Cassandra over a socket
#
# This file connects to a Cassandra daemon running on localhost via
# Thrift.  Cassandra must already be started before starting JanusGraph
# with this file.

# The implementation of graph factory that will be used by gremlin server
#
# Default:    org.janusgraph.core.JanusGraphFactory
# Data Type:  String
# Mutability: LOCAL
gremlin.graph=org.janusgraph.core.JanusGraphFactory

# The primary persistence provider used by JanusGraph.  This is required.
# It should be set one of JanusGraph's built-in shorthand names for its
# standard storage backends (shorthands: berkeleyje, cassandrathrift,
# cassandra, astyanax, embeddedcassandra, cql, hbase, inmemory) or to the
# full package and classname of a custom/third-party StoreManager
# implementation.
#
# Default:    (no default value)
# Data Type:  String
# Mutability: LOCAL
storage.backend=cql

# The hostname or comma-separated list of hostnames of storage backend
# servers.  This is only applicable to some storage backends, such as
# cassandra and hbase.
#
# Default:    127.0.0.1
# Data Type:  class java.lang.String[]
# Mutability: LOCAL
storage.hostname=10.182.185.42

# The name of JanusGraph's keyspace.  It will be created if it does not
# exist.
#
# Default:    janusgraph
# Data Type:  String
# Mutability: LOCAL
storage.cql.keyspace=janusgraph

# Whether to enable JanusGraph's database-level cache, which is shared
# across all transactions. Enabling this option speeds up traversals by
# holding hot graph elements in memory, but also increases the likelihood
# of reading stale data.  Disabling it forces each transaction to
# independently fetch graph elements from storage before reading/writing
# them.
#
# Default:    false
# Data Type:  Boolean
# Mutability: MASKABLE
cache.db-cache = true

# How long, in milliseconds, database-level cache will keep entries after
# flushing them.  This option is only useful on distributed storage
# backends that are capable of acknowledging writes without necessarily
# making them immediately visible.
#
# Default:    50
# Data Type:  Integer
# Mutability: GLOBAL_OFFLINE
#
# Settings with mutability GLOBAL_OFFLINE are centrally managed in
# JanusGraph's storage backend.  After starting the database for the first
# time, this file's copy of this setting is ignored.  Use JanusGraph's
# Management System to read or modify this value after bootstrapping.
cache.db-cache-clean-wait = 20

# Default expiration time, in milliseconds, for entries in the
# database-level cache. Entries are evicted when they reach this age even
# if the cache has room to spare. Set to 0 to disable expiration (cache
# entries live forever or until memory pressure triggers eviction when set
# to 0).
#
# Default:    10000
# Data Type:  Long
# Mutability: GLOBAL_OFFLINE
#
# Settings with mutability GLOBAL_OFFLINE are centrally managed in
# JanusGraph's storage backend.  After starting the database for the first
# time, this file's copy of this setting is ignored.  Use JanusGraph's
# Management System to read or modify this value after bootstrapping.
cache.db-cache-time = 180000

# Size of JanusGraph's database level cache.  Values between 0 and 1 are
# interpreted as a percentage of VM heap, while larger values are
# interpreted as an absolute size in bytes.
#
# Default:    0.3
# Data Type:  Double
# Mutability: MASKABLE
cache.db-cache-size = 0.5
[root@pratyuga13 conf]#

JSON tables:

ycqlsh:yugapoc> desc streamlinerrequest

CREATE TABLE yugapoc.streamlinerrequest (
    id int PRIMARY KEY,
    streamlinerdetails jsonb
) WITH default_time_to_live = 0
    AND transactions = {'enabled': 'false'};

ycqlsh:yugapoc> desc jsonemployee

CREATE TABLE yugapoc.jsonemployee (
    empid int PRIMARY KEY,
    empdetails jsonb
) WITH default_time_to_live = 0
    AND transactions = {'enabled': 'false'};

Keyspace:

ycqlsh> desc yugapoc

CREATE KEYSPACE yugapoc WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '3'}  AND durable_writes = true;

CREATE TABLE yugapoc.testproc (
    orderid int PRIMARY KEY,
    dpsnum text,
    ordernum int,
    customercode text,
    customername text,
    adress text,
    description text,
    quantity text,
    purchaseitem text,
    totalcost text,
    invoicenumber text
) WITH default_time_to_live = 0
    AND transactions = {'enabled': 'false'};

CREATE TABLE yugapoc.orderdetails (
    orderid int PRIMARY KEY,
    dpsnum text,
    ordernum int,
    customercode text,
    customername text,
    adress text,
    description text,
    quantity text,
    purchaseitem text,
    totalcost text,
    invoicenumber text
) WITH default_time_to_live = 0
    AND transactions = {'enabled': 'false'};

CREATE TABLE yugapoc.returnitemdetails (
    ordernum text PRIMARY KEY,
    servicetagnum text,
    reason text
) WITH default_time_to_live = 0
    AND transactions = {'enabled': 'false'};

CREATE TABLE yugapoc.streamlinerrequest (
    id int PRIMARY KEY,
    streamlinerdetails jsonb
) WITH default_time_to_live = 0
    AND transactions = {'enabled': 'false'};

CREATE TABLE yugapoc.orderdetailsrequest (
    id int PRIMARY KEY,
    orderdetails jsonb
) WITH default_time_to_live = 0
    AND transactions = {'enabled': 'false'};

CREATE TABLE yugapoc.creorderhistory (
    id text PRIMARY KEY,
    buid text,
    country text,
    currency text,
    ordernumber text,
    reasoncode text,
    reasondescription text,
    refundflag text
) WITH default_time_to_live = 0
    AND transactions = {'enabled': 'false'};

CREATE TABLE yugapoc.accounts (
    account_name text,
    account_type text,
    balance double,
    PRIMARY KEY (account_name, account_type)
) WITH CLUSTERING ORDER BY (account_type ASC)
    AND default_time_to_live = 0
    AND transactions = {'enabled': 'true'};

CREATE TABLE yugapoc.jsonemployee (
    empid int PRIMARY KEY,
    empdetails jsonb
) WITH default_time_to_live = 0
    AND transactions = {'enabled': 'false'};

CREATE TABLE yugapoc.employee (
    id int PRIMARY KEY,
    name text,
    age int,
    language text
) WITH default_time_to_live = 0
    AND transactions = {'enabled': 'false'};

CREATE TABLE yugapoc.testaccounts (
    account_name text,
    account_type text,
    balance double,
    PRIMARY KEY (account_name, account_type)
) WITH CLUSTERING ORDER BY (account_type ASC)
    AND default_time_to_live = 0
    AND transactions = {'enabled': 'true'};

CREATE TABLE yugapoc.employee_1 (
    id int PRIMARY KEY,
    name text,
    age int,
    language text
) WITH default_time_to_live = 0
    AND transactions = {'enabled': 'false'};

CREATE TABLE yugapoc.ticketdetails (
    ordernum text PRIMARY KEY,
    dpsnum text,
    descriptiontype text,
    desciption text,
    errorsource text,
    customerissue text,
    processtype text,
    creditdellonly text,
    creditlineitems text,
    status text,
    istransactioncompleted boolean
) WITH default_time_to_live = 0
    AND transactions = {'enabled': 'false'};

CREATE TABLE yugapoc.credpshistory (
    id text PRIMARY KEY,
    dpsstatus text,
    type text,
    subtype text,
    creditmemonumber text,
    dpscreatedate text,
    dispatchnumber text,
    disputeamount text,
    processtype text,
    omegacnnumber text,
    submitter text,
    dpsremarks text,
    creditnotedisputeamount text,
    creditnoteadjustedamount text,
    creditnoteremarks text,
    orderrefference text
) WITH default_time_to_live = 0
    AND transactions = {'enabled': 'false'};

CREATE TABLE yugapoc.autokey (
    id uuid PRIMARY KEY,
    orderno text,
    dispatchno text
) WITH default_time_to_live = 0
    AND transactions = {'enabled': 'false'};

CREATE TABLE yugapoc.orderitemdetails (
    orderid text PRIMARY KEY,
    orderitemid text,
    items text,
    servicetagnum text,
    description text
) WITH default_time_to_live = 0
    AND transactions = {'enabled': 'false'};

tedyu commented 3 years ago

You should reference the YugabyteDB (YB) Java driver:

diff --git a/janusgraph-cql/pom.xml b/janusgraph-cql/pom.xml
index 4d0d3f0e4..d678836d0 100644
--- a/janusgraph-cql/pom.xml
+++ b/janusgraph-cql/pom.xml
@@ -114,7 +114,7 @@
         </dependency>

         <dependency>
-            <groupId>com.datastax.oss</groupId>
+            <groupId>com.yugabyte</groupId>
             <artifactId>java-driver-core</artifactId>
             <version>${cassandra-driver.version}</version>
             <exclusions>
@@ -137,7 +137,7 @@
             </exclusions>
         </dependency>
         <dependency>
-            <groupId>com.datastax.oss</groupId>
+            <groupId>com.yugabyte</groupId>
             <artifactId>java-driver-query-builder</artifactId>
             <version>${cassandra-driver.version}</version>
             <exclusions>
diff --git a/pom.xml b/pom.xml
index 7f48c0ad2..2e3e736c1 100644
--- a/pom.xml
+++ b/pom.xml
@@ -114,7 +114,7 @@
         <test.excluded.groups>MEMORY_TESTS,PERFORMANCE_TESTS,BRITTLE_TESTS</test.excluded.groups>
         <dependency.locations.enabled>false</dependency.locations.enabled>
         <cassandra.version>3.11.10</cassandra.version>
-        <cassandra-driver.version>4.11.0</cassandra-driver.version>
+        <cassandra-driver.version>4.6.0-yb-6</cassandra-driver.version>
         <testcontainers.version>1.15.2</testcontainers.version>
         <easymock.version>4.2</easymock.version>
         <protobuf.version>3.15.6</protobuf.version>

vpratheek007 commented 3 years ago

Can you please detail the steps?

pwd
/u01/janusgraph-0.5.3/examples

-rw-rw-r-- 1 yugabyte yugabyte 2380 Dec 25 13:38 pom.xml

How do I make this change?

vpratheek007 commented 3 years ago

Which pom.xml needs to be updated? Here is the locate output from the host:

[root@pratyuga13 conf]# locate pom.xml
/u01/csndra/dse-6.7.7/resources/solr/web/solr/META-INF/maven/org.apache.solr/solr-web/pom.xml
/u01/janusgraph-0.2.0-hadoop2/examples/pom.xml
/u01/janusgraph-0.2.0-hadoop2/examples/example-berkeleyje/pom.xml
/u01/janusgraph-0.2.0-hadoop2/examples/example-cassandra/pom.xml
/u01/janusgraph-0.2.0-hadoop2/examples/example-common/pom.xml
/u01/janusgraph-0.2.0-hadoop2/examples/example-cql/pom.xml
/u01/janusgraph-0.2.0-hadoop2/examples/example-hbase/pom.xml
/u01/janusgraph-0.2.0-hadoop2/examples/example-remotegraph/pom.xml
/u01/janusgraph-0.2.0-hadoop2/examples/example-tinkergraph/pom.xml
/u01/janusgraph-0.5.3/examples/pom.xml
/u01/janusgraph-0.5.3/examples/example-berkeleyje/pom.xml
/u01/janusgraph-0.5.3/examples/example-cassandra/pom.xml
/u01/janusgraph-0.5.3/examples/example-common/pom.xml
/u01/janusgraph-0.5.3/examples/example-cql/pom.xml
/u01/janusgraph-0.5.3/examples/example-hbase/pom.xml
/u01/janusgraph-0.5.3/examples/example-remotegraph/pom.xml
/u01/janusgraph-0.5.3/examples/example-tinkergraph/pom.xml

tedyu commented 3 years ago

Here is the patch: janus-yb.txt

which is based on master branch of janusgraph git repo.

You can apply it on the release you're using.

tedyu commented 3 years ago

Please note the above patch is to unblock you. There will be a new Cassandra driver release which introduces DefaultDriverOption.METRICS_NODE_EXPIRE_AFTER

vpratheek007 commented 3 years ago

> Here is the patch: janus-yb.txt
>
> which is based on master branch of janusgraph git repo.
>
> You can apply it on the release you're using.

Can you please detail the steps? The janus-yb.txt file looks like the output of a diff command against two pom.xml files, so what should I do now to apply the fix? Do I need to replace any of the existing pom files?

vpratheek007 commented 3 years ago

Can we have a call to discuss the fix

tedyu commented 3 years ago

Suppose you're using https://github.com/JanusGraph/janusgraph/releases/tag/v0.5.3

You can obtain the source code here: https://github.com/JanusGraph/janusgraph/archive/refs/tags/v0.5.3.zip

Apply the patch with patch command:

patch -p1 < janus-yb.txt

Then rebuild 0.5.3 release.

Please follow the above procedure first. We can have a call if needed.
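The procedure above could be scripted roughly as follows. This is only a sketch under stated assumptions: the function names `missing_tools` and `rebuild_with_patch` are made up for illustration, the use of `curl` and `unzip` is one way among several to fetch the source, and `mvn clean install -DskipTests` is an assumed Maven invocation rather than a command given in this thread. The `-p1` option tells `patch` to strip the leading `a/` or `b/` component from the paths in the diff, which is why the command is run from the top of the extracted source tree.

```shell
#!/bin/sh
# Hypothetical helpers; names and Maven flags are assumptions, not from the thread.

# Print each named tool that is NOT found on PATH (prints nothing if all exist).
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Download the v0.5.3 source, apply a patch, and rebuild.
# $1 = absolute path to the downloaded patch file (for example janus-yb.txt)
# The body runs in a subshell so the caller's working directory is unchanged.
rebuild_with_patch() (
    patch_file=$1
    missing=$(missing_tools curl unzip patch mvn | tr '\n' ' ')
    if [ -n "$missing" ]; then
        echo "Install these tools first: $missing" >&2
        exit 1   # leaves only the subshell, so the function returns 1
    fi
    curl -L -o v0.5.3.zip \
        https://github.com/JanusGraph/janusgraph/archive/refs/tags/v0.5.3.zip &&
    unzip -q v0.5.3.zip &&
    cd janusgraph-0.5.3 &&
    # -p1 strips the a/ and b/ prefixes used in the diff headers
    patch -p1 < "$patch_file" &&
    mvn clean install -DskipTests
)
```

A call such as `rebuild_with_patch "$HOME/janus-yb.txt"` would then run the whole sequence, and it stops early with a list of missing tools if, for example, `patch` or `mvn` is not installed.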

vpratheek007 commented 3 years ago

Let's have a call; that would be better.

vpratheek007 commented 3 years ago

Ted, how can we set up a call? Can you give me your number?

tedyu commented 3 years ago

janusgraph-cassandra-pom-xml.txt pom-xml.txt janusgraph-cql-pom-xml.txt

Turns out for janusgraph-0.5.3, three pom.xml files need to be modified.

janusgraph-cql/pom.xml, janusgraph-cassandra/pom.xml and pom.xml

I renamed the files because of GitHub's attachment limitations.

tedyu commented 3 years ago

I am available for a live session. I am in San Jose. What's your timezone and when would be a good time for you ?

tedyu commented 3 years ago

0.5.3.txt

Here is the patch which can be applied to v0.5 branch of janusgraph

vpratheek007 commented 3 years ago

I am working in the India time zone. Let me know your availability in the morning, your time, so I can be available for the working session.

vpratheek007 commented 3 years ago

I have extracted https://github.com/JanusGraph/janusgraph/archive/refs/tags/v0.5.3.zip and then ran:

-rwxrwxr-x 1 yugabyte yugabyte 2652 Apr 19 01:12 janus-yb.txt
[yugabyte@pratyuga13 janusgraph-0.5.3]$ patch -p1 < janus-yb.txt
-bash: patch: command not found

It would be better to have a working session; I am not sure how to apply this patch.

ddorian commented 3 years ago

@vpratheek007

Do a google search on how to install patch in <insert your operating system here> and retry the command again.

tedyu commented 3 years ago

Since the changes are small, you can manually edit the pom.xml files. However, you need to install mvn (Maven) to build.

Here are the two relevant jars, built with AdoptOpenJDK Java 1.8.0_282:

janusgraph-cassandra-0.5.4-SNAPSHOT.txt janusgraph-cql-0.5.4-SNAPSHOT.txt

I had to rename .jar to .txt so that GitHub permits the upload. Rename them to janusgraph-cassandra-0.5.4-SNAPSHOT.jar and janusgraph-cql-0.5.4-SNAPSHOT.jar, respectively, and use them to replace janusgraph-cassandra-0.5.3.jar and janusgraph-cql-0.5.3.jar.

Keep a copy of janusgraph-cassandra-0.5.3.jar and janusgraph-cql-0.5.3.jar
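The rename-and-replace step could look something like the sketch below. The function name `swap_jars` and its two directory arguments are hypothetical. Keeping the old jars under a `.jar_bak` name is a choice made here on the assumption that the launcher only picks up files ending in `.jar`; it is not something prescribed in this comment.

```shell
#!/bin/sh
# Hypothetical helper (swap_jars is a made-up name for this sketch).
# $1 = directory holding the downloaded *.txt attachments
# $2 = JanusGraph lib directory (for example /u01/janusgraph-0.5.3/lib)
swap_jars() {
    download_dir=$1
    lib_dir=$2

    # Check that both attachments are present before touching lib/.
    for module in cassandra cql; do
        new_txt="$download_dir/janusgraph-$module-0.5.4-SNAPSHOT.txt"
        [ -f "$new_txt" ] || { echo "missing $new_txt" >&2; return 1; }
    done

    for module in cassandra cql; do
        old="$lib_dir/janusgraph-$module-0.5.3.jar"
        new_txt="$download_dir/janusgraph-$module-0.5.4-SNAPSHOT.txt"
        new_jar="$lib_dir/janusgraph-$module-0.5.4-SNAPSHOT.jar"
        # Keep a copy of the 0.5.3 jar under a name that does not end in .jar.
        if [ -f "$old" ]; then
            mv "$old" "${old}_bak" || return 1
        fi
        # Restore the .jar extension while copying into lib/.
        cp "$new_txt" "$new_jar" || return 1
    done
}
```

For example, `swap_jars ~/Downloads /u01/janusgraph-0.5.3/lib` would leave the two 0.5.4-SNAPSHOT jars in place and the two 0.5.3 jars renamed, so the change can be reverted by renaming them back.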

vpratheek007 commented 3 years ago

Ted, would it be feasible to start a working session so we can close this quickly?

vpratheek007 commented 3 years ago

Ok let me try to replace these two

janusgraph-cassandra-0.5.4-SNAPSHOT.txt janusgraph-cql-0.5.4-SNAPSHOT.txt

renaming to

janusgraph-cassandra-0.5.4-SNAPSHOT.jar janusgraph-cql-0.5.4-SNAPSHOT.jar

vpratheek007 commented 3 years ago

Hmm, still same error

09:05:34 ERROR com.datastax.driver.core.SchemaParser - Error parsing schema for table yugapoc.jsonemployee: Cluster.getMetadata().getKeyspace("yugapoc").getTable("jsonemployee") will be missing or incomplete

I have replaced them with the jars provided here:

-rw-rw-r-- 1 yugabyte yugabyte 52513 Dec 25 13:50 janusgraph-cql-0.5.3.jar_bak
-rw-rw-r-- 1 yugabyte yugabyte 127744 Dec 25 13:50 janusgraph-cassandra-0.5.3.jar_bak
-rw-rw-r-- 1 yugabyte yugabyte 252515 Dec 25 13:51 janusgraph-hbase-0.5.3.jar
-rw-rw-r-- 1 yugabyte yugabyte 1752 Dec 25 13:51 janusgraph-bigtable-0.5.3.jar
-rw-rw-r-- 1 yugabyte yugabyte 90288 Dec 25 13:51 janusgraph-es-0.5.3.jar
-rw-rw-r-- 1 yugabyte yugabyte 35287 Dec 25 13:51 janusgraph-lucene-0.5.3.jar
-rw-rw-r-- 1 yugabyte yugabyte 37092 Dec 25 13:52 janusgraph-solr-0.5.3.jar
-rw-rw-r-- 1 yugabyte yugabyte 3094 Dec 25 13:52 janusgraph-all-0.5.3.jar
-rwxrwxr-x 1 yugabyte yugabyte 127772 Apr 19 08:58 janusgraph-cassandra-0.5.4-SNAPSHOT.jar
-rwxrwxr-x 1 yugabyte yugabyte 52531 Apr 19 08:58 janusgraph-cql-0.5.4-SNAPSHOT.jar
[root@pratyuga13 lib]# pwd
/u01/janusgraph-0.5.3/lib

tedyu commented 3 years ago

There should be more error output around 'Error parsing schema'. Can you paste the complete stack trace and give the complete command line?

I am available for live session. What's your preferred meeting software ?

thanks

tedyu commented 3 years ago

It would be better if you build the 0.5 branch with my patch and collect the 0.5.4-SNAPSHOT jar files into their own directory. You can then launch the application referencing that directory.

ls ~/j-lib/
janusgraph-all-0.5.4-SNAPSHOT.jar       janusgraph-cassandra-0.5.4-SNAPSHOT.jar  janusgraph-es-0.5.4-SNAPSHOT.jar      janusgraph-solr-0.5.4-SNAPSHOT.jar
janusgraph-bigtable-0.5.4-SNAPSHOT.jar  janusgraph-cql-0.5.4-SNAPSHOT.jar        janusgraph-lucene-0.5.4-SNAPSHOT.jar
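One way to gather the built jars into a directory like the one listed above is sketched here. The function name `collect_snapshot_jars` is hypothetical, and the sketch assumes the usual Maven layout in which each module writes its jar under its own `target/` directory.

```shell
#!/bin/sh
# Hypothetical helper; collect_snapshot_jars is a made-up name.
# $1 = root of the built source tree
# $2 = destination directory (created if it does not exist)
collect_snapshot_jars() {
    src_root=$1
    dest=$2
    mkdir -p "$dest" || return 1
    # Copy only the main module jars from target/ directories; names such as
    # *-SNAPSHOT-sources.jar do not match the pattern and are skipped.
    find "$src_root" -path '*/target/*' \
        -name 'janusgraph-*-0.5.4-SNAPSHOT.jar' \
        -exec cp {} "$dest"/ \;
}
```

Running `collect_snapshot_jars ~/janusgraph ~/j-lib` after the build would fill `~/j-lib/` with whichever module jars the build produced.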

Do you see

com.datastax.driver.core.exceptions.UnresolvedUserTypeException: Cannot resolve user type yugapoc.jsonb

in the latest run ?

tedyu commented 3 years ago

You would also need this jar since it is referenced by janusgraph: cassandra-driver-core-3.8.0-yb-7.txt

You can automate dependency resolution if your application pom references cassandra driver 3.8.0-yb-7

vpratheek007 commented 3 years ago

Ted are you available for a working session now..?

vpratheek007 commented 3 years ago

for faster resolution..?

tedyu commented 3 years ago

Yes - what's the meeting URL ?

You can send the URL to zyu@yugabyte.com

vpratheek007 commented 3 years ago

do you have teams..?

vpratheek007 commented 3 years ago

Microsoft Teams meeting
Join on your computer or mobile app: Click here to join the meeting
Or call in (audio only): +91 22 6259 0316,,,,88354928# India, Mumbai
Phone Conference ID: 883 549 28#

vpratheek007 commented 3 years ago

Sent to your email ID.

tedyu commented 3 years ago

Summary: the error on jsonemployee table is gone. Facing the following:

gremlin> graph = JanusGraphFactory.open('conf/janusgraph-cql.properties')

Need to set configuration value: root.graph.storage-version

Type ':help' or ':h' for help.

Display stack trace? [yN]N

vpratheek007 commented 3 years ago

Thanks Ted. After replacing the jars, this is the error I am now getting:

[root@pratyuga13 u01]# cd janusgraph-0.5.3
[root@pratyuga13 janusgraph-0.5.3]# ./bin/gremlin.sh

     \,,,/
     (o o)

-----oOOo-(3)-oOOo-----
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/u01/janusgraph-0.5.3/lib/slf4j-log4j12-1.7.12.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/u01/janusgraph-0.5.3/lib/logback-classic-1.1.3.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
plugin activated: tinkerpop.server
plugin activated: tinkerpop.tinkergraph
23:48:49 WARN org.apache.hadoop.util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
plugin activated: tinkerpop.hadoop
plugin activated: tinkerpop.spark
plugin activated: tinkerpop.utilities
plugin activated: janusgraph.imports
gremlin> graph = JanusGraphFactory.open('conf/janusgraph-cql.properties')
Need to set configuration value: root.graph.storage-version
Type ':help' or ':h' for help.
Display stack trace? [yN]N
gremlin>

tedyu commented 3 years ago

Update:

by adding 'graph.allow-upgrade=true' to the properties file, the above error is gone.
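A small sketch of making that edit from the shell, so that the line is added only once. The helper name `ensure_property` is made up for illustration; the property name and value are the ones quoted in the update above, and the file path in the usage note is the one used earlier in this thread.

```shell
#!/bin/sh
# Hypothetical helper; ensure_property is a made-up name.
# Appends "key=value" to a properties file unless the key is already set.
# $1 = properties file, $2 = key, $3 = value
ensure_property() {
    file=$1
    key=$2
    value=$3
    # Escape dots so the key is matched literally in the regular expression.
    pattern=$(printf '%s' "$key" | sed 's/[.]/\\./g')
    if grep -Eq "^[[:space:]]*$pattern[[:space:]]*=" "$file"; then
        echo "$key is already set in $file; leaving it unchanged" >&2
        return 0
    fi
    # Leading newline guards against a file that lacks a trailing newline.
    printf '\n%s=%s\n' "$key" "$value" >> "$file"
}
```

With this in place, `ensure_property conf/janusgraph-cql.properties graph.allow-upgrade true` adds the line on the first call and does nothing on later calls.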

ashetkar commented 1 year ago

Closing the issue as per the last comment.