vesoft-inc / nebula-exchange

NebulaGraph Exchange is an Apache Spark application to parse data from different sources to NebulaGraph in a distributed environment. It supports both batch and streaming data in various formats and sources including other Graph Databases, RDBMS, Data warehouses, NoSQL, Message Bus, File systems, etc.
Apache License 2.0

support hierarchical json format in json datasource #95

Open xiajingchun opened 2 years ago

xiajingchun commented 2 years ago

Today, the JSON datasource only supports a "flat" format, i.e., each line of the JSON file should look like:

{
"key1": "value1",
"key2": "value2",
...
}

In some cases, the JSON file uses a hierarchical structure, where the value of a key is a child object or even an array of objects, e.g.:

{
"key1":
       [
           {"key11":"value11", "key12":"value12"},
           {"key21":"value21", "key22":"value22"},
           ...
       ]
}

Can we add support for this?

xiajingchun commented 2 years ago

Regarding arrays, it would be better to support a mapping like the one below:

tag.prop1 <-> json["key1"][0]["key11"]
tag.prop2 <-> json["key1"][1]["key22"]

It's up to the user how to design the JSON object.
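To make the proposed mapping concrete, here is a rough sketch in plain Python of how a path such as json["key1"][0]["key11"] could be resolved against one parsed line. This is only an illustration of the idea, not how Exchange works today: the helper names (parse_path, resolve, extract_props) and the shape of the mapping dict are hypothetical, and the path syntax is simply the one used in the mapping example in this comment.

```python
import json
import re

# One path step: either ["some key"] or [123]. This syntax is an assumption
# taken from the mapping example in this issue, not an Exchange config format.
_STEP = re.compile(r'\["([^"\]]+)"\]|\[(\d+)\]')


def parse_path(path):
    """Split 'json["key1"][0]["key11"]' into ['key1', 0, 'key11']."""
    if not path.startswith("json"):
        raise ValueError(f"path must start with 'json': {path!r}")
    rest = path[len("json"):]
    steps = []
    pos = 0
    while pos < len(rest):
        match = _STEP.match(rest, pos)
        if match is None:
            raise ValueError(f"cannot parse path at offset {pos}: {path!r}")
        key, index = match.groups()
        # Quoted steps are object keys, bare digits are array indexes.
        steps.append(key if key is not None else int(index))
        pos = match.end()
    return steps


def resolve(record, path):
    """Walk a parsed JSON value along the given path and return the leaf."""
    value = record
    for step in parse_path(path):
        value = value[step]  # raises KeyError/IndexError if the path is absent
    return value


def extract_props(line, mapping):
    """Parse one JSON line and build {property name: value} from the mapping."""
    record = json.loads(line)
    return {prop: resolve(record, path) for prop, path in mapping.items()}


# Hypothetical mapping, mirroring tag.prop1 / tag.prop2 from the comment.
mapping = {
    "prop1": 'json["key1"][0]["key11"]',
    "prop2": 'json["key1"][1]["key22"]',
}
line = (
    '{"key1": [{"key11": "value11", "key12": "value12"},'
    ' {"key21": "value21", "key22": "value22"}]}'
)
props = extract_props(line, mapping)
print(props)  # {'prop1': 'value11', 'prop2': 'value22'}
```

A missing key or an out-of-range index raises an error in this sketch; whether such rows should be skipped, filled with nulls, or fail the import is a design decision left open here.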