Closed: humbao closed this issue 7 years ago
@humbao Where do you get those JSON fields from? There is no JSON column type, so I'm wondering if this is the optimal route to take here.
Sorry, I was unclear. I meant JSON inserts. Currently the code takes its field definition from the -schema flag, which defines a fixed field list that corresponds exactly to a delimited data source.
Implementing the JSON insert would allow for flexible "schemaless" import ability.
Each line in the data source would then be a complete JSON object that also defines the field list, which would allow more complicated structures using the other collection data types.
Refer to: http://www.datastax.com/dev/blog/whats-new-in-cassandra-2-2-json-support http://cassandra.apache.org/doc/cql3/CQL-2.2.html#insertJson
This doesn't have to be part of this codebase (although much of the existing structure could be reused); it could be a derivative or a standalone tool.
@humbao JSON import makes sense. (As a side note, I don't think it actually needs to be line-based anymore.)
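To make the request concrete, here is a minimal Python sketch of the line-per-object idea, assuming the CQL 2.2 `INSERT ... JSON` syntax from the links above. The function name `json_insert_statements` and its parameters are hypothetical, and this is not how the loader itself does it; it only shows that each input line can carry its own field list.

```python
import json


def json_insert_statements(lines, keyspace, table):
    """Yield (cql, json_text) pairs, one per non-empty input line.

    Hypothetical helper: each line must be one complete JSON object.
    The JSON text is meant to be bound to the `?` marker by whatever
    driver executes the statement (assumed, not shown here).
    """
    cql = f"INSERT INTO {keyspace}.{table} JSON ?"
    for line_no, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # skip blank lines
        try:
            obj = json.loads(line)
        except json.JSONDecodeError as exc:
            raise ValueError(f"line {line_no}: invalid JSON: {exc}") from exc
        if not isinstance(obj, dict):
            raise ValueError(f"line {line_no}: expected a JSON object")
        # Re-serialize so the bound value is compact, valid JSON.
        yield cql, json.dumps(obj)
```

Because every object names its own fields, two lines in the same file can populate different columns, including collection columns such as lists.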
What you are suggesting here is loading JSON data: basically, a JSON parser followed by a CQL insert. This project is more about delimited file loading. While I think the code could support both, the parsing bit is different. -schema is there for two reasons. First, it identifies the destination for the data (note that there is no -table or -keyspace option). Second, it lays out the order of the columns in the delimited file. While you wouldn't need the second, you'd need to do something about the first. I'd be happy to consider pulling that sort of thing into this code, but would need to think about how you specify things. Would it be a -format option (delimited versus JSON versus something else)? And each -format value would have its own options (e.g., -schema for delimited, -table/-keyspace for JSON, etc.).
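As a rough illustration of the -format question, the Python sketch below shows one way per-format options could resolve to a destination. Everything here is an assumption for discussion: the function `resolve_target` is hypothetical, the `keyspace.table(col1, col2)` shape for -schema is assumed, and -format, -table, and -keyspace are only the options floated in this comment, not existing flags.

```python
import re

# Assumed -schema shape: "keyspace.table(col1, col2, ...)"
SCHEMA_RE = re.compile(r"^\s*(\w+)\.(\w+)\s*\(([^)]*)\)\s*$")


def resolve_target(fmt, schema=None, keyspace=None, table=None):
    """Return (keyspace, table, columns) for the chosen input format.

    columns is a list for delimited input (column order matters) and
    None for JSON input (each object names its own fields).
    """
    if fmt == "delimited":
        if schema is None:
            raise ValueError("-schema is required for delimited input")
        match = SCHEMA_RE.match(schema)
        if match is None:
            raise ValueError(f"cannot parse -schema value: {schema!r}")
        columns = [c.strip() for c in match.group(3).split(",") if c.strip()]
        return match.group(1), match.group(2), columns
    if fmt == "json":
        if not keyspace or not table:
            raise ValueError("-keyspace and -table are required for JSON input")
        return keyspace, table, None
    raise ValueError(f"unknown -format value: {fmt!r}")
```

For example, `resolve_target("delimited", schema="ks.users(id, name)")` gives the destination plus the column order, while `resolve_target("json", keyspace="ks", table="users")` gives only the destination.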
JSON was added in v0.0.21.
Can we have JSON fields to be inserted?