ifesdjeen / cascading-cassandra

Modern Cassandra tap for Cascading. Actually works with Cascading 2.0, Cascalog 1.10 and supports CQL collections.
http://clojurecassandra.info

Trying to Configure SourceTap #21

Open tadanki opened 10 years ago

tadanki commented 10 years ago

Hi,

I am trying to configure a SourceTap from Cassandra to source data from a column family.

I found that the docs are not in sync with the code; I am unclear on the expected configurations for the settings(Map<String, Object>). I keep running into:

java.lang.RuntimeException: no config type specs for key: types

I have tried setting the properties that the code seems to look for (types.dynamic, source.columns separated by ':', etc.), but without much luck.

Any sample code here would help.

Thanks guys, Karthik

ifesdjeen commented 10 years ago

Could you put up a full stack trace and complete code configuration?

tadanki commented 10 years ago

Here is what I use to construct a Tap:

props.put("db.columnFamily" , "cassandra_table_name");
props.put("mappings.cqlKeys", Arrays.asList("cassandra_col_1:Cascading_Field_1","cassandra_col_2:Cascading_Field_2"));
props.put("mappings.cqlValues", Arrays.asList("cassandra_col_3:Cascading_Field_3"));
props.put("source.columns", "cassandra_col_1:CascadingField_1,cassandra_col_2:CascadingField_2,cassandra_col_3:CascadingField_3");

return new CassandraTap(new CassandraCQL3Scheme(props));

I am, however, using the Tap as a sink just fine.

The error I am grappling with is:

Exception in thread "main" cascading.flow.planner.PlannerException: could not build flow from assembly: [no config type specs for key: types]
    at cascading.flow.planner.FlowPlanner.handleExceptionDuringPlanning(FlowPlanner.java:576)
    at cascading.flow.hadoop.planner.HadoopPlanner.buildFlow(HadoopPlanner.java:263)
    at cascading.flow.hadoop.planner.HadoopPlanner.buildFlow(HadoopPlanner.java:80)
    at cascading.flow.FlowConnector.connect(FlowConnector.java:459)
    at com.pearson.ltg.daalt.athena.analytics.cascading.workflows.WorkflowOrchestrator.runAll(WorkflowOrchestrator.java:170)
    at com.pearson.ltg.daalt.athena.analytics.Main.main(Main.java:37)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:187)
Caused by: java.lang.RuntimeException: no config type specs for key: types
    at com.ifesdjeen.cascading.cassandra.SettingsHelper.getTypesByKey(SettingsHelper.java:11)
    at com.ifesdjeen.cascading.cassandra.SettingsHelper.getTypes(SettingsHelper.java:18)
    at com.ifesdjeen.cascading.cassandra.sources.CqlSource.configure(CqlSource.java:21)
    at com.ifesdjeen.cascading.cassandra.cql3.CassandraCQL3Scheme.sourceConfInit(CassandraCQL3Scheme.java:75)
    at com.ifesdjeen.cascading.cassandra.cql3.CassandraCQL3Scheme.sourceConfInit(CassandraCQL3Scheme.java:27)
    at cascading.tap.Tap.sourceConfInit(Tap.java:181)
    at cascading.flow.hadoop.HadoopFlowStep.initFromSources(HadoopFlowStep.java:330)
    at cascading.flow.hadoop.HadoopFlowStep.getInitializedConfig(HadoopFlowStep.java:99)
    at cascading.flow.hadoop.HadoopFlowStep.createFlowStepJob(HadoopFlowStep.java:201)
    at cascading.flow.hadoop.HadoopFlowStep.createFlowStepJob(HadoopFlowStep.java:69)
    at cascading.flow.planner.BaseFlowStep.getFlowStepJob(BaseFlowStep.java:676)
    at cascading.flow.BaseFlow.initializeNewJobsMap(BaseFlow.java:1184)
    at cascading.flow.BaseFlow.initialize(BaseFlow.java:199)
    at cascading.flow.hadoop.planner.HadoopPlanner.buildFlow(HadoopPlanner.java:257)

This seems to be because Cascading can't talk to the Tap to get field information out of it when planning the workflow. So, I need to understand how to construct a Tap as a source.

Thanks for your help ifesdjeen, I could help with fixing the Docs and posting some more examples after I have this working.

ifesdjeen commented 10 years ago

The problem is that you still have to specify types:

"types" {"id"      "UTF8Type"
         "version" "Int32Type"
         "date"    "DateType"
         "count"   "DecimalType"}

In Java it would look roughly like this:

// Map each source field to its Cassandra type
Map<String, String> types = new HashMap<>();
types.put("name",      "UTF8Type");
types.put("language",  "UTF8Type");
types.put("schmotes",  "Int32Type");
types.put("votes",     "Int32Type");
// Add the type map to the same settings map that is passed to the scheme
props.put("types", types);

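
For anyone landing here with the same error, here is a minimal sketch of a complete source settings map using only java.util. The setting keys ("db.columnFamily", "source.columns", "types") and the column names are copied from this thread. The types assigned to each column, the choice to key the types map by Cassandra column name, and the requireTypes helper are assumptions made for illustration, not part of the library.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

final class SourceSettingsSketch {

    // Builds a settings map for a CQL3 source.
    // Key names come from the thread; values are placeholders.
    static Map<String, Object> buildSourceSettings() {
        // Assumption: the "types" map is keyed by Cassandra column name.
        // The types chosen here are illustrative only.
        Map<String, String> types = new LinkedHashMap<>();
        types.put("cassandra_col_1", "UTF8Type");
        types.put("cassandra_col_2", "Int32Type");
        types.put("cassandra_col_3", "DecimalType");

        Map<String, Object> props = new HashMap<>();
        props.put("db.columnFamily", "cassandra_table_name");
        // Format copied as posted above; not verified against the library.
        props.put("source.columns",
                "cassandra_col_1:CascadingField_1,"
                + "cassandra_col_2:CascadingField_2,"
                + "cassandra_col_3:CascadingField_3");
        // The entry whose absence produced
        // "no config type specs for key: types".
        props.put("types", types);
        return props;
    }

    // Hypothetical pre-flight check (not a library method): fail early,
    // before flow planning, if the "types" entry is missing or empty.
    static void requireTypes(Map<String, Object> settings) {
        Object types = settings.get("types");
        if (!(types instanceof Map) || ((Map<?, ?>) types).isEmpty()) {
            throw new IllegalArgumentException(
                    "settings must contain a non-empty \"types\" map");
        }
    }

    public static void main(String[] args) {
        Map<String, Object> props = buildSourceSettings();
        requireTypes(props);
        System.out.println(props.get("types"));
        // With cascading-cassandra on the classpath, the tap is then
        // constructed as in the original post:
        // new CassandraTap(new CassandraCQL3Scheme(props));
    }
}
```

Running a check like requireTypes before connecting the flow surfaces the missing key at the call site instead of as a PlannerException during planning.
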
ifesdjeen commented 10 years ago

I would really appreciate it if you could add a little README section about CQL3 :) I mean, if my comment has helped :)