Stratio / deep-spark

Connecting Apache Spark with different data stores [DEPRECATED]
http://stratio.github.io/deep-spark
Apache License 2.0

Problems with uppercase in keyspace and table name #6

Closed ernestobv closed 10 years ago

ernestobv commented 10 years ago

Hello,

We have a keyspace named Keyspace1 and a table named Standard1, and I get exceptions when I run some tests with Stratio.

The following code (from the class com.stratio.examples.JavaExample) is how we create an IDeepJobConfig:

// Configuration and initialization
IDeepJobConfig config = DeepJobConfigFactory.create()
        .host(cassandraHost).rpcPort(cassandraPort)
        .keyspace(keyspaceName).table(tableName)
        .username(userName).password(password)
        .inputColumns("key")
        .initialize();

I have tried initializing the keyspaceName and tableName variables in two ways. The first one is this:

String keyspaceName = "Keyspace1";
String tableName = "Standard1";

When it is run, the exception thrown is:

Exception in thread "main" com.datastax.driver.core.exceptions.InvalidQueryException: Keyspace 'keyspace1' does not exist
        at com.datastax.driver.core.exceptions.InvalidQueryException.copy(InvalidQueryException.java:35)
        at com.datastax.driver.core.DefaultResultSetFuture.extractCauseFromExecutionException(DefaultResultSetFuture.java:256)
        at com.datastax.driver.core.SessionManager.setKeyspace(SessionManager.java:336)
        at com.datastax.driver.core.Cluster.connect(Cluster.java:228)
        at com.stratio.deep.config.GenericDeepJobConfig.getSession(GenericDeepJobConfig.java:151)
        at com.stratio.deep.config.GenericDeepJobConfig.fetchTableMetadata(GenericDeepJobConfig.java:194)
        at com.stratio.deep.config.GenericDeepJobConfig.validate(GenericDeepJobConfig.java:547)
        at com.stratio.deep.config.GenericDeepJobConfig.initialize(GenericDeepJobConfig.java:447)
        at com.stratio.examples.JavaExample.main(JavaExample.java:91)
Caused by: com.datastax.driver.core.exceptions.InvalidQueryException: Keyspace 'keyspace1' does not exist
        at com.datastax.driver.core.Responses$Error.asException(Responses.java:97)
        at com.datastax.driver.core.DefaultResultSetFuture.onSet(DefaultResultSetFuture.java:108)
        at com.datastax.driver.core.RequestHandler.setFinalResult(RequestHandler.java:235)
        at com.datastax.driver.core.RequestHandler.onSet(RequestHandler.java:367)
        at com.datastax.driver.core.Connection$Dispatcher.messageReceived(Connection.java:571)
        at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
        at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
        at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
        at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:296)
        at org.jboss.netty.handler.codec.oneone.OneToOneDecoder.handleUpstream(OneToOneDecoder.java:70)
        at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
        at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
        at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:296)
        at org.jboss.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:462)
        at org.jboss.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:443)
        at org.jboss.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:303)
        at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
        at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
        at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
        at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)
        at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)
        at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
        at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:109)
        at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:312)
        at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:90)
        at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
        at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
        at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:744)

And the second way is using double quotes:

String keyspaceName = "\"Keyspace1\"";
String tableName = "\"Standard1\""; 

The exception thrown is:

com.stratio.deep.exception.DeepIOException: com.datastax.driver.core.exceptions.InvalidQueryException: unconfigured columnfamily "Standard1"
    at com.stratio.deep.cql.DeepRecordReader$RowIterator.executeQuery(DeepRecordReader.java:601)
    at com.stratio.deep.cql.DeepRecordReader$RowIterator.<init>(DeepRecordReader.java:191)
    at com.stratio.deep.cql.DeepRecordReader.initialize(DeepRecordReader.java:121)
    at com.stratio.deep.cql.DeepRecordReader.<init>(DeepRecordReader.java:92)
    at com.stratio.deep.rdd.CassandraRDD.initRecordReader(CassandraRDD.java:275)
    at com.stratio.deep.rdd.CassandraRDD.compute(CassandraRDD.java:188)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:241)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:232)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:109)
    at org.apache.spark.scheduler.Task.run(Task.scala:53)
    at org.apache.spark.executor.Executor$TaskRunner$$anonfun$run$1.apply$mcV$sp(Executor.scala:211)
    at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:42)
    at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:41)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
    at org.apache.spark.deploy.SparkHadoopUtil.runAsUser(SparkHadoopUtil.scala:41)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:176)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)
Caused by: com.datastax.driver.core.exceptions.InvalidQueryException: unconfigured columnfamily "Standard1"
    at com.datastax.driver.core.exceptions.InvalidQueryException.copy(InvalidQueryException.java:35)
    at com.datastax.driver.core.DefaultResultSetFuture.extractCauseFromExecutionException(DefaultResultSetFuture.java:256)
    at com.datastax.driver.core.DefaultResultSetFuture.getUninterruptibly(DefaultResultSetFuture.java:172)
    at com.datastax.driver.core.SessionManager.execute(SessionManager.java:92)
    at com.datastax.driver.core.SessionManager.execute(SessionManager.java:88)
    at com.stratio.deep.cql.DeepRecordReader$RowIterator.executeQuery(DeepRecordReader.java:582)
    ... 20 more
Caused by: com.datastax.driver.core.exceptions.InvalidQueryException: unconfigured columnfamily "Standard1"
    at com.datastax.driver.core.Responses$Error.asException(Responses.java:97)
    at com.datastax.driver.core.DefaultResultSetFuture.onSet(DefaultResultSetFuture.java:108)
    at com.datastax.driver.core.RequestHandler.setFinalResult(RequestHandler.java:235)
    at com.datastax.driver.core.RequestHandler.onSet(RequestHandler.java:367)
    at com.datastax.driver.core.Connection$Dispatcher.messageReceived(Connection.java:571)
    at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
    at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
    at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
    at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:296)
    at org.jboss.netty.handler.codec.oneone.OneToOneDecoder.handleUpstream(OneToOneDecoder.java:70)
    at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
    at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
    at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:296)
    at org.jboss.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:462)
    at org.jboss.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:443)
    at org.jboss.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:303)
    at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
    at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
    at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
    at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)
    at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)
    at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
    at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:109)
    at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:312)
    at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:90)
    at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
    at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
    at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
    ... 3 more

It seems that the problem comes from the DataStax Java Driver, but I don't know whether there is a way to use the DataStax Java Driver from Stratio-Deep code that avoids these errors.

Also, are there any restrictions or known problems when using mixed-case keyspace and table names in Cassandra with Stratio?

Thanks, Ernesto.

lucarosellini commented 10 years ago

Hi Ernesto, I've managed to reproduce the bug. It seems the DataStax driver always treats unquoted keyspace and column family names as lowercase. In Deep we already escape column family names, but we do not yet escape the keyspace name. I need to look into this a bit further. For the time being you can manually escape the keyspace name (but not the CF name).

Example:

String keyspaceName = "\"Keyspace1\"";
String tableName = "Standard1";

By the way, I'll try to fix this issue ASAP.

Thanks, Luca
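
For readers hitting the same error, the rule behind the workaround can be sketched in a few lines of Java. In CQL, an unquoted identifier is case-insensitive and is folded to lowercase, while an identifier wrapped in double quotes keeps its exact case. The class and method names below (CqlIdentifiers, quote, resolve) are hypothetical and are not part of Stratio Deep or the DataStax driver; this is only an illustration of the naming rule, not the code either library actually runs.

```java
import java.util.Locale;

// Hypothetical illustration of CQL identifier case rules.
// Not part of Stratio Deep or the DataStax Java Driver.
class CqlIdentifiers {

    // Wrap a name in double quotes so that CQL preserves its case.
    // An embedded double quote is escaped by doubling it.
    public static String quote(String name) {
        return "\"" + name.replace("\"", "\"\"") + "\"";
    }

    // Return the name that would be looked up for a given identifier:
    // quoted identifiers keep their exact case, unquoted ones are lowercased.
    public static String resolve(String identifier) {
        boolean quoted = identifier.length() >= 2
                && identifier.startsWith("\"")
                && identifier.endsWith("\"");
        if (quoted) {
            String inner = identifier.substring(1, identifier.length() - 1);
            return inner.replace("\"\"", "\"");
        }
        return identifier.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        // Unquoted: folded to lowercase, matching the "Keyspace 'keyspace1' does not exist" error.
        System.out.println(resolve("Keyspace1"));        // prints keyspace1
        // Quoted: case preserved, which is what the workaround relies on.
        System.out.println(resolve(quote("Keyspace1"))); // prints Keyspace1
    }
}
```

With the Deep version discussed in this thread, only the keyspace name needs this treatment. As noted above, Deep already escapes the column family name itself, so the table name should be passed unquoted.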

ernestobv commented 10 years ago

Hi Luca,

Thanks, it works.

Regards, Ernesto.