fullcontact / hadoop-sstable

Splittable Input Format for Reading Cassandra SSTables Directly
Apache License 2.0

IllegalArgumentException - java.nio.Buffer.limit(Buffer.java:267) #9

Closed: gadodia closed this issue 9 years ago

gadodia commented 9 years ago

java.lang.IllegalArgumentException
    at java.nio.Buffer.limit(Buffer.java:267)
    at org.apache.cassandra.db.marshal.AbstractCompositeType.getBytes(AbstractCompositeType.java:51)
    at org.apache.cassandra.db.marshal.AbstractCompositeType.getWithShortLength(AbstractCompositeType.java:60)
    at org.apache.cassandra.db.marshal.AbstractCompositeType.getString(AbstractCompositeType.java:226)
    at com.fullcontact.sstable.example.SimpleExampleMapper.map(SimpleExampleMapper.java:42)
    at com.fullcontact.sstable.example.SimpleExampleMapper.map(SimpleExampleMapper.java:1)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
14/11/04 15:16:16 INFO mapred.JobClient: Job complete: job_local_0001
14/11/04 15:16:16 INFO mapred.JobClient: Counters: 0
14/11/04 15:16:16 INFO example.SimpleExample: Total runtime: 2s

Xorlev commented 9 years ago

Generally this happens when the types declared in the mapper disagree with the table's actual types.
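
To illustrate why a type mismatch surfaces as an IllegalArgumentException in Buffer.limit, here is a minimal sketch that uses only java.nio and does not call Cassandra. It assumes the composite decoder reads a 2-byte unsigned length and then limits a duplicate of the buffer to that length, which is what the getWithShortLength and getBytes frames in the trace suggest. The class and method names (KeyTypeMismatchDemo, readComponent, failsAsComposite, encodeSingleComponent) are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class KeyTypeMismatchDemo {

    /**
     * Reads one length-prefixed component the way a composite decoder is
     * assumed to: a 2-byte unsigned length, then that many bytes.
     */
    static ByteBuffer readComponent(ByteBuffer bb) {
        int length = bb.getShort() & 0xFFFF;     // 2-byte unsigned length prefix
        ByteBuffer slice = bb.duplicate();
        slice.limit(slice.position() + length);  // IllegalArgumentException if past capacity
        bb.position(bb.position() + length);
        return slice;
    }

    /** Encodes a single component as: 2-byte length, UTF-8 bytes, 1 end-of-component byte. */
    static byte[] encodeSingleComponent(String value) {
        byte[] bytes = value.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(2 + bytes.length + 1)
                .putShort((short) bytes.length)
                .put(bytes)
                .put((byte) 0)
                .array();
    }

    /** True if decoding rawKey as a composite fails with IllegalArgumentException. */
    static boolean failsAsComposite(byte[] rawKey) {
        try {
            readComponent(ByteBuffer.wrap(rawKey));
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // A plain text key: its first two characters get misread as a length.
        byte[] plainKey = "user123".getBytes(StandardCharsets.UTF_8);
        System.out.println("plain key fails as composite: " + failsAsComposite(plainKey));

        // A key that really is length-prefixed decodes without error.
        byte[] compositeKey = encodeSingleComponent("user123");
        System.out.println("composite key fails as composite: " + failsAsComposite(compositeKey));
    }
}
```

With the plain key "user123", the bytes for 'u' and 's' are read as a length of 30067, far beyond the 7-byte buffer, so limit() rejects it.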

What does your CREATE TABLE CQL statement look like?

gadodia commented 9 years ago

That's what is happening, but I don't know what's wrong with this CREATE statement.

CREATE TABLE spyro.user_conversations (
    user_id text,
    created_at timestamp,
    updated_at timestamp,
    body text,
    subject text,
    conversation_type text,
    ypid bigint,
    collection_id text,
    status text,
    recipients set,
    announcement_id text,
    started_by_user_id text,
    PRIMARY KEY (user_id, created_at)
)

Xorlev commented 9 years ago

In SimpleExampleMapper change:

private final AbstractType keyType =
            CompositeType.getInstance(Lists.<AbstractType<?>>newArrayList(UTF8Type.instance, UTF8Type.instance));

to:

private final AbstractType keyType = UTF8Type.instance;

And give it a shot again.

FWIW it's something we'll be fixing fairly soon -- given that you have to pass in the create statement to the job it might as well do it for you.
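
As a rough sketch of what deriving the types from the CREATE statement could look like, the following reads the PRIMARY KEY clause and reports the CQL types of the partition key and clustering columns. This is not the library's implementation: PrimaryKeySketch, KeyLayout, and the helper methods are hypothetical, and the parsing only handles a table-level PRIMARY KEY clause and simple column definitions. The column list is shortened from the statement above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

public class PrimaryKeySketch {

    /** CQL type names of the partition key columns and the clustering columns. */
    static final class KeyLayout {
        final List<String> partition = new ArrayList<>();
        final List<String> clustering = new ArrayList<>();
    }

    /** Splits on commas that are not nested inside () or <>. */
    static List<String> splitTopLevel(String s) {
        List<String> parts = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int depth = 0;
        for (char c : s.toCharArray()) {
            if (c == '(' || c == '<') depth++;
            if (c == ')' || c == '>') depth--;
            if (c == ',' && depth == 0) {
                parts.add(current.toString().trim());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        if (current.length() > 0) parts.add(current.toString().trim());
        return parts;
    }

    /** Strips one pair of enclosing parentheses, if present. */
    static String unwrap(String s) {
        s = s.trim();
        return (s.startsWith("(") && s.endsWith(")")) ? s.substring(1, s.length() - 1) : s;
    }

    static KeyLayout parse(String createTable) {
        String body = createTable.substring(createTable.indexOf('(') + 1, createTable.lastIndexOf(')'));
        Map<String, String> columnTypes = new LinkedHashMap<>();
        List<String> keyParts = new ArrayList<>();
        for (String part : splitTopLevel(body)) {
            if (part.toUpperCase(Locale.ROOT).startsWith("PRIMARY KEY")) {
                keyParts = splitTopLevel(unwrap(part.substring("PRIMARY KEY".length())));
            } else {
                String[] nameAndType = part.split("\\s+", 2);  // "name type"
                columnTypes.put(nameAndType[0], nameAndType[1]);
            }
        }
        KeyLayout layout = new KeyLayout();
        for (int i = 0; i < keyParts.size(); i++) {
            if (i == 0) {
                // First element is the partition key; it may be a parenthesized group.
                for (String name : splitTopLevel(unwrap(keyParts.get(0)))) {
                    layout.partition.add(columnTypes.get(name));
                }
            } else {
                layout.clustering.add(columnTypes.get(keyParts.get(i)));
            }
        }
        return layout;
    }

    public static void main(String[] args) {
        String cql = "CREATE TABLE spyro.user_conversations (user_id text,created_at timestamp,"
                + "body text,PRIMARY KEY (user_id,created_at))";
        KeyLayout layout = parse(cql);
        System.out.println("partition key types: " + layout.partition);
        System.out.println("clustering types: " + layout.clustering);
    }
}
```

For this table the partition key is a single text column, which is why a plain UTF8Type key, rather than a two-part CompositeType, matches the data.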

gadodia commented 9 years ago

Yeah, I already tried that before and it threw an exception. I re-ran it; here is the exception:

org.apache.cassandra.db.marshal.MarshalException: invalid UTF8 bytes 0000014958e527c8
    at org.apache.cassandra.db.marshal.UTF8Type.getString(UTF8Type.java:54)
    at org.apache.cassandra.db.marshal.AbstractCompositeType.getString(AbstractCompositeType.java:228)
    at com.fullcontact.sstable.example.JsonColumnParser.getColumnName(JsonColumnParser.java:68)
    at com.fullcontact.sstable.example.JsonColumnParser.serializeColumns(JsonColumnParser.java:103)
    at com.fullcontact.sstable.example.JsonColumnParser.getJson(JsonColumnParser.java:57)
    at com.fullcontact.sstable.example.SimpleExampleMapper.map(SimpleExampleMapper.java:44)
    at com.fullcontact.sstable.example.SimpleExampleMapper.map(SimpleExampleMapper.java:1)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)

Xorlev commented 9 years ago

@gadodia I've just made a PR on this repo which should fix your issue. In the meantime, to get it running, the other option is to change columnNameConvertor in JsonColumnParser to LongType.instance instead of the CompositeType.
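
For context on why a long type fits here, this small sketch decodes the eight bytes from the "invalid UTF8 bytes" message using only the JDK. It shows that the bytes are not valid UTF-8 but do read cleanly as a big-endian long, which can be interpreted as epoch milliseconds. The class and method names are hypothetical, and treating the value as the created_at timestamp is an inference from the table definition, not something the trace states.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.StandardCharsets;
import java.time.Instant;

public class TimestampBytesDemo {

    /** Converts a hex string such as "0000014958e527c8" into raw bytes. */
    static byte[] hexToBytes(String hex) {
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }

    /** Strict UTF-8 check: a fresh decoder reports malformed input by default. */
    static boolean isValidUtf8(byte[] raw) {
        try {
            StandardCharsets.UTF_8.newDecoder().decode(ByteBuffer.wrap(raw));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    /** Reads the first eight bytes as a big-endian long. */
    static long asLong(byte[] raw) {
        return ByteBuffer.wrap(raw).getLong();
    }

    public static void main(String[] args) {
        byte[] raw = hexToBytes("0000014958e527c8");  // bytes from the exception message
        System.out.println("valid UTF-8: " + isValidUtf8(raw));
        long millis = asLong(raw);
        System.out.println("as long: " + millis);
        System.out.println("as instant: " + Instant.ofEpochMilli(millis));
    }
}
```

The value decodes to 1414535653320, a millisecond timestamp in late October 2014, which is consistent with the timestamp clustering column created_at.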

gadodia commented 9 years ago

Okay. Thank you for the effort and prompt response.