fullcontact / hadoop-sstable

Splittable Input Format for Reading Cassandra SSTables Directly
Apache License 2.0

Not working with column of type Set<String> #14

Open gadodia opened 9 years ago

gadodia commented 9 years ago

When it parses a column of type set, it returns the column name with a colon (":") appended. It is not able to parse that column's data and returns an empty value. Do you have any idea why? It does recognize the column as type org.apache.cassandra.db.marshal.SetType(org.apache.cassandra.db.marshal.UTF8Type), but it is unable to do the getString step.

bvanberg commented 9 years ago

Are you able to provide the full stacktrace?

gadodia commented 9 years ago

There isn't any error, so there is no stack trace. Let me debug it and show you where it goes wrong.

gadodia commented 9 years ago

I looked into it. I have a column of type Set in Cassandra, so the data is stored as ["vrids:12345678","","ts"], ["vrids:123456782342","","ts"], ["vrids:1223423345678","","ts"], ... When it tries to parse this, it treats "vrids:12345678" as a composite key, extracts 12345678 as the key, and tries to find it in the CFMetaData. Since that is not the key, the lookup fails. It then keeps vrids:12345678 as the key and null as the value, which results in an empty string. Instead, 12345678 should be the value and vrids should be the key. Hope this helps you understand what is going wrong. It seems to treat every column as a composite type and try to resolve it, whereas it should treat this one as a collection type.
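
To make the expected behavior concrete, here is a minimal sketch in plain Java of the split described above. It is not hadoop-sstable or Cassandra code: the class `CollectionCellNames`, the method `split`, and the idea of passing in the set of known collection column names are all hypothetical, and it only works on the string form of the cell name shown in the dump (for example `vrids:12345678`), not on the binary composite encoding.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Collections;
import java.util.Map;
import java.util.Set;

// Hypothetical helper for illustration only; not part of hadoop-sstable or Cassandra.
final class CollectionCellNames {

    // Splits a cell name such as "vrids:12345678" into (column, element),
    // but only when the part before the first colon is a known collection column.
    // Any other cell name is returned unchanged with a null element.
    static Map.Entry<String, String> split(String cellName, Set<String> collectionColumns) {
        int idx = cellName.indexOf(':');
        if (idx < 0) {
            return new SimpleEntry<>(cellName, null);
        }
        String column = cellName.substring(0, idx);
        if (!collectionColumns.contains(column)) {
            return new SimpleEntry<>(cellName, null);
        }
        // For a set, the element lives in the cell name; the stored cell value is empty.
        return new SimpleEntry<>(column, cellName.substring(idx + 1));
    }

    public static void main(String[] args) {
        // Assumption: "vrids" is the only collection column in this column family.
        Set<String> collections = Collections.singleton("vrids");
        Map.Entry<String, String> e = split("vrids:12345678", collections);
        System.out.println(e.getKey() + " -> " + e.getValue()); // prints "vrids -> 12345678"
    }
}
```

With this shape, `vrids` is looked up in the metadata as the column name and `12345678` is kept as one element of the set, rather than `12345678` being looked up as a key.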