brianmhess / cassandra-loader

Delimited file loader for Cassandra
Apache License 2.0

Loader NPEs in com.datastax.driver.core.BoundStatement.bind #18

KenMcK1 closed 6 years ago

KenMcK1 commented 8 years ago

Something is going wrong when binding values. It's hard to be more specific, as there's nothing in the loader log files and no indication of which row caused the problem. The table contains three text fields and one map<text, text>. Some entries in the map contain a name and no value, along the lines of "{abc:,def:}".

Exception in thread "main" java.util.concurrent.ExecutionException: java.lang.NullPointerException
    at java.util.concurrent.FutureTask.report(FutureTask.java:122)
    at java.util.concurrent.FutureTask.get(FutureTask.java:188)
    at com.datastax.loader.CqlDelimLoad.run(CqlDelimLoad.java:558)
    at com.datastax.loader.CqlDelimLoad.main(CqlDelimLoad.java:599)
Caused by: java.lang.NullPointerException
    at com.datastax.driver.core.BoundStatement.bind(BoundStatement.java:191)
    at com.datastax.driver.core.DefaultPreparedStatement.bind(DefaultPreparedStatement.java:103)
    at com.datastax.loader.CqlDelimLoadTask.execute(CqlDelimLoadTask.java:231)
    at com.datastax.loader.CqlDelimLoadTask.call(CqlDelimLoadTask.java:145)
    at com.datastax.loader.CqlDelimLoadTask.call(CqlDelimLoadTask.java:69)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)

al3xandru commented 8 years ago

I think the issue here is that Cassandra doesn't accept null values inside a map. The driver could handle this situation more carefully (see https://github.com/datastax/java-driver/blob/2.1.6/driver-core/src/main/java/com/datastax/driver/core/BoundStatement.java#L191) and reject the value explicitly. On the other hand, I'm not really sure what the loader should do in this case. Reject the record? Clean it up?
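To make the failure mode concrete, here is a minimal sketch of a check the loader could run before handing a parsed map to the driver. It assumes the loader's parser turns an entry with no value (as in "{abc:,def:}") into a null map value, which the thread does not confirm. The class `MapNullGuard` and the method `keysWithNullValues` are hypothetical names; they exist neither in cassandra-loader nor in the DataStax driver.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper; not part of cassandra-loader or the DataStax driver.
final class MapNullGuard {
    private MapNullGuard() {}

    /** Returns the keys whose value is null, in the map's iteration order. */
    static List<String> keysWithNullValues(Map<String, String> parsed) {
        List<String> offending = new ArrayList<>();
        for (Map.Entry<String, String> entry : parsed.entrySet()) {
            if (entry.getValue() == null) {
                offending.add(entry.getKey());
            }
        }
        return offending;
    }

    public static void main(String[] args) {
        // Assumed result of parsing "{abc:,def:}": both keys map to null.
        Map<String, String> parsed = new LinkedHashMap<>();
        parsed.put("abc", null);
        parsed.put("def", null);

        List<String> offending = keysWithNullValues(parsed);
        if (!offending.isEmpty()) {
            // Reporting the keys gives the user something to search for in the input file.
            System.err.println("Map column has entries without a value: " + offending);
        }
    }
}
```

Reporting the offending keys, rather than letting the bind call fail, would also address the original complaint that nothing indicates which row caused the problem.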

brianmhess commented 8 years ago

This is a good question. What behavior is desired? Should it be deemed a bad parse and put in the BADPARSE file?

al3xandru commented 8 years ago

Cleaning up the value would silently alter data that might carry meaning, so my vote is to reject the row and place it in the BADPARSE file.
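A rough sketch of the "reject the row" option follows. It is not the loader's actual parser: `BadParseSketch`, `parseMap`, and `loadOrReject` are hypothetical names, the map syntax is assumed to be the simple "{key:value,key:value}" form from the report (no quoting or escaping), and the `PrintWriter` stands in for whatever writes the BADPARSE file.

```java
import java.io.PrintWriter;
import java.text.ParseException;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch; names and map syntax are assumptions, not cassandra-loader code.
final class BadParseSketch {
    private BadParseSketch() {}

    /** Parses "{k1:v1,k2:v2}" and throws if any entry lacks a value. */
    static Map<String, String> parseMap(String literal) throws ParseException {
        String s = literal.trim();
        if (s.length() < 2 || s.charAt(0) != '{' || s.charAt(s.length() - 1) != '}') {
            throw new ParseException("map literal must be wrapped in {}", 0);
        }
        Map<String, String> out = new LinkedHashMap<>();
        String body = s.substring(1, s.length() - 1).trim();
        if (body.isEmpty()) {
            return out; // "{}" is a valid empty map
        }
        for (String entry : body.split(",")) {
            int colon = entry.indexOf(':');
            if (colon < 0) {
                throw new ParseException("missing ':' in entry '" + entry + "'", 0);
            }
            String key = entry.substring(0, colon).trim();
            String value = entry.substring(colon + 1).trim();
            if (value.isEmpty()) {
                // Reject instead of storing null or silently dropping the key.
                throw new ParseException("no value for key '" + key + "'", 0);
            }
            out.put(key, value);
        }
        return out;
    }

    /** Returns true if the row parsed; otherwise writes the raw line to badParse. */
    static boolean loadOrReject(String rawLine, String mapField, PrintWriter badParse) {
        try {
            parseMap(mapField);
            return true; // a real loader would bind and execute the statement here
        } catch (ParseException e) {
            badParse.println(rawLine); // keep the row untouched for later inspection
            return false;
        }
    }

    public static void main(String[] args) {
        PrintWriter badParse = new PrintWriter(System.err, true);
        loadOrReject("a|b|c|{abc:,def:}", "{abc:,def:}", badParse);
    }
}
```

Writing the raw line unchanged means the rejected row can be corrected and reloaded later, which matches the concern about not altering values that might carry meaning.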