brianmhess / cassandra-loader

Delimited file loader for Cassandra
Apache License 2.0

File encoding problem #50

Closed: onesuper closed this issue 7 years ago

onesuper commented 8 years ago

Hi,

In my case, the loader ended up using 'ANSI_X3.4-1968' (an alias for US-ASCII) because I had failed to set the right locale on my machine. I guess the loader does not assume a file encoding itself and instead inherits it from the system property.

Since Cassandra assumes text is a UTF-8 encoded string, it would be nicer if the loader read files as UTF-8.

Thank you onesuper
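A quick way to confirm what the comment above describes is to print the charset the JVM actually picked up. The following is a minimal sketch, not code from cassandra-loader; the class name `DefaultCharsetCheck` and its methods are hypothetical and exist only for illustration.

```java
import java.nio.charset.Charset;

// Hypothetical helper, not part of cassandra-loader.
final class DefaultCharsetCheck {

    // Resolves a charset alias (e.g. "ANSI_X3.4-1968") to its canonical Java name.
    static String canonicalName(String alias) {
        return Charset.forName(alias).name();
    }

    // Reports the file.encoding system property and the default charset
    // the JVM is using for this run.
    static String describe() {
        return "file.encoding=" + System.getProperty("file.encoding")
                + ", defaultCharset=" + Charset.defaultCharset().name();
    }

    public static void main(String[] args) {
        System.out.println(describe());
        System.out.println("ANSI_X3.4-1968 resolves to " + canonicalName("ANSI_X3.4-1968"));
    }
}
```

Running this once with the machine's normal environment and once with -Dfile.encoding=UTF8 should show whether the locale or the flag is determining the encoding.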

brianmhess commented 8 years ago

The only way to set the default Charset is at JVM startup time via command-line flags, such as -Dfile.encoding=UTF8. You could add this yourself by running:

    java -Dfile.encoding=UTF8 -jar cassandra-loader ...

Are you suggesting that we add -Dfile.encoding=UTF8 to the flags we set when building the cassandra-loader executable?
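For comparison, a Java program can also sidestep the JVM-wide default by naming the charset wherever a file is opened. The sketch below assumes nothing about how cassandra-loader actually reads its input; `Utf8LineReader` and `readLines` are hypothetical names used only to illustrate the idea.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of cassandra-loader.
final class Utf8LineReader {

    // Reads every line of the file, decoding bytes as UTF-8
    // regardless of the locale or the file.encoding property.
    static List<String> readLines(Path path) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = Files.newBufferedReader(path, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }
}
```

With this approach the result does not depend on the machine's locale, at the cost of hard-coding UTF-8. A reader opened this way reports malformed byte sequences as an IOException rather than silently substituting characters, so input in another encoding fails loudly.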