datasalt / splout-db

A web-latency SQL spout for Hadoop.

deploy: how to define my own encoding, for example UTF-8? #43

Closed: suolemen closed this issue 9 years ago

suolemen commented 9 years ago

Command: hadoop jar splout-hadoop-0.3.0-hadoop.jar deploy -root out-hive-simple -ts tmp_sqlout -q http://..._:4412

or

hadoop jar splout-hadoop-0.3.0-hadoop.jar simple-generate -libjars parquet-cascading.jar,parquet-column.jar,parquet-common.jar,parquet-encoding.jar,parquet-format.jar,parquet-generator.jar,parquet-hadoop-bundle.jar,parquet-hadoop.jar,parquet-hive-1.2.8.jar,parquet-hive-binding-bundle-1.4.1.jar,parquet-hive-storage-handler-1.4.1.jar,parquet-scrooge.jar,parquet-thrift.jar -it HIVE -hdb default -htn liyg_not_parquet -o out-hive-simple -pby areacode -p 2 -t inventory_liyg_test333_of_me -tb tmp_sqlout

How do I define the encoding, e.g. UTF-8?

pereferrera commented 9 years ago

Hello suolemen,

What problem do you have exactly with encoding? Is your generated database not properly encoded?

You can explicitly define the encoding of a database using SQLite PRAGMA commands. To do that you need to use the "Generator" (not the "Simple generator") and use the initialStatements option to pass a PRAGMA command such as PRAGMA encoding = "UTF-8"; (see http://sploutsql.com/user_guide.html).
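To illustrate what such a PRAGMA does at the SQLite level, here is a minimal sketch using Python's standard sqlite3 module. It does not use Splout's Generator API; the function name, table, and sample row are made up for the illustration. In SQLite the encoding can only be chosen before the database has any content, which is presumably why the PRAGMA has to be passed as an initial statement rather than run later.

```python
import os
import sqlite3
import tempfile


def create_db_with_encoding(path, encoding):
    """Create a new SQLite file with the given text encoding (hypothetical helper).

    Returns the encoding that SQLite reports after the first table is created.
    """
    conn = sqlite3.connect(path)
    try:
        # Must run before the first CREATE TABLE: SQLite fixes the
        # encoding once the database file has been initialised.
        conn.execute(f'PRAGMA encoding = "{encoding}"')
        # Illustrative table only; not the schema Splout generates.
        conn.execute("CREATE TABLE inventory (areacode TEXT, name TEXT)")
        conn.execute("INSERT INTO inventory VALUES (?, ?)", ("010", "北京"))
        conn.commit()
        return conn.execute("PRAGMA encoding").fetchone()[0]
    finally:
        conn.close()


if __name__ == "__main__":
    tmp_dir = tempfile.mkdtemp()
    print(create_db_with_encoding(os.path.join(tmp_dir, "demo.db"), "UTF-8"))
```

Note that UTF-8 is already SQLite's default for a newly created database, so an explicit PRAGMA mainly matters when something else in the pipeline changes it.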

Encoding problems are sometimes hard to debug. To find out whether the encoding problem is inside SQLite, you can open one of the generated binary files (e.g. 0.db) with the "sqlite" command. If the database is properly encoded, then the problem is probably somewhere between the web client and the web server (the QNodes).
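As an alternative to opening the file interactively, the check can be scripted. The sketch below, again plain Python sqlite3 rather than anything Splout-specific, reads the text-encoding field from the SQLite file header (a 4-byte big-endian integer at offset 56, where 1 = UTF-8, 2 = UTF-16le, 3 = UTF-16be) and compares it with what PRAGMA encoding reports. The function names are hypothetical, and the path is supplied on the command line (for example 0.db, as mentioned above).

```python
import sqlite3
import struct
import sys

# Values of the "text encoding" header field in the SQLite file format.
HEADER_ENCODINGS = {1: "UTF-8", 2: "UTF-16le", 3: "UTF-16be"}


def header_encoding(path):
    """Read the text encoding directly from the 100-byte SQLite file header."""
    with open(path, "rb") as f:
        header = f.read(100)
    if len(header) < 100 or not header.startswith(b"SQLite format 3\x00"):
        raise ValueError(f"{path} does not look like a SQLite 3 database")
    (code,) = struct.unpack(">I", header[56:60])
    return HEADER_ENCODINGS.get(code, f"unknown ({code})")


def pragma_encoding(path):
    """Ask SQLite itself which encoding the database uses."""
    conn = sqlite3.connect(path)
    try:
        return conn.execute("PRAGMA encoding").fetchone()[0]
    finally:
        conn.close()


if __name__ == "__main__" and len(sys.argv) > 1:
    db_path = sys.argv[1]  # e.g. a generated partition file such as 0.db
    print("header:", header_encoding(db_path))
    print("pragma:", pragma_encoding(db_path))
```

If both report UTF-8 and the stored text still looks wrong in query results, that points away from SQLite and towards the client or QNode side, as suggested above.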