basho / cuttlefish

never lose your childlike sense of wonder baby cuttlefish, promise me?
Apache License 2.0

Test: UTF-8 vs latin-1 regression #140

Closed joedevivo closed 10 years ago

joedevivo commented 10 years ago

We need to determine whether Riak 1.4's app.config and vm.args files can accept UTF-8 values or whether they are restricted to latin-1. Multi-backend bucket names are one example.

If Riak 1.4 can accept UTF-8 values, cuttlefish needs to accept them as well. If it can't, then it's desirable for cuttlefish to detect non-latin-1 files and print an error message, but that might be a 2.0.1 fix.
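
The detection described above could look roughly like the following. This is a minimal Python sketch, not cuttlefish's implementation: `find_non_latin1_lines` is a hypothetical helper, and it assumes the config file has already been decoded as UTF-8 text. It reports every line that contains a character above code point 255, which is the upper limit of latin-1.

```python
def find_non_latin1_lines(text):
    """Return (line_number, offending_chars) for lines that do not fit in latin-1.

    Hypothetical helper for illustration; assumes `text` was decoded from UTF-8.
    """
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        # latin-1 covers exactly the code points 0..255.
        offending = [ch for ch in line if ord(ch) > 255]
        if offending:
            problems.append((number, offending))
    return problems


# Made-up sample config: the second line contains U+0152 (the OE ligature).
sample = "platform_data_dir = ./data\ndistributed_cookie = riak\u0152\n"
for number, chars in find_non_latin1_lines(sample):
    print(f"line {number}: not representable in latin-1: {chars}")
```

Reporting the line number lets the user go straight to the offending setting instead of hunting through the whole file.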

joedevivo commented 10 years ago

Tried the following settings for platform_data_dir in Riak 1.4.8:

./dataŒ
./dataŸ

which parsed fine, but created the following directories:

drwxr-xr-x   3 joe  staff   102 Apr  2 13:18 dataÅ?
drwxr-xr-x   3 joe  staff   102 Apr  2 13:18 dataŸ

which seems bad.
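
One plausible explanation for the mangled name, offered as an assumption rather than something verified against Riak 1.4: the file is saved as UTF-8, but its bytes are read as latin-1, one character per byte. The Python sketch below reproduces that effect for Œ (U+0152).

```python
# U+0152 (the OE ligature) is the two bytes 0xC5 0x92 in UTF-8.
utf8_bytes = "./data\u0152".encode("utf-8")

# Reading those bytes as latin-1 yields one character per byte:
# 0xC5 becomes U+00C5 (A with ring) and 0x92 becomes an unprintable
# C1 control character.
misread = utf8_bytes.decode("latin-1")

print(utf8_bytes)
print(repr(misread))
print(oct(0x92))  # the same byte written in octal
```

The byte 0x92 is 222 in octal, so a byte-per-character reading of Œ would show up as Å followed by an unprintable character.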

as for vm.args,

## Name of the riak node
-name riakŒ@127.0.0.1

and the node won't even start:

➜  riak-1.4.8  ./bin/riak console
config is OK
Exec: /Users/joe/Downloads/riak-1.4.8/bin/../erts-5.9.1/bin/erlexec -boot /Users/joe/Downloads/riak-1.4.8/bin/../releases/1.4.8/riak              -config /Users/joe/Downloads/riak-1.4.8/bin/../etc/app.config             -pa /Users/joe/Downloads/riak-1.4.8/bin/../lib/basho-patches             -args_file /Users/joe/Downloads/riak-1.4.8/bin/../etc/vm.args -- console
Root: /Users/joe/Downloads/riak-1.4.8/bin/..
{error_logger,{{2014,4,2},{13,24,58}},"Invalid node name: ~p~n",['riak?\222@127.0.0.1']}
{error_logger,{{2014,4,2},{13,24,58}},crash_report,[[{initial_call,{net_kernel,init,['Argument__1']}},{pid,<0.20.0>},{registered_name,[]},{error_info,{exit,{error,badarg},[{gen_server,init_it,6,[{file,"gen_server.erl"},{line,320}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,227}]}]}},{ancestors,[net_sup,kernel_sup,<0.10.0>]},{messages,[]},{links,[<0.17.0>]},{dictionary,[{longnames,true}]},{trap_exit,true},{status,running},{heap_size,987},{stack_size,24},{reductions,518}],[]]}
{error_logger,{{2014,4,2},{13,24,58}},supervisor_report,[{supervisor,{local,net_sup}},{errorContext,start_error},{reason,{'EXIT',nodistribution}},{offender,[{pid,undefined},{name,net_kernel},{mfargs,{net_kernel,start_link,[['riak?\222@127.0.0.1',longnames]]}},{restart_type,permanent},{shutdown,2000},{child_type,worker}]}]}
{error_logger,{{2014,4,2},{13,24,58}},supervisor_report,[{supervisor,{local,kernel_sup}},{errorContext,start_error},{reason,shutdown},{offender,[{pid,undefined},{name,net_sup},{mfargs,{erl_distribution,start_link,[]}},{restart_type,permanent},{shutdown,infinity},{child_type,supervisor}]}]}
{error_logger,{{2014,4,2},{13,24,58}},std_info,[{application,kernel},{exited,{shutdown,{kernel,start,[normal,[]]}}},{type,permanent}]}
{"Kernel pid terminated",application_controller,"{application_start_failure,kernel,{shutdown,{kernel,start,[normal,[]]}}}"}

Crash dump was written to: ./log/erl_crash.dump
Kernel pid terminated (application_controller) ({application_start_failure,kernel,{shutdown,{kernel,start,[normal,[]]}}})

joedevivo commented 10 years ago

## Cookie for distributed erlang.  All nodes in the same cluster
## should use the same cookie or they will not be able to communicate.
-setcookie riakŒ

(riak@127.0.0.1)1> erlang:get_cookie().
'riakÅ\222'

Looks like UTF-8 characters outside of latin-1 are supported in the Erlang cookie, and I can join two nodes using the above cookie with no problem.

joedevivo commented 10 years ago

Here's a list of characters that seem good for testing this:

The CP1252 characters that are not part of ANSI/ISO 8859-1, and that should therefore always be encoded as Unicode characters greater than 255, are the following:

 Windows   Unicode    Char.
  char.   HTML code   test         Description of Character
  -----     -----     ---          ------------------------
ALT-0130   &#8218;   ‚    Single Low-9 Quotation Mark
ALT-0131   &#402;    ƒ    Latin Small Letter F With Hook
ALT-0132   &#8222;   „    Double Low-9 Quotation Mark
ALT-0133   &#8230;   …    Horizontal Ellipsis
ALT-0134   &#8224;   †    Dagger
ALT-0135   &#8225;   ‡    Double Dagger
ALT-0136   &#710;    ˆ    Modifier Letter Circumflex Accent
ALT-0137   &#8240;   ‰    Per Mille Sign
ALT-0138   &#352;    Š    Latin Capital Letter S With Caron
ALT-0139   &#8249;   ‹    Single Left-Pointing Angle Quotation Mark
ALT-0140   &#338;    Π   Latin Capital Ligature OE
ALT-0145   &#8216;   ‘    Left Single Quotation Mark
ALT-0146   &#8217;   ’    Right Single Quotation Mark
ALT-0147   &#8220;   “    Left Double Quotation Mark
ALT-0148   &#8221;   ”    Right Double Quotation Mark
ALT-0149   &#8226;   •    Bullet
ALT-0150   &#8211;   –    En Dash
ALT-0151   &#8212;   —    Em Dash
ALT-0152   &#732;    ˜    Small Tilde
ALT-0153   &#8482;   ™    Trade Mark Sign
ALT-0154   &#353;    š    Latin Small Letter S With Caron
ALT-0155   &#8250;   ›    Single Right-Pointing Angle Quotation Mark
ALT-0156   &#339;    œ    Latin Small Ligature OE
ALT-0159   &#376;    Ÿ    Latin Capital Letter Y With Diaeresis
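
A list like this can be cross-checked mechanically. The sketch below is Python and treats Python's built-in `cp1252` codec as the definition of the code page, which is an assumption about which CP1252 revision is meant; `cp1252_only_chars` is a hypothetical helper name. It decodes each byte in the range 0x80 to 0x9F and keeps the characters whose code point is above 255.

```python
import unicodedata


def cp1252_only_chars():
    """Map CP1252 byte values to characters that lie outside latin-1."""
    result = {}
    for byte in range(0x80, 0xA0):
        try:
            ch = bytes([byte]).decode("cp1252")
        except UnicodeDecodeError:
            # A few bytes in this range are unassigned in Python's codec.
            continue
        if ord(ch) > 255:
            result[byte] = ch
    return result


for byte, ch in cp1252_only_chars().items():
    print(f"ALT-0{byte}   &#{ord(ch)};   {ch}   {unicodedata.name(ch).title()}")
```

With Python's codec this also reports bytes 128 (€), 142 (Ž) and 158 (ž), which the list above omits.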

joedevivo commented 10 years ago

In riak.conf

##
## Default: ./data
##
## Acceptable values:
##   - the path to a directory
platform_data_dir = ./dataŒ
{platform_data_dir,[46,47,100,97,116,97,338]},
drwxr-xr-x   4 joe  staff   136 Apr 22 08:44 dataŒ
2014-04-22 08:44:48.167 [warning] <0.216.0>@riak_core_ring_manager:reload_ring:355 No ring file available.
2014-04-22 08:44:48.261 [error] <0.222.0> CRASH REPORT Process <0.222.0> with 0 neighbours exited with reason: bad argument in call to eleveldb:destroy([46,47,100,97,116,97,338,47,99,108,117,115,116,101,114,95,109,101,116,97,47,116,114,101,101,115], []) in hashtree:destroy/1 line 262 in gen_server:init_it/6 line 328
2014-04-22 08:44:48.261 [error] <0.202.0> Supervisor riak_core_sup had child riak_core_metadata_hashtree started with riak_core_metadata_hashtree:start_link() at undefined exit with reason bad argument in call to eleveldb:destroy([46,47,100,97,116,97,338,47,99,108,117,115,116,101,114,95,109,101,116,97,47,116,114,101,101,115], []) in hashtree:destroy/1 line 262 in context start_error
2014-04-22 08:44:48.264 [error] <0.200.0> CRASH REPORT Process <0.200.0> with 0 neighbours exited with reason: {{shutdown,{failed_to_start_child,riak_core_metadata_hashtree,{badarg,[{eleveldb,destroy,[[46,47,100,97,116,97,338,47,99,108,117,115,116,101,114,95,109,101,116,97,47,116,114,101,101,115],[]],[]},{hashtree,destroy,1,[{file,"src/hashtree.erl"},{line,262}]},{hashtree_tree,create_node,2,[{file,"src/hashtree_tree.erl"},{line,457}]},{hashtree_tree,new,2,[{file,"src/hashtree_tree.erl"},{line,187}]},{riak_core_metadata_hashtree,init,1,[{file,"src/riak_core_metadata_hashtree.erl"},{line,169}]},{gen_server,...},...]}}},...} in application_master:init/4 line 133
2014-04-22 08:44:48.265 [info] <0.7.0> Application riak_core exited with reason: {{shutdown,{failed_to_start_child,riak_core_metadata_hashtree,{badarg,[{eleveldb,destroy,[[46,47,100,97,116,97,338,47,99,108,117,115,116,101,114,95,109,101,116,97,47,116,114,101,101,115],[]],[]},{hashtree,destroy,1,[{file,"src/hashtree.erl"},{line,262}]},{hashtree_tree,create_node,2,[{file,"src/hashtree_tree.erl"},{line,457}]},{hashtree_tree,new,2,[{file,"src/hashtree_tree.erl"},{line,187}]},{riak_core_metadata_hashtree,init,1,[{file,"src/riak_core_metadata_hashtree.erl"},{line,169}]},{gen_server,...},...]}}},...}

[46,47,100,97,116,97,338,47,99,108,117,115,116,101,114,95,109,101,116,97,47,116,114,101,101,115] = "./dataŒ/cluster_meta/trees"
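
The same conversion can be checked outside the Erlang shell. In this Python sketch the list is copied from the log above. The idea that the `badarg` is caused by 338 not fitting in a single byte is an assumption, not something the log states.

```python
code_points = [46, 47, 100, 97, 116, 97, 338, 47, 99, 108, 117, 115, 116,
               101, 114, 95, 109, 101, 116, 97, 47, 116, 114, 101, 101, 115]

# Turn the list of Unicode code points back into a string.
path = "".join(chr(cp) for cp in code_points)
print(path)

# Every element except 338 fits in one byte.
too_wide = [cp for cp in code_points if cp > 255]
print(too_wide)

try:
    path.encode("latin-1")
except UnicodeEncodeError as err:
    print("not representable in latin-1:", err.reason)
```

If the path has to cross into native code as bytes, it needs an explicit encoding step such as UTF-8 first; a raw list containing 338 cannot be treated as a byte string.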

joedevivo commented 10 years ago

distributed_cookie = riakŒ
09:01:41.482 [info] /Users/joe/Downloads/riak-ee-2.0.0beta1/bin/../etc/advanced.config detected, overlaying proplists
escript: exception error: bad argument
  in function  io_lib:format/2
     called as io_lib:format("~s ~s",['-setcookie',[114,105,97,107,338]])
  in call from cuttlefish_vmargs:stringify_line/2 (src/cuttlefish_vmargs.erl, line 17)
  in call from cuttlefish_vmargs:'-stringify/1-lc$^0/1-0-'/1 (src/cuttlefish_vmargs.erl, line 13)
  in call from cuttlefish_vmargs:'-stringify/1-lc$^0/1-0-'/1 (src/cuttlefish_vmargs.erl, line 13)
  in call from cuttlefish_escript:engage_cuttlefish/1 (src/cuttlefish_escript.erl, line 359)
  in call from cuttlefish_escript:generate/1 (src/cuttlefish_escript.erl, line 235) 

joedevivo commented 10 years ago

So, as far as the data dir goes: 1.4 creates the wrong directory name but still starts, while 2.0 creates the right name but Riak can't start.

joedevivo commented 10 years ago

➜  riak git:(develop) ✗ ./bin/riak console
13:14:28.422 [error] /Users/joe/dev/basho/riak_ee/rel/riak/bin/../etc/riak.conf: Error converting value on line #200 to latin1
Error generating config with cuttlefish
  run `riak config generate -l debug` for more information.

joedevivo commented 10 years ago

EUnit is failing on the builder but passes locally. Will investigate.

joedevivo commented 10 years ago

All good now. I hadn't checked in the test fixtures.

seancribbs commented 10 years ago

:+1: 8ef24f7

joedevivo commented 10 years ago

@borshop merge