prashant-r / Scalaris

DHT Chord Transaction
Apache License 2.0
0 stars 0 forks source link

boot.sh crashes under synthetic load #7

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago
What steps will reproduce the problem?
* start boot.sh
* startup a trivial java interface that accepts HTTP PUTs and turns them
into scalaris OtpErlangString writes
* nail that web interface with many synthetic 1k PUTs
* non-deterministic crash - happened after a few hours of developing the
interface

What is the expected output?
* value is committed despite heavy load

What do you see instead?
* crashed boot.sh

What version of the product are you using? On what operating system?
* SVN http://scalaris.googlecode.com/svn/trunk revision 121
* Mac OS X 10.5.6
* Erlang (BEAM) emulator version 5.6.5 [source] [smp:2] [async-threads:0]
[kernel-poll:false]

Please provide any additional information below.
* code for REST interface available upon request
* this was after experimenting with trying to do erlang type conversion in
java - ala

                    OtpOutputStream oos = new OtpOutputStream(otpValue);
                    OtpInputStream ois = new OtpInputStream(oos.toByteArray());
                    OtpErlangBinary bin;
                    String answer = "";
                    try {
                        bin = new OtpErlangBinary(ois);
                        answer = new String(bin.binaryValue());
                    } catch (OtpErlangDecodeException e) {
                        // TODO Auto-generated catch block
                        status = HttpServletResponse.SC_INTERNAL_SERVER_ERROR; //500
                        error = e.toString();
                        is_error = true;
                    }
                    response.getOutputStream().print(answer);

(because i wanted a byte-for-byte reproduction of what was in the store put
into the OutputStream.)

Original issue reported on code.google.com by po...@lindenlab.com on 15 Jan 2009 at 8:26

GoogleCodeExporter commented 8 years ago
poppy@levitate:~/Source/scalaris-read-only/bin$ ./boot.sh 
Erlang (BEAM) emulator version 5.6.5 [source] [smp:2] [async-threads:4]
[kernel-poll:false]

Eshell V5.6.5  (abort with ^G)
(boot@levitate)1> Config files: ["scalaris.cfg","scalaris.local.cfg"]
Can't load config file "scalaris.local.cfg": enoent. Ignoring.
listening on {0,0,0,0}:14195

=INFO REPORT==== 14-Jan-2009::18:58:46 ===
Yaws: Listening to 0.0.0.0:8000 for servers
 - http://localhost:8000 under ../docroot
[ I | RM     | <0.92.0> ] starting ring maintainer
[ I | RT     | <0.94.0> ] starting ringtable
this() == {{127,0,0,1},14195}
[ I | Node   | <0.97.0> ] joining 135547905503803006856817392082617988910 
[ I | Node   | <0.97.0> ] join as first 135547905503803006856817392082617988910 
[ I | FD     | <0.69.0> ] starting pinger for {{127,0,0,1},14195,<0.97.0>}
[ I | Node   | <0.97.0> ] joined

=ERROR REPORT==== 14-Jan-2009::20:38:39 ===
Error in process <0.85.0> on node 'boot@levitate' with exit value:
{{badmatch,undefined},[{proc_lib,exit_p,2}]}

=ERROR REPORT==== 14-Jan-2009::23:03:29 ===
yaws: Failed to accept - terminating: {error,emfile}

=ERROR REPORT==== 14-Jan-2009::23:03:29 ===
Accept proc died, terminate gserv
=ERROR REPORT==== 14-Jan-2009::23:03:29 ===
yaws: FATAL gserv died shutdown

=ERROR REPORT==== 14-Jan-2009::23:03:29 ===
** Generic server yaws_server terminating 
** Last message in was {'EXIT',<0.84.0>,shutdown}
** When Server state == {state,
                            {gconf,
                                "/Volumes/Data/Source/scalaris-read-only/contrib/yaws",
                                false,204,"../log",

["/Volumes/Data/Source/scalaris-read-only/contrib/yaws/examples/ebin"],
                                [],15000,400,1000000,8000,10240,[],10000000,
                                30,

["/Volumes/Data/Source/scalaris-read-only/contrib/yaws/examples/include"],
                                "/usr/bin/php-cgi","Yaws 1.77","default",
                                false},
                            [{<0.84.0>,
                              [{sconf,8000,3,[],undefined,undefined,
                                   "../docroot",[],
                                   {0,0,0,0},
                                   "localhost",40985,undefined,[],nolimit,[],
                                   yaws_404,yaws_404,yaws,
                                   "boot_server_1995904074",undefined,
                                   [yaws,php,cgi],
                                   [],[]}]}],
                            14,true}
** Reason for termination == 
** {restartme,[{yaws_server,handle_info,2},
               {gen_server,handle_msg,5},
               {proc_lib,init_p_do_apply,3}]}

=ERROR REPORT==== 14-Jan-2009::23:03:29 ===
Error in process <0.8016.6805> on node 'boot@levitate' with exit value:
{emfile,[{erlang,open_port,[{spawn,"/bin/sh -s unix:cmd
2>&1"},[stream]]},{os,start_port,0},{os,'-unix_cmd/1-fun-0-',2}]}

=ERROR REPORT==== 14-Jan-2009::23:03:29 ===
Failed to list "/Users/poppy/.yaws/yaws/default" probably due to permission 
errs: emfile
=ERROR REPORT==== 14-Jan-2009::23:03:30 ===
** Generic server yaws_server terminating 
** Last message in was {setconf,
                           {gconf,
                               "/Volumes/Data/Source/scalaris-read-only/contrib/yaws",
                               false,204,"../log",

["/Volumes/Data/Source/scalaris-read-only/contrib/yaws/examples/ebin"],
                               [],15000,400,1000000,8000,10240,[],10000000,30,

["/Volumes/Data/Source/scalaris-read-only/contrib/yaws/examples/include"],
                               "/usr/bin/php-cgi","Yaws 1.77","default",false},
                           [[{sconf,8000,3,[],undefined,undefined,
                                 "../docroot",[],
                                 {0,0,0,0},
                                 "localhost",undefined,undefined,[],nolimit,
                                 [],yaws_404,yaws_404,yaws,
                                 "boot_server_1995904074",undefined,
                                 [yaws,php,cgi],
                                 [],[]}]]}
** When Server state == {state,undefined,[],0,true}
** Reason for termination == 
** {emfile,[{yaws_server,setup_dirs,1},
            {yaws_server,init2,5},
            {yaws_server,handle_call,3},
            {gen_server,handle_msg,5},
            {proc_lib,init_p_do_apply,3}]}

=ERROR REPORT==== 14-Jan-2009::23:03:30 ===
Error in process <0.9982.6805> on node 'boot@levitate' with exit value:
{emfile,[{erlang,open_port,[{spawn,"/bin/sh -s unix:cmd
2>&1"},[stream]]},{os,start_port,0},{os,'-unix_cmd/1-fun-0-',2}]}

=ERROR REPORT==== 14-Jan-2009::23:03:30 ===
Failed to list "/Users/poppy/.yaws/yaws/default" probably due to permission 
errs: emfile
=ERROR REPORT==== 14-Jan-2009::23:03:30 ===
** Generic server yaws_server terminating 
** Last message in was {setconf,
                           {gconf,
                               "/Volumes/Data/Source/scalaris-read-only/contrib/yaws",
                               false,204,"../log",

["/Volumes/Data/Source/scalaris-read-only/contrib/yaws/examples/ebin"],
                               [],15000,400,1000000,8000,10240,[],10000000,30,

["/Volumes/Data/Source/scalaris-read-only/contrib/yaws/examples/include"],
                               "/usr/bin/php-cgi","Yaws 1.77","default",false},
                           [[{sconf,8000,3,[],undefined,undefined,
                                 "../docroot",[],
                                 {0,0,0,0},
                                 "localhost",undefined,undefined,[],nolimit,
                                 [],yaws_404,yaws_404,yaws,
                                 "boot_server_1995904074",undefined,
                                 [yaws,php,cgi],
                                 [],[]}]]}
** When Server state == {state,undefined,[],0,true}
** Reason for termination == 
** {emfile,[{yaws_server,setup_dirs,1},
            {yaws_server,init2,5},
            {yaws_server,handle_call,3},
            {gen_server,handle_msg,5},
            {proc_lib,init_p_do_apply,3}]}

=ERROR REPORT==== 14-Jan-2009::23:03:30 ===
Error in process <0.12243.6805> on node 'boot@levitate' with exit value:
{emfile,[{erlang,open_port,[{spawn,"/bin/sh -s unix:cmd
2>&1"},[stream]]},{os,start_port,0},{os,'-unix_cmd/1-fun-0-',2}]}

=ERROR REPORT==== 14-Jan-2009::23:03:30 ===
Failed to list "/Users/poppy/.yaws/yaws/default" probably due to permission 
errs: emfile
=ERROR REPORT==== 14-Jan-2009::23:03:30 ===
** Generic server yaws_server terminating 
** Last message in was {setconf,
                           {gconf,
                               "/Volumes/Data/Source/scalaris-read-only/contrib/yaws",
                               false,204,"../log",

["/Volumes/Data/Source/scalaris-read-only/contrib/yaws/examples/ebin"],
                               [],15000,400,1000000,8000,10240,[],10000000,30,

["/Volumes/Data/Source/scalaris-read-only/contrib/yaws/examples/include"],
                               "/usr/bin/php-cgi","Yaws 1.77","default",false},
                           [[{sconf,8000,3,[],undefined,undefined,
                                 "../docroot",[],
                                 {0,0,0,0},
                                 "localhost",undefined,undefined,[],nolimit,
                                 [],yaws_404,yaws_404,yaws,
                                 "boot_server_1995904074",undefined,
                                 [yaws,php,cgi],
                                 [],[]}]]}
** When Server state == {state,undefined,[],0,true}
** Reason for termination == 
** {emfile,[{yaws_server,setup_dirs,1},
            {yaws_server,init2,5},
            {yaws_server,handle_call,3},
            {gen_server,handle_msg,5},
            {proc_lib,init_p_do_apply,3}]}

=ERROR REPORT==== 14-Jan-2009::23:03:30 ===
Error in process <0.13534.6805> on node 'boot@levitate' with exit value:
{emfile,[{erlang,open_port,[{spawn,"/bin/sh -s unix:cmd
2>&1"},[stream]]},{os,start_port,0},{os,'-unix_cmd/1-fun-0-',2}]}

=ERROR REPORT==== 14-Jan-2009::23:03:30 ===
Failed to list "/Users/poppy/.yaws/yaws/default" probably due to permission 
errs: emfile
=ERROR REPORT==== 14-Jan-2009::23:03:30 ===
** Generic server yaws_server terminating 
** Last message in was {setconf,
                           {gconf,
                               "/Volumes/Data/Source/scalaris-read-only/contrib/yaws",
                               false,204,"../log",

["/Volumes/Data/Source/scalaris-read-only/contrib/yaws/examples/ebin"],
                               [],15000,400,1000000,8000,10240,[],10000000,30,

["/Volumes/Data/Source/scalaris-read-only/contrib/yaws/examples/include"],
                               "/usr/bin/php-cgi","Yaws 1.77","default",false},
                           [[{sconf,8000,3,[],undefined,undefined,
                                 "../docroot",[],
                                 {0,0,0,0},
                                 "localhost",undefined,undefined,[],nolimit,
                                 [],yaws_404,yaws_404,yaws,
                                 "boot_server_1995904074",undefined,
                                 [yaws,php,cgi],
                                 [],[]}]]}
** When Server state == {state,undefined,[],0,true}
** Reason for termination == 
** {emfile,[{yaws_server,setup_dirs,1},
            {yaws_server,init2,5},
            {yaws_server,handle_call,3},
            {gen_server,handle_msg,5},
            {proc_lib,init_p_do_apply,3}]}

=ERROR REPORT==== 14-Jan-2009::23:03:30 ===
Error in process <0.15175.6805> on node 'boot@levitate' with exit value:
{emfile,[{erlang,open_port,[{spawn,"/bin/sh -s unix:cmd
2>&1"},[stream]]},{os,start_port,0},{os,'-unix_cmd/1-fun-0-',2}]}

=ERROR REPORT==== 14-Jan-2009::23:03:30 ===
Failed to list "/Users/poppy/.yaws/yaws/default" probably due to permission 
errs: emfile
=ERROR REPORT==== 14-Jan-2009::23:03:30 ===
** Generic server yaws_server terminating 
** Last message in was {setconf,
                           {gconf,
                               "/Volumes/Data/Source/scalaris-read-only/contrib/yaws",
                               false,204,"../log",

["/Volumes/Data/Source/scalaris-read-only/contrib/yaws/examples/ebin"],
                               [],15000,400,1000000,8000,10240,[],10000000,30,

["/Volumes/Data/Source/scalaris-read-only/contrib/yaws/examples/include"],
                               "/usr/bin/php-cgi","Yaws 1.77","default",false},
                           [[{sconf,8000,3,[],undefined,undefined,
                                 "../docroot",[],
                                 {0,0,0,0},
                                 "localhost",undefined,undefined,[],nolimit,
                                 [],yaws_404,yaws_404,yaws,
                                 "boot_server_1995904074",undefined,
                                 [yaws,php,cgi],
                                 [],[]}]]}
** When Server state == {state,undefined,[],0,true}
** Reason for termination == 
** {emfile,[{yaws_server,setup_dirs,1},
            {yaws_server,init2,5},
            {yaws_server,handle_call,3},
            {gen_server,handle_msg,5},
            {proc_lib,init_p_do_apply,3}]}

=ERROR REPORT==== 14-Jan-2009::23:03:30 ===
** Generic server <0.95.0> terminating 
** Last message in was {'EXIT',<0.90.0>,killed}
** When Server state == {state,
                            {<0.95.0>,cs_sup_and},
                            one_for_all,
                            [{child,<0.97.0>,cs_node,
                                 {cs_node,start_link,["cs_node_2615678616"]},
                                 permanent,brutal_kill,worker,[]},
                             {child,<0.96.0>,cs_db_otp,
                                 {cs_db_otp,start_link,["cs_node_2615678616"]},
                                 permanent,brutal_kill,worker,[]}],
                            {dict,0,16,16,8,80,48,
                                {[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],
                                 []},
                                {{[],[],[],[],[],[],[],[],[],[],[],[],[],[],
                                  [],[]}}},
                            10,1,[],cs_sup_and,
                            ["cs_node_2615678616"]}
** Reason for termination == 
** killed

=INFO REPORT==== 14-Jan-2009::23:03:30 ===
    application: boot_cs
    exited: shutdown
    type: temporary

(boot@levitate)1> 

Original comment by po...@lindenlab.com on 15 Jan 2009 at 8:30

GoogleCodeExporter commented 8 years ago
Hi,

thanks for your issue report on scalaris. Here is what I found out:

- The first error message at 20:38 seems to be related to yaws (the builtin 
webserver). They released a new version of yaws a week ago, which gives a more 
meaningful error message in this case or fixes the problem. However, they 
pulled the new version yesterday, because of some undisclosed bugs. As soon as 
they re-release the code, I will update scalaris to the new yaws version. 
Either the problem goes away or we get a better error message.

- The second bunch of error messages is around 23:03. These error messages 
hint at scalaris running out of file descriptors (emfile). Could you check 
with lsof what kind of files scalaris is opening? My best bet at the moment is 
that there are a lot of TCP connections between scalaris and your Java server.

Each Scalaris object will create its own TCP connection. I don't know whether 
you are reusing Scalaris objects on the server or not. With 
Scalaris.closeConnection() you can explicitly close the TCP connection once 
you don't need the Scalaris object anymore.

Thorsten

Original comment by schu...@gmail.com on 16 Jan 2009 at 8:27

GoogleCodeExporter commented 8 years ago

Original comment by schu...@gmail.com on 27 Mar 2009 at 2:44