K2InformaticsGmbH / dderl

Web based Data Explorer / Data Jump Station with Erlang In-Memory Support
Apache License 2.0
23 stars 12 forks source link

Connection Crash for Query with ROWIDs #569

Open stoch opened 5 years ago

stoch commented 5 years ago

image

SELECT
    REVICS_SEPID, REVICS_SHORTID, REVICS_ROWID_MASTER, REVICS_COUNT_EXPECTED,
    REVICS_TABLE_SLAVE, REVICS_ROWID_SLAVE
FROM
    SBS1_ADMIN.REVI_CONTENT_SUB
WHERE
    REVICS_SHORTID = '788'
    AND REVICS_SEPID = '201901'
    AND REVICS_ROWID_SLAVE IS NULL

CREATE TABLE SBS1_ADMIN.REVI_CONTENT_SUB
(
  REVICS_SEPID           VARCHAR2(6 BYTE)       NOT NULL,
  REVICS_SHORTID         VARCHAR2(6 BYTE)       NOT NULL,
  REVICS_ROWID_MASTER    ROWID                  NOT NULL,
  REVICS_COUNT_EXPECTED  NUMBER(2),
  REVICS_TABLE_SLAVE     VARCHAR2(10 BYTE),
  REVICS_ROWID_SLAVE     ROWID
)
NOCOMPRESS 
TABLESPACE SBS1_INFO_DATA_UC
PCTUSED    0
PCTFREE    30
INITRANS   1
MAXTRANS   255
STORAGE    (
            BUFFER_POOL      DEFAULT
           )
PARTITION BY RANGE (REVICS_SEPID)
(  
  PARTITION INFO201808 VALUES LESS THAN ('201809')
    LOGGING
    NOCOMPRESS 
    TABLESPACE SBS1_INFO_DATA_UC
    PCTFREE    30
    INITRANS   1
    MAXTRANS   255
    STORAGE    (
                INITIAL          8M
                NEXT             1M
                MINEXTENTS       1
                MAXEXTENTS       UNLIMITED
                BUFFER_POOL      DEFAULT
               ),  
...
c-bik commented 5 years ago

@stoch Any details from logs?

stoch commented 5 years ago
2019-01-29 08:55:55.727 [error] <0.30284.22>@oci_adapter:logfun:913 [_DDRL_]
     {ocistmt.cpp:ocistmt::execute:277} failed OCIStmtExecute error
     ORA-01445: Kann keine ROWID aus einer Join-View ohne Basistabelle (der View)
         auswählen oder erfassen
         (select concat(OWNER, concat('.', TABLE_NAME)) QUALIFIED_TABLE_NAME,
           ALL_TABLES.ROWID from ALL_TABLES where OWNER = user
           order by TABLE_NAME asc)
2019-01-29 08:58:59.777 [error] <0.3420.23>@gen_server:call gen_server <0.3420.23>
     terminated with reason: no such process or port in call to gen_server:call(<0.30282.22>,
        {port_call,[8,2991920,2953600]}, infinity) in gen_server:call/3 line 214
2019-01-29 08:58:59.777 [error] <0.3420.23>@gen_server:call CRASH REPORT
     Process <0.3420.23> with 0 neighbours exited with reason: no such process or port in call
     to gen_server:call(<0.30282.22>, {port_call,[8,2991920,2953600]}, infinity)
     in gen_server:call/3 line 214
2019-01-29 08:59:04.740 [error] <0.30005.22>@dderl_session:handle_info:191 [_DDRL_]
     {dderl_session,191} dderl_session received unknown msg {'EXIT',<0.3419.23>,{noproc,
    {gen_server,call,[<0.3420.23>,close,3600000]}}} for <<"system">>
2019-01-29 08:59:04.740 [error] <0.3419.23>@gen_server:call gen_statem <0.3419.23>
     in state filling terminated with reason: no such process or port in call to
     gen_server:call(<0.3420.23>, close, 3600000) in gen_server:call/3 line 214
2019-01-29 08:59:04.741 [error] <0.3419.23>@gen_server:call CRASH REPORT Process <0.3419.23>
     with 0 neighbours exited with reason: no such process or port in call to
     gen_server:call(<0.3420.23>, close, 3600000) in gen_server:call/3 line 214
2019-01-29 08:59:04.767 [error] <0.3516.23>@dderloci:exec:59 [_DDRL_] {dderloci,59}
     run_query(SELECT... IS NULL,undefined,select REVICS_SEPID,... null)
     {noproc,{gen_server,call,[<0.30282.22>,{port_call,[4,2991920,
     <<"select REVICS_SEPID, REVICS_SHORTID, ... is null">>]},infinity]}}
2019-01-29 08:59:04.768 [error] <0.3516.23>@oci_adapter:process_query:696
     [_DDRL_] {oci_adapter,696} query error {error,noproc} for
     <<"SELECT\r\n    REVICS_SEPID, ... IS NULL">> whith bind values undefined
2019-01-29 08:59:04.809 [error] <0.3520.23>@dderl_session:spawn_process_call:648 [_DDRL_]
     {dderl_session,648} Problem processing command: exit:{noproc,{gen_statem,call,[<0.3419.23>, 
    {"get_columns"},infinity]}}
    [{<<"button">>,[{<<"connection">>,<<"g2gDZAA...gAtpzA=">>},{<<"statement">>,
    <<"g2gCZAAJZ...Ag==">>},{<<"binds">>,null},{<<"btn">>,<<"close">>}]}]
[{gen,do_call,4,[{file,"gen.erl"},{line,177}]},
 {gen_statem,call,3,[{file,"gen_statem.erl"},{line,461}]},
 {oci_adapter,gui_resp_cb_fun,3,[{file,"lib/dderl/src/oci_adapter.erl"},{line,617}]},
 {oci_adapter,process_cmd,6,[{file,"lib/dderl/src/oci_adapter.erl"},{line,511}]},
 {dderl_session,spawn_process_call,8,[{file,"lib/dderl/src/dderl_session.erl"},{line,645}]}]
c-bik commented 5 years ago

@stoch From the log timestamps I can tell that the following crash

2019-01-29 08:59:04.767 [error] <0.3516.23>@dderloci:exec:59 [_DDRL_] {dderloci,59}
     run_query(SELECT... IS NULL,undefined,select REVICS_SEPID,... null)
     {noproc,{gen_server,call,[<0.30282.22>,{port_call,[4,2991920,
     <<"select REVICS_SEPID, REVICS_SHORTID, ... is null">>]},infinity]}}

...happned due to non existant erloci session which crashed earlier here...

2019-01-29 08:55:55.727 [error] <0.30284.22>@oci_adapter:logfun:913 [_DDRL_]
     {ocistmt.cpp:ocistmt::execute:277} failed OCIStmtExecute error
     ORA-01445: Kann keine ROWID aus einer Join-View ohne Basistabelle (der View)
         auswählen oder erfassen
         (select concat(OWNER, concat('.', TABLE_NAME)) QUALIFIED_TABLE_NAME,
           ALL_TABLES.ROWID from ALL_TABLES where OWNER = user
           order by TABLE_NAME asc)

This looks like a DDErl inbuilt query image ..being modified. I am, however, not sure if the correction/ammend is valid!

Isn't Oracle all_tables is a view? Does Oracle views can have ROWIDs?

cc @acautin

stoch commented 5 years ago

@c-bik Are you sure that ORA-01445 caused a session crash? This is a very normal and expected response for which we would suppress the logs if we could.

c-bik commented 5 years ago

@stoch You are right.

This has nothing to do with erloci at all!

If you look at the the tail of the crash

2019-01-29 08:58:59.777 [error] <0.3420.23>@gen_server:call gen_server <0.3420.23>
     terminated with reason: no such process or port in call to gen_server:call(<0.30282.22>,
        {port_call,[8,2991920,2953600]}, infinity) in gen_server:call/3 line 214

2019-01-29 08:58:59.777 [error] <0.3420.23>@gen_server:call CRASH REPORT
     Process <0.3420.23> with 0 neighbours exited with reason: no such process or port in call
     to gen_server:call(<0.30282.22>, {port_call,[8,2991920,2953600]}, infinity)
     in gen_server:call/3 line 214

2019-01-29 08:59:04.740 [error] <0.30005.22>@dderl_session:handle_info:191 [_DDRL_]
     {dderl_session,191} dderl_session received unknown msg {'EXIT',<0.3419.23>,{noproc,
    {gen_server,call,[<0.3420.23>,close,3600000]}}} for <<"system">>

2019-01-29 08:59:04.740 [error] <0.3419.23>@gen_server:call gen_statem <0.3419.23>
     in state filling terminated with reason: no such process or port in call to
     gen_server:call(<0.3420.23>, close, 3600000) in gen_server:call/3 line 214

2019-01-29 08:59:04.741 [error] <0.3419.23>@gen_server:call CRASH REPORT Process <0.3419.23>
     with 0 neighbours exited with reason: no such process or port in call to
     gen_server:call(<0.3420.23>, close, 3600000) in gen_server:call/3 line 214

2019-01-29 08:59:04.767 [error] <0.3516.23>@dderloci:exec:59 [_DDRL_] {dderloci,59}
     run_query(SELECT... IS NULL,undefined,select REVICS_SEPID,... null)
     {noproc,{gen_server,call,[<0.30282.22>,{port_call,[4,2991920,
     <<"select REVICS_SEPID, REVICS_SHORTID, ... is null">>]},infinity]}}

2019-01-29 08:59:04.768 [error] <0.3516.23>@oci_adapter:process_query:696
     [_DDRL_] {oci_adapter,696} query error {error,noproc} for
     <<"SELECT\r\n    REVICS_SEPID, ... IS NULL">> whith bind values undefined

%
% Failed to find the dderl_fsm process
%

2019-01-29 08:59:04.809 [error] <0.3520.23>@dderl_session:spawn_process_call:648 [_DDRL_]
     {dderl_session,648} Problem processing command: exit:{noproc,{gen_statem,call,[<0.3419.23>, 
    {"get_columns"},infinity]}}
    [{<<"button">>,[{<<"connection">>,<<"g2gDZAA...gAtpzA=">>},{<<"statement">>,
    <<"g2gCZAAJZ...Ag==">>},{<<"binds">>,null},{<<"btn">>,<<"close">>}]}]
[{gen,do_call,4,[{file,"gen.erl"},{line,177}]},
 {gen_statem,call,3,[{file,"gen_statem.erl"},{line,461}]},
 {oci_adapter,gui_resp_cb_fun,3,[{file,"lib/dderl/src/oci_adapter.erl"},{line,617}]},
 {oci_adapter,process_cmd,6,[{file,"lib/dderl/src/oci_adapter.erl"},{line,511}]},
 {dderl_session,spawn_process_call,8,[{file,"lib/dderl/src/dderl_session.erl"},{line,645}]}]

Transferring this issue to DDErl and assigning it @acautin for further investigation!

acautin commented 5 years ago

@c-bik I am not very convinced with the argument as it seems the driver is crashing on the Error response, it is normal and expected the fsm will be closed/cleaned up after the connection is closed.

It can be proven if the connection is really dead by opening multiple tables and trying to refresh on an unrelated one after producing the error.

In any case will give it a look as soon as I have some time.

stoch commented 5 years ago

@acautin Other tables ARE unresponsive, hence we have an fsm crash.

acautin commented 5 years ago

@stoch the opposite is true, if there is an fsm crash it will affect only one table as we have one fsm per opened query, if the crash is in the connection then none of the tables will be usable.

c-bik commented 5 years ago

@c-bik I am not very convinced with the argument as it seems the driver is crashing on the Error response, it is normal and expected the fsm will be closed/cleaned up after the connection is closed.

It can be proven if the connection is really dead by opening multiple tables and trying to refresh on an unrelated one after producing the error.

In any case will give it a look as soon as I have some time.

@c-bik I don't see any calls (or the failed ones) to oci_port modules in the crash logs! That being the only entry point to the driver, where else in these logs are you seeing a driver crash?

acautin commented 5 years ago

@c-bik I don't have proof just not convinced that your explanation comment is correct, but it could be, will need to debug / reproduce to give a definitive answer otherwise it is just guessing.

c-bik commented 5 years ago

@acautin

your explanation

Didn't explain (or tried to) anyting yet 😉

My comment is only an observation on the available logs!

Perhaps the results of your DDErl debugging will give me a strating point to start looking into erloci again.

acautin commented 5 years ago

@c-bik sure, fixed 👍

c-bik commented 5 years ago

The gen_server:call(<0.30282.22>, {port_call,[8,2991920,2953600]}, infinity) in logs below are happening within oci_port module! But the oci_port gen_server it seems is already dead!

2019-01-29 08:58:59.777 [error] <0.3420.23>@gen_server:call gen_server <0.3420.23>
     terminated with reason: no such process or port in call to gen_server:call(<0.30282.22>,
        {port_call,[8,2991920,2953600]}, infinity) in gen_server:call/3 line 214

2019-01-29 08:58:59.777 [error] <0.3420.23>@gen_server:call CRASH REPORT
     Process <0.3420.23> with 0 neighbours exited with reason: no such process or port in call
     to gen_server:call(<0.30282.22>, {port_call,[8,2991920,2953600]}, infinity)
     in gen_server:call/3 line 214

This looks like (I may be wrong), user is trying to log out from a already dead dderl_session

2019-01-29 08:59:04.740 [error] <0.30005.22>@dderl_session:handle_info:191 [_DDRL_]
     {dderl_session,191} dderl_session received unknown msg {'EXIT',<0.3419.23>,{noproc,
    {gen_server,call,[<0.3420.23>,close,3600000]}}} for <<"system">>

There seems to be a gap in logs where dderl is loosing the oci_port process! Also I failed to understand how these are only lines in console/crash/error logs!

@stoch do you have more logs on this perhaps in other files?

stoch commented 5 years ago

This is a fresh DDerl Session after closing and re-opening Chrome:

2019-01-29 14:36:12.983 [info] <0.12094.29>@oci_adapter:process_cmd:106 [DDRL] {oci_adapter,106} user <<"SBS1_ADMIN">>, TNS <<"(DESCRIPTION=(ENABLE=BROKEN)(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=...)(PORT=1521))(ADDRESS=(PROTOCOL=TCP)(HOST=...)(PORT=1521)))(CONNECT_DATA=(SERVER=DEDICATED)(SERVICE_NAME=...)))">> 2019-01-29 14:36:13.015 [info] <0.12331.29>@oci_adapter:logfun:911 [DDRL] {erloci.cpp:main:83} [860] Port process configs : erlang term max size 0x00040000 bytes, logging enabled, TCP port for logs 56781, NLS_LANG GERMAN_SWITZERLAND.AL32UTF 2019-01-29 14:36:13.062 [info] <0.12331.29>@oci_adapter:logfun:911 [DDRL] {ocisession.cpp:ocisession::config:69} OCI Initialize 2019-01-29 14:36:13.063 [info] <0.12331.29>@oci_adapter:logfun:911 [DDRL] {oci_port::429} Remote log_enabled 2019-01-29 14:36:13.063 [info] <0.12331.29>@oci_adapter:logfun:911 [DDRL] {command.cpp:command::change_log_flag:54} Enabled logging.. 2019-01-29 14:36:13.237 [info] <0.12331.29>@oci_adapter:logfun:911 [DDRL] {ocisession.cpp:ocisession::ocisession:128} got session 000...A0 (DESCRIPTION=(ENABLE=BROKEN)(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=...)(PORT=1521))(ADDRESS=(PROTOCOL=TCP)(HOST=...)(PORT=1521)))(CONNECT_DATA=(SERVER=DEDICATED)(SERVICE_NAME=...))) user SBS1_ADMIN 2019-01-29 14:36:35.627 [error] <0.12879.29>@gen_server:call gen_server <0.12879.29> terminated with reason: no such process or port in call to gen_server:call(<0.12328.29>, {port_call,[8,3647184,3608672]}, infinity) in gen_server:call/3 line 214 2019-01-29 14:36:35.628 [error] <0.12879.29>@gen_server:call CRASH REPORT Process <0.12879.29> with 0 neighbours exited with reason: no such process or port in call to gen_server:call(<0.12328.29>, {port_call,[8,3647184,3608672]}, infinity) in gen_server:call/3 line 214

stoch commented 5 years ago

And the crash log:

2019-01-29 14:36:35 =ERROR REPORT==== Generic server <0.12879.29> terminating Last message in was {'$gen_cast',{fetch_recs_async,false,0}} When Server state == {qry,[{fields,[<<"REVICS_SEPID">>,<<"REVICS_SHORTID">>,<<"REVICS_ROWID_MASTER">>,<<"REVICS_COUNT_EXPECTED">>,<<"REVICS_TABLE_SLAVE">>,<<"REVICS_ROWID_SLAVE">>]},{from,[<<"SBS1_ADMIN.REVI_CONTENT_SUB">>]},{where,{and,{and,{'=',<<"REVICS_SHORTID">>,<<"'788'">>},{'=',<<"REVICS_SEPID">>,<<"'201901'">>}},{is,<<"REVICS_ROWID_SLAVE">>,<<"null">>}}},{'hierarchical query',{}},{'group by',[]},{having,{}},{'order by',[]}],true,{stmtResult,0,{oci_port,statement,<0.12328.29>,3647184,3608672},[{stmtCol,<<"REVICS_SEPID">>,<<"REVICS_SEPID">>,'SQLT_CHR',12,0,false},{stmtCol,<<"REVICS_SHORTID">>,<<"REVICS_SHORTID">>,'SQLT_CHR',12,0,false},{stmtCol,<<"REVICS_ROWID_MASTER">>,<<"REVICS_ROWID_MASTER">>,'SQLT_RDD',8,0,false},{stmtCol,<<"REVICS_COUNT_EXPECTED">>,<<"REVICS_COUNT_EXPECTED">>,'SQLT_NUM',undefined,0,false},{stmtCol,<<"REVICS_TABLE_SLAVE">>,<<"REVICS_TABLE_SLAVE">>,'SQLT_CHR',20,0,false},{stmtCol,<<"REVICS_ROWID_SLAVE">>,<<"REVICS_ROWID_SLAVE">>,'SQLT_RDD',8,0,false}],#Fun,#Fun,[]},{dderl_fsm,<0.12883.29>},200000,undefined,false} Reason for termination == ** {{noproc,{gen_server,call,[<0.12328.29>,{port_call,[8,3647184,3608672]},infinity]}},[{gen_server,call,3,[{file,"gen_server.erl"},{line,214}]},{gen_server,try_terminate,3,[{file,"gen_server.erl"},{line,648}]},{gen_server,terminate,10,[{file,"gen_server.erl"},{line,833}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,247}]}]} 2019-01-29 14:36:35 =CRASH REPORT==== crasher: initial call: dderloci:init/1 pid: <0.12879.29> registered_name: [] exception exit: {{noproc,{gen_server,call,[<0.12328.29>,{port_call,[8,3647184,3608672]},infinity]}},[{gen_server,call,3,[{file,"gen_server.erl"},{line,214}]},{gen_server,try_terminate,3,[{file,"gen_server.erl"},{line,648}]},{gen_server,terminate,10,[{file,"gen_server.erl"},{line,833}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,247}]}]} ancestors: [<0.12882.29>] message_queue_len: 0 messages: [] links: [] dictionary: [] trap_exit: false status: running heap_size: 6772 stack_size: 27 reductions: 12482 neighbours:

c-bik commented 5 years ago

@stoch Thanks 👍

So both the new logs shows same gap for info where oci_port process went down. I guess we have to wait if some more details are found in @acautin experiments which might shed some new lights.