dmwm / PHEDEX

CMS data-placement suite

InspectPhedexLog fails when there is an Oracle error in the log #979

Open merictaze opened 9 years ago

merictaze commented 9 years ago

Hi,

The InspectPhedexLog script fails and exits if there is any Oracle error in the log. More specifically, it exits at line 578 [*]. I debugged it but couldn't work out the reason, as I've never used Perl before. What seems strange to me is that the debugger prints the correct value for the expression, yet the script cannot evaluate it at line 578:

perl -d /data/ProdNodes/PHEDEX/Utilities/InspectPhedexLog -v -e log_only_oracle.out
  DB<1> b 578
  DB<2> c
578:      return undef if $#{@{$data}} < 0;
  DB<2> p $#{@{$data}}
-1
  DB<3> p $#{@{$data}} < 0
1
  DB<4> n
main::simpleHisto(/data/ProdNodes/PHEDEX/Utilities/InspectPhedexLog:578):
578:      return undef if $#{@{$data}} < 0;
  DB<4> n
main::simpleHisto(/data/ProdNodes/PHEDEX/Utilities/InspectPhedexLog:578):
578:      return undef if $#{@{$data}} < 0;
  DB<4> 
Can't use string ("1") as an ARRAY ref while "strict refs" in use at /data/ProdNodes/PHEDEX/Utilities/InspectPhedexLog line 578.
 at /data/ProdNodes/PHEDEX/Utilities/InspectPhedexLog line 578
    main::simpleHisto('ARRAY(0xf3d7b8)', 10) called at /data/ProdNodes/PHEDEX/Utilities/InspectPhedexLog line 467
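
A minimal sketch of what may be going on at line 578, assuming `$data` is always an array reference (the trace shows `ARRAY(0xf3d7b8)` being passed to simpleHisto). `$#{...}` expects an array reference inside the braces, whereas `@{$data}` is the array itself; evaluated in scalar context an array yields its element count, which could explain a string such as "1" ending up where a reference is expected. The helper names `last_index` and `has_entries` below are hypothetical and are not part of InspectPhedexLog; they only illustrate the two conventional spellings of the emptiness check.

```perl
use strict;
use warnings;

# Hypothetical helper: last index of the array behind $data, -1 if it is empty.
# The reference is dereferenced directly, with no inner @{...}.
sub last_index {
    my ($data) = @_;
    return $#{$data};
}

# Hypothetical helper: 1 if $data is an array reference with at least one
# element, 0 otherwise. An array in boolean context is true when non-empty.
sub has_entries {
    my ($data) = @_;
    return 0 unless ref($data) eq 'ARRAY';
    return @{$data} ? 1 : 0;
}

print last_index([]), "\n";              # -1
print last_index([10, 20, 30]), "\n";    # 2
print has_entries([]), "\n";             # 0
print has_entries([10]), "\n";           # 1
```

Under that assumption, the guard at line 578 could be written as `return undef if $#{$data} < 0;` or `return undef unless @{$data};`. Either form avoids nesting an array dereference inside `$#{...}`.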

The log file that I used:

2014-11-26 10:02:00: FileDownload[5669]: alert: DBD::Oracle::st execute failed: ORA-01031: insufficient privileges (DBD ERROR: error possibly near <*> indicator at char 11 in 'merge into <*>t_agent_status ast using (select :node node, :agent agent, :label label, :wid worker_id, :fqdn host_name, :dir directory_path, :pid process_id, 1 state, :npending queue_pending, :nreceived queue_received, :nwork queue_work, :ncompleted queue_completed, :nbad queue_bad, :noutgoing queue_outgoing, :now time_update from dual) i on (ast.node = i.node and ast.agent = i.agent and ast.label = i.label and ast.worker_id = i.worker_id) when matched then update set ast.host_name       = i.host_name, ast.directory_path  = i.directory_path, ast.process_id      = i.process_id, ast.state           = i.state, ast.queue_pending   = i.queue_pending, ast.queue_received  = i.queue_received, ast.queue_work      = i.queue_work, ast.queue_completed = i.queue_completed, ast.queue_bad       = i.queue_bad, ast.queue_outgoing  = i.queue_outgoing, ast.time_update     = i.time_update when not matched then insert (node, agent, label, worker_id, host_name, directory_path, process_id, state, queue_pending, queue_received, queue_work, queue_completed, queue_bad, queue_outgoing, time_update) values (i.node, i.agent, i.label, i.worker_id, i.host_name, i.directory_path, i.process_id, i.state, i.queue_pending, i.queue_received, i.queue_work, i.queue_completed, i.queue_bad, i.queue_outgoing, i.time_update)') [for Statement "merge into t_agent_status ast using (select :node node, :agent agent, :label label, :wid worker_id, :fqdn host_name, :dir directory_path, :pid process_id, 1 state, :npending queue_pending, :nreceived queue_received, :nwork queue_work, :ncompleted queue_completed, :nbad queue_bad, :noutgoing queue_outgoing, :now time_update from dual) i on (ast.node = i.node and ast.agent = i.agent and ast.label = i.label and ast.worker_id = i.worker_id) when matched then update set ast.host_name       = 
i.host_name, ast.directory_path  = i.directory_path, ast.process_id      = i.process_id, ast.state           = i.state, ast.queue_pending   = i.queue_pending, ast.queue_received  = i.queue_received, ast.queue_work      = i.queue_work, ast.queue_completed = i.queue_completed, ast.queue_bad       = i.queue_bad, ast.queue_outgoing  = i.queue_outgoing, ast.time_update     = i.time_update when not matched then insert (node, agent, label, worker_id, host_name, directory_path, process_id, state, queue_pending, queue_received, queue_work, queue_completed, queue_bad, queue_outgoing, time_update) values (i.node, i.agent, i.label, i.worker_id, i.host_name, i.directory_path, i.process_id, i.state, i.queue_pending, i.queue_received, i.queue_work, i.queue_completed, i.queue_bad, i.queue_outgoing, i.time_update)" with ParamValues: :agent='262', :dir='/data/ProdNodes/Prod_T0_CH_CERN_Export/state', :fqdn='vocms0214.cern.ch', :label='download-t2', :nbad=0, :ncompleted=0, :node='1', :noutgoing=0, :now=1416996120.00759, :npending=0, :nreceived=0, :nwork=1, :pid=5669, :wid='M'] at /data/ProdNodes/PHEDEX/perl_lib/PHEDEX/Core/DB.pm line 322.

As far as I know, the script ran on the old PhEDEx voboxes without any problem (or there were simply no Oracle errors in the logs), so this might be related to the Perl version, if newer versions interpret some of these symbols differently. I'm not sure, though (I no longer have the Perl version of the old machines).
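
To check the Perl-version guess on a given machine, a small probe along these lines might help. It is only a sketch: the `probe` helper is hypothetical, and the one-element test array is an assumption chosen so that the last index should be 0. Each spelling is run inside a string `eval`, so a failure is reported instead of aborting the script.

```perl
use strict;
use warnings;

# Hypothetical probe: evaluate one spelling of "last index of @$data"
# against a one-element array and report what this perl does with it.
sub probe {
    my ($expr) = @_;
    my $data  = ['x'];                       # visible to the string eval below
    my $value = eval "no warnings; $expr";   # traps compile and run-time errors
    return defined $value ? "ok, value $value" : "failed: $@";
}

printf "perl version: %s\n", $];
printf "%-14s %s\n", 'line 578 form:', probe('$#{@{$data}}');
printf "%-14s %s\n", 'plain form:',    probe('$#{$data}');
```

Running this on both an old and a new vobox (if an old one is still reachable) would show whether the spelling used at line 578 behaves differently between the two Perl installations.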

[*] https://github.com/dmwm/PHEDEX/blob/master/Utilities/InspectPhedexLog#L578