dmwm / PHEDEX

CMS data-placement suite
8 stars 18 forks source link

Subscriptions API not protected for divide-by-zero #900

Open ericvaandering opened 10 years ago

ericvaandering commented 10 years ago

Original Savannah ticket 95976 reported by huangch on Wed Jul 11 04:16:10 2012.

The subscriptions API can occasionally return an error, with the logfiles indicating a division by zero:

DBD::Oracle::st fetchrow_hashref failed: ORA-01476: divisor is equal to zero (DBD ERROR: OCIStmtFetch) [for Statement "select sp.request, ds."level", ds.item_id, ds.item_name, ds.is_open open, sp.time_create time_update, ds.dataset_id, ds.dataset_name, n.id node_id, n.name node, n.se_name se, ds.is_move move, sp.priority, sp.is_custodial custodial, g.name "group", case when ds.time_suspend_until > :now then 'y' else 'n' end suspended, ds.time_suspend_until suspend_until, ds.time_create, ds.files, ds.bytes, ds.ds_files, ds.ds_bytes, ds.node_files, ds.node_bytes, ds.percent_bytes, ds.percent_files, rx.time_start from t_dps_subs_param sp join ( select 'DATASET' "level", sd.param, sd.dataset item_id, d.name item_name, d.name dataset_name, d.id dataset_id, sd.time_create, sd.time_suspend_until, sd.time_complete, sd.time_done, sd.is_move, sd.destination, d.time_update, d.is_open, null bytes, null files, ds_stat.files ds_files, ds_stat.bytes ds_bytes, reps.node_bytes, reps.node_files, (reps.node_bytes * 100 / ds_stat.bytes) percent_bytes, (reps.node_files * 100 / ds_stat.files) percent_files from t_dps_subs_dataset sd join t_dps_dataset d on d.id = sd.dataset left join (select br.node, b.dataset, sum(br.node_bytes) node_bytes, sum(br.node_files) node_files from t_dps_block_replica br join t_dps_block b on br.block = b.id group by br.node, b.dataset ) reps on reps.node = sd.destination and reps.dataset = d.id join (select d.id id, sum(b.files) files, sum(b.bytes) bytes from t_dps_dataset d join t_dps_block b on b.dataset = d.id group by d.id ) ds_stat on ds_stat.id = d.id union select case when (select distinct param from t_dps_subs_dataset sd where sd.param = sb.param) is null then 'BLOCK' else 'DATASET' end "level", sb.param, sb.block item_id, b.name item_name, d.name dataset_name, d.id dataset_id, sb.time_create, sb.time_suspend_until, sb.time_complete, sb.time_done, sb.is_move, sb.destination, b.time_update, b.is_open, b.bytes, b.files, ds_stat.files ds_files, ds_stat.bytes ds_bytes, br.node_bytes, br.node_files, (br.node_bytes * 100 / b.bytes) percent_bytes, (br.node_files * 100 / b.files) percent_files from t_dps_subs_block sb join t_dps_block b on b.id = sb.block join t_dps_dataset d on d.id = b.dataset left join t_dps_block_replica br on br.node = sb.destination and br.block = b.id join (select d.id id, sum(b.files) files, sum(b.bytes) bytes from t_dps_dataset d join t_dps_block b on b.dataset = d.id group by d.id ) ds_stat on ds_stat.id = d.id where sb.param not in ( select param from t_dps_subs_dataset) ) ds on ds.param = sp.id join t_adm_node n on ds.destination = n.id left join t_req_xfer rx on rx.request = sp.request left join t_adm_group g on g.id = sp.user_group where ((n.name = :n_name1) and ds.time_create >= :create_since and ds.percent_files <= :percent_max) order by ds.time_create desc, ds.dataset_name desc, ds.item_name desc, n.name" with ParamValues: :create_since='1339403843', :n_name1='T1_DE_KIT_MSS', :now=1341995892, :percent_max='99.99999'].

Presumably this can happen if there is a block which has no files in it yet (or there is a block with files which are all of zero bytes). Could you please protect the calculations of percent_* fields against such division by zero?

ericvaandering commented 10 years ago

Comment by huangch on Thu Jul 19 16:53:57 2012

This has been fixed by using decode():

for example:

+verbatim+ .... decode(b.bytes, 0, 0, (br.node_bytes * 100 / b.bytes)) percent_bytes, decode(b.files, 0, 0, (br.node_files * 100 / b.files)) percent_files ....

-verbatim-

The fix has been committed to CVS.