dmwm / WMCore

Core workflow management components for CMS.
Apache License 2.0
45 stars 106 forks source link

Oracle WMBS schema missing fields #3203

Closed samircury closed 11 years ago

samircury commented 12 years ago

Hi,

It seems the fix is easy, so I'm filing this ticket according to Steve's suggestion.

Not sure if it should go to T0 or WMCore, I'll let you choose.

Complete error log in :

https://cmslogbook.cern.ch/elog/Tier-0+processing/7139

{{{ 012-02-02 17:23:52,389:INFO:JobSubmitter:JobSubmitter.preInitialization 2012-02-02 17:23:52,389:INFO:JobSubmitter:Setting poll interval to 120 seconds 2012-02-02 17:23:52,743:CRITICAL:Harness:PostMortem: choked when initializing with error: (DatabaseError) ORA-00904: "WMBS_LOCATION"."DRAIN": invalid identifier "SELECT wmbs_location.site_name AS site_name,\n wmbs_location.se_name AS se_name,\n
wmbs_location.job_slots,\n wmbs_location.cms_name AS cms_name,\n rc_threshold.max_slots,\n
wmbs_sub_types.name AS task_type,\n job_count.total AS task_running_jobs,\n
rc_threshold.priority,\n wmbs_location.drain\n FROM wmbs_location\n INNER JOIN rc_threshold ON\n wmbs_location.id = rc_threshold.site_id\n INNER JOIN wmbs_sub_types ON\n
rc_threshold.sub_type_id = wmbs_sub_types.id\n LEFT OUTER JOIN\n (SELECT wmbs_job.location AS location, wmbs_subscription.subtype AS subtype,\n COUNT(wmbs_job.id) AS total FROM wmbs_job\n INNER JOIN wmbs_jobgroup ON\n wmbs_job.jobgroup = wmbs_jobgroup.id\n INNER JOIN wmbs_subscription ON\n
wmbs_jobgroup.subscription = wmbs_subscription.id\n INNER JOIN wmbs_job_state ON\n
wmbs_job.state = wmbs_job_state.id\n WHERE wmbs_job_state.name = 'executing'\n GROUP BY wmbs_job.location, wmbs_subscription.subtype) job_count ON\n wmbs_location.id = job_count.location AND\n wmbs_sub_types.id = job_count.subtype\n ORDER BY rc_threshold.priority DESC" {} File "/data/srv/wmagent/0.8.23/sw/slc5_amd64_gcc461/cms/wmcore/0.8.23/lib/python2.6/site-packages/WMCore/Agent/Harness.py", line 422, in startComponent self.prepareToStart() File "/data/srv/wmagent/0.8.23/sw/slc5_amd64_gcc461/cms/wmcore/0.8.23/lib/python2.6/site-packages/WMCore/Agent/Harness.py", line 322, in prepareToStart

}}}

DMWMBot commented 12 years ago

mnorman: Add the drain flag to Oracle and set it to be a VARCHAR(1) since Oracle doesn't have Booleans.

Unfortunately, the code still doesn't work. The main Oracle problem is selection from wmbs_users. This depends on the group and the role, which default to "". In MySQL this is fine because an empty string is an empty string. Oracle turns an empty string into a NULL, and you can't select on a NULL without using IS NULL. I think we're going to have to turn this over to the CRAB guys and ask how they want to proceed with this - and where this code is used. Until we can do that the Oracle code is sort of kaput.

hufnagel commented 12 years ago

hufnagel: Oracle turns an empty string into NULL because that feature was done before there was a SQL standard and for backwards compatibility reasons they never changed it.

We had that problem before. Group and role should never default to '' and I fixed this in some places to get the WMSpec expansion into WMBS working. In fact, no strings anywhere that we store in the database should default to ''.

Btw, if you implement a boolean in Oracle, use a char(1) with a check constraint so that at insert or update time it checks that it's set to either 'Y' or 'N'. Like this

flag char(1) NOT NULL check (flag in ('Y','N'))

sfoulkes commented 12 years ago

sfoulkes: Well, we need to either change the schema or add some defaults that don't confuse things. Mattia and Eric, what do you think is the way to go? We need to resolve this as getting Oracle working is becoming a priority.

hufnagel commented 12 years ago

hufnagel: I don't think any schema changes will address this. For Oracle an empty string is NULL, period. You can either avoid writing empty strings or adjust your application/queries to deal with NULL values.

sfoulkes commented 12 years ago

sfoulkes: We can drop the requirement that the owner field in wmbs_workflow can't be NULL. If we have empty strings for users we just set the owner to be NULL in wmbs_workflow and adjust the SQL queries accordingly.

cinquo commented 12 years ago

mcinquil: Replying to [comment:6 sfoulkes]:

We can drop the requirement that the owner field in wmbs_workflow can't be NULL. If we have empty strings for users we just set the owner to be NULL in wmbs_workflow and adjust the SQL queries accordingly.

In the analysis case it should never happen that the user is NULL or empty. Not sure how's handled in centralized activities. From my side I believe we'll always specify one independently from the schema (that would be something enforced on the schema which is a good protection but extra). So, up to me you can also put 'unknown' as default.

I agree with Dirk on the char(1) for the boolean.

sfoulkes commented 12 years ago

sfoulkes: I was told that having place holders like "unknown" or "default" was a bad idea. Maybe we just need to make sure there is always a user everywhere? For central workflows (T0, T1, T2) we can have a dummy dataops user. We'd change the wmbs_users table so that most fields can't be null.

cinquo commented 12 years ago

mcinquil: Replying to [comment:8 sfoulkes]:

I was told that having place holders like "unknown" or "default" was a bad idea. Maybe we just need to make sure there is always a user everywhere? For central workflows (T0, T1, T2) we can have a dummy dataops user. We'd change the wmbs_users table so that most fields can't be null.

That's fine to me. And on analysis side if we do not have a user it is fine that fails.

sfoulkes commented 12 years ago

sfoulkes: (In 8c5edb6716b95777732726741853cb2fd4cb3c93) Fix the drain flag in Oracle. Fixes #3203.

From: Matt Norman mnorman@fnal.gov Signed-off-by: Steve Foulkes sfoulkes@fnal.gov

sfoulkes commented 12 years ago

sfoulkes: Attached patch changes the owner and group defaults from an empty string to "unknown". Passes all the unit tests. Matt, review?

sfoulkes commented 12 years ago

sfoulkes: (In 21294feb29e623d58ce3cfdbf901a65134bd431b) Change the default owner and group from an empty string to "unknown". Fixes #3203.

From: Steve Foulkes sfoulkes@fnal.gov Signed-off-by: Matt Norman mnorman@fnal.gov

sfoulkes commented 12 years ago

sfoulkes: Mattia, can you double check that my patch won't cause crab problems? We can fix any problems later. I want to get something going so that we don't hold up T0 development.

cinquo commented 12 years ago

mcinquil: It works to me. I am closing it as fixed.