Closed evansde77 closed 12 years ago
swakef: Doesn't seem fixed... http://dmwm.cern.ch:8080/job/WMCore-py2.6-mysql/305/
hufnagel: What do I have to do to actually see an error message instead of just a 404 page?
swakef: If you go to dmwm.cern.ch:8080 you will get a signup link, create an account with the username hufnagel
lat: Didn't help, still get 404 page.
swakef: Replying to [comment:54 lat]:
> Didn't help, still get 404 page.
Try now, an admin needs to give each account permissions.
mnorman: I can't actually tell what the state is. Someone keeps aborting the attempted builds, so we haven't run anything since putting in the last fix.
swakef: Running the test manually should reproduce it, i.e. python setup.py test --buildBotMode=true.
Currently we hit some db retry logic which makes the tests take 4 hours, so the current one will be running for another 3 hours.
hufnagel: Matt, what exactly did you fix? I looked at the logs (some of them at least) from the last build and couldn't identify what would be wrong in the patches for this ticket.
For example:
test/python/WMComponent_t/AlertGenerator_t/AlertGenerator_t.py

    def setUp(self):
        self.testInit = TestInit(__file__)
        self.testInit.setLogging(logLevel = logging.DEBUG)
        self.testInit.clearDatabase()
        self.testInit.setDatabaseConnection()
This code has two problems:
1) How can a call to clearDatabase before we set up the database connection ever work?!
2) We explicitly said in this thread that by policy unit tests should never delete the database in setUp. The database needs to be empty or the unit test should fail.
So I really see nothing here to change. The unit test should be changed and the clearDatabase should be removed from setUp.
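To make the two points concrete, here is a minimal sketch of a setUp that follows the policy. It is not the WMCore TestInit API: sqlite3 stands in for MySQL, and ExampleTest, list_tables and test_table are hypothetical names used only for illustration. The connection is opened first, and the test fails rather than wiping a non-empty database.

```python
import sqlite3
import unittest


def list_tables(conn):
    """Return the names of all user tables in the connected database."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%'"
    ).fetchall()
    return [name for (name,) in rows]


class ExampleTest(unittest.TestCase):
    # Hypothetical stand-in for a WMCore unit test; sqlite3 replaces MySQL.
    db_path = ":memory:"

    def setUp(self):
        # 1) Connect first: nothing can be inspected without a connection.
        self.conn = sqlite3.connect(self.db_path)
        # 2) Never wipe the database here; fail if leftovers are found.
        leftovers = list_tables(self.conn)
        if leftovers:
            self.conn.close()
            self.fail("database not empty at setUp: %s" % leftovers)
        self.conn.execute("CREATE TABLE test_table (id INTEGER)")

    def tearDown(self):
        # Remove only what this test created, then disconnect.
        self.conn.execute("DROP TABLE IF EXISTS test_table")
        self.conn.close()

    def test_insert(self):
        self.conn.execute("INSERT INTO test_table VALUES (1)")
        rows = self.conn.execute("SELECT COUNT(*) FROM test_table").fetchall()
        self.assertEqual(rows[0][0], 1)
```

With an in-memory database the emptiness check always passes; it only becomes meaningful when db_path points at a database shared between tests, which is the situation discussed in this ticket.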
swakef: Replying to [comment:58 hufnagel]:
> test/python/WMComponent_t/AlertGenerator_t/AlertGenerator_t.py
>
>     def setUp(self):
>         self.testInit = TestInit(__file__)
>         self.testInit.setLogging(logLevel = logging.DEBUG)
>         self.testInit.clearDatabase()
>         self.testInit.setDatabaseConnection()
This code was changed in 90f756f54c2e20be8fccd46fd37d02654761a750, which is what the current build is using.
mnorman: One possible problem here is weirdness in the Transaction_t unittest, which sets up and destroys DB elements with archaic code.
Filed as #2617
hufnagel: Really strange problem. As Matt said, the problem started in the Transaction_t unittest, which left a transaction open, then called clearDatabase and then committed the transaction.
The operations in that transaction were operating on tables specific to the unittest, so the commit failed of course (because we had cleared the db). Somehow the operations stayed in the MySQL buffer though, because we observed error messages related to them in later unittests (which did not use the same tables).
Then a few unittests later we started getting messages about no default database being present anymore, so the assumption is that MySQL finally gave up on the operations it could not run and in addition also unset the default database. Could even be a MySQL bug.
Fixing #2617 should fix the jenkins build again. As a precaution, Matt will also add a check in WmInit.clearDatabase for open transactions and commit them before we wipe the db.
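A rough sketch of what such a precaution could look like, assuming sqlite3 in place of MySQL. The clear_database function below is hypothetical and is not the actual WmInit.clearDatabase; the in_transaction attribute is specific to Python's sqlite3 module, so a MySQL version would need its own way of detecting an open transaction.

```python
import sqlite3


def clear_database(conn):
    """Drop all user tables, committing any open transaction first."""
    # Settle pending work before wiping, so that no half-finished
    # transaction is left referring to tables that no longer exist.
    if conn.in_transaction:
        conn.commit()
    tables = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%'"
    ).fetchall()
    for (name,) in tables:
        conn.execute('DROP TABLE IF EXISTS "%s"' % name)
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER)")
conn.execute("INSERT INTO t VALUES (1)")  # leaves an implicit transaction open
clear_database(conn)
```

Committing first is the choice described in this comment; rolling back instead would be another option if leftover operations from a test should simply be discarded.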
The only remaining question is what to do with the MySQL Destroy plugin. IMO we should revert to the previous version without the "no default database => create one" hack. The current hack would not have prevented the original problem anyway; it would just have recovered things once everything fell apart. I think dropping the default database is a serious enough problem that we do not need to put in recovery code for it; we just need to fix the problem(s) that cause it to happen.
mnorman: Transaction commit filed as #2618
mnorman: I think things might be alive and well for now, and am suggesting we close this ticket unless someone complains.
Both DBS and Tier 0 will need an Oracle-supporting version of the WMQuality/TestInit.py module to set up and tear down database instances for unit tests.
CC'ing Lassi, since he may already have something along these lines for SiteDB.