Repeated hibernate queries

markrynbeek commented 13 years ago

When study details are retrieved (bioinvindex/study.seam?studyId=BII-S-7) the amount of hibernate queries seen in the server log is enormous. I count roughly 1800 select statements for a single study (my test database has only one study) almost all duplicate select statements.

A snapshot of the log to illustrate the duplication is given below.

1865 17:18:58,779 INFO [STDOUT] Hibernate: select references0_.id as id1370, references0_.submission_ts as submission2_1370, references0_.ACC as ACC1370, references0_.description as descript4_1370, references0_.name as name137 0, references0_.url as url1370, references0_.version as version1370 from REFERENCESOURCE references0 where references0.id=? 1866 17:18:58,780 INFO [STDOUT] Hibernate: select references0.id as id1370, references0_.submission_ts as submission2_1370, references0_.ACC as ACC1370, references0_.description as descript4_1370, references0_.name as name137 0, references0_.url as url1370, references0_.version as version1370 from REFERENCESOURCE references0 where references0.id=? 1867 17:18:58,781 INFO [STDOUT] Hibernate: select references0.id as id1370, references0_.submission_ts as submission2_1370, references0_.ACC as ACC1370, references0_.description as descript4_1370, references0_.name as name137 0, references0_.url as url1370, references0_.version as version1370 from REFERENCESOURCE references0 where references0.id=? 1868 17:18:58,782 INFO [STDOUT] Hibernate: select references0.id as id1370, references0_.submission_ts as submission2_1370, references0_.ACC as ACC1370, references0_.description as descript4_1370, references0_.name as name137 0, references0_.url as url1370, references0_.version as version1370 from REFERENCESOURCE references0 where references0.id=? 1869 17:18:58,784 INFO [STDOUT] Hibernate: select references0.id as id1370, references0_.submission_ts as submission2_1370, references0_.ACC as ACC1370, references0_.description as descript4_1370, references0_.name as name137 0, references0_.url as url1370, references0_.version as version1370 from REFERENCESOURCE references0 where references0.id=? 1870 17:18:58,785 INFO [STDOUT] Hibernate: select references0.id as id1370, references0_.submission_ts as submission2_1370, references0_.ACC as ACC1370, references0_.description as descript4_1370, references0_.name as name137 0, references0_.url as url1370, references0_.version as version1370 from REFERENCESOURCE references0 where references0.id=? 1871 17:18:58,786 INFO [STDOUT] Hibernate: select references0.id as id1370, references0_.submission_ts as submission2_1370, references0_.ACC as ACC1370, references0_.description as descript4_1370, references0_.name as name137 0, references0_.url as url1370, references0_.version as version1370 from REFERENCESOURCE references0 where references0.id=? 1872 17:18:58,787 INFO [STDOUT] Hibernate: select references0.id as id1370, references0_.submission_ts as submission2_1370, references0_.ACC as ACC1370, references0_.description as descript4_1370, references0_.name as name137 0, references0_.url as url1370, references0_.version as version1370 from REFERENCESOURCE references0 where references0.id=? 1873 17:18:58,788 INFO [STDOUT] Hibernate: select references0.id as id1370, references0_.submission_ts as submission2_1370, references0_.ACC as ACC1370, references0_.description as descript4_1370, references0_.name as name137 0, references0_.url as url1370, references0_.version as version1370 from REFERENCESOURCE references0 where references0_.id=?

eamonnmag commented 13 years ago

Will check this out Mark. Not sure if it's hibernate or if it's the query which is going wrong here.

eamonnmag commented 13 years ago

This should now be fixed.

markrynbeek commented 13 years ago

hi Eamonn

we still encounter the bug. Perhaps we are using old code... Where did you put the fix? Which git repository / commit identifier? We now see this effect when Lucene indexing the database. The retrieval of entries is slow, and gets slower over time. It takes minutes to index a handful of studies.

thanks Mark

eamonnmag commented 13 years ago

Ah ok.

So when I said I solved the problem I meant for the BioInvIndex since I remved the need to query the database.

So to understand the nature of these queries you should realise that the information in the database is stored as a graph. Unfortunately, in order to get all the characteristics and factors, we have to do a query on each material to find out what attributes are attached to it. Hibernate is also adding to the problem by retrieving more than what's necessary.

e.g. material 1 - protocol 1 -

attach attribute organism: homo sapiens attach attribute organism part: liver

I haven't looked too much at the query itself, but I think it can be made much faster by collapsing the multiple queries into one. It was Nataliya and Marco who wrote these parts of the code so I really need to look at it in more detail and inspect whether or not I can optimise the query.

I can point you in the direction of the offending query if you like...you may be able to optimise it quicker than me since i don't deal too much with HQL, only MySQL/Oracle flavours of SQL.

Thanks,

Eamonn

On 13 July 2011 13:44, markrynbeek < reply@reply.github.com>wrote:

hi Eamonn

we still encounter the bug. Perhaps we are using old code... Where did you put the fix? Which git repository / commit identifier? We now see this effect when Lucene indexing the database. The retrieval of entries is slow, and gets slower over time. It takes minutes to index a handful of studies.

thanks Mark

Reply to this email directly or view it on GitHub: https://github.com/ISA-tools/BioInvIndex/issues/28#issuecomment-1562317

Eamonn Maguire
Lead Software Engineer - ISA tools http://www.isa-tools.org

University of Oxford Tel: +44(0)1865 610788 Oxford e-Research Center
7 Keble Road, Oxford skype: maguire1986

OX1 3QG, UK

khaug commented 13 years ago

Looks like it's uk.ac.ebi.bioinvindex.dao.ejb3.AccessibleEJB3DAO.getByAcc():

Setting FlushMode to manual fixed this short term, but we only have a small dataset. This will probably not work with a larger data set(?)

public T getByAcc(String acc) { Class clazz = getPersistentClass(); //return (T) getSession().createCriteria(clazz).add(Restrictions.eq("acc", acc)).uniqueResult(); return (T) getSession().createCriteria(clazz).setFlushMode(org.hibernate.FlushMode.MANUAL).add(Restrictions.eq("acc", acc)).uniqueResult(); }

ISA-tools / BioInvIndex

Repeated hibernate queries #28

OX1 3QG, UK