dkpro / dkpro-uby

Framework for creating and accessing UBY resources – sense-linked lexical resources in standard UBY-LMF format
https://dkpro.github.io/dkpro-uby

Disable non-lazy loading to allow for streaming #108

Closed: judithek closed this issue 9 years ago

judithek commented 9 years ago
Some applications need to iterate over all rows of a class (e.g., all LexicalEntries).
One example is our own DBToXMLTransformer in the persistence module. Loading
everything at once fills up memory quickly, which is why DBToXMLTransformer uses
a buffered CriteriaIterator that internally relies on MySQL's "LIMIT" and
"OFFSET" clauses.
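
For illustration, the buffering idea can be sketched roughly as follows. This is not the actual CriteriaIterator: `PagedIterator`, `PagedIteratorDemo`, and `inMemoryLoader` are hypothetical names, and an in-memory list stands in for a query that would carry LIMIT and OFFSET.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.BiFunction;

/**
 * Hypothetical stand-in for a buffered iterator (not the UBY CriteriaIterator).
 * It holds one page in memory and asks the page loader for the next one when
 * the buffer is used up. In a real setup the loader would run a query with
 * LIMIT = bufferSize and OFFSET = offset.
 */
class PagedIterator<T> implements Iterator<T> {
    private final BiFunction<Integer, Integer, List<T>> pageLoader; // (offset, limit) -> rows
    private final int bufferSize;
    private List<T> buffer = Collections.emptyList();
    private int positionInBuffer = 0;
    private int nextOffset = 0;
    private boolean exhausted = false;

    PagedIterator(BiFunction<Integer, Integer, List<T>> pageLoader, int bufferSize) {
        if (bufferSize <= 0) {
            throw new IllegalArgumentException("bufferSize must be positive");
        }
        this.pageLoader = pageLoader;
        this.bufferSize = bufferSize;
    }

    @Override
    public boolean hasNext() {
        if (positionInBuffer < buffer.size()) {
            return true;
        }
        if (exhausted) {
            return false;
        }
        // Buffer consumed: fetch the next page, starting at the current offset.
        buffer = pageLoader.apply(nextOffset, bufferSize);
        positionInBuffer = 0;
        nextOffset += buffer.size();
        // A short (or empty) page means there are no further rows.
        if (buffer.size() < bufferSize) {
            exhausted = true;
        }
        return !buffer.isEmpty();
    }

    @Override
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return buffer.get(positionInBuffer++);
    }
}

class PagedIteratorDemo {
    /** In-memory substitute for "SELECT ... LIMIT limit OFFSET offset". */
    static <T> BiFunction<Integer, Integer, List<T>> inMemoryLoader(List<T> table) {
        return (offset, limit) -> {
            int from = Math.min(offset, table.size());
            int to = Math.min(offset + limit, table.size());
            return new ArrayList<>(table.subList(from, to));
        };
    }

    public static void main(String[] args) {
        List<String> table = Arrays.asList("a", "b", "c", "d", "e", "f", "g");
        Iterator<String> it = new PagedIterator<>(inMemoryLoader(table), 3);
        while (it.hasNext()) {
            System.out.println(it.next());
        }
    }
}
```

Memory use stays bounded by the buffer size, but every page after the first needs a query with a larger offset.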

Unfortunately, MySQL becomes highly inefficient when large offsets are used. This
is explained here: http://www.numerati.com/2012/06/26/reading-large-result-sets-with-hibernate-and-mysql/
It also shows in simple experiments: I measured more than 10 seconds for a single
offset query, which lengthens the export of large lexicons such as Wiktionary to
more than half a day.
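
A rough back-of-the-envelope model of why this gets slow, assuming the server has to walk past every skipped row before it returns a page. The table size and buffer size below are made-up illustration values, not measurements from UBY, and `OffsetPagingCost` is a hypothetical name.

```java
/**
 * Simplified cost model (an assumption, not a benchmark): a query with
 * OFFSET o and LIMIT b is taken to read o + b rows, of which o are discarded.
 */
class OffsetPagingCost {
    /** Total rows read when a table is exported page by page with LIMIT/OFFSET. */
    static long rowsReadWithOffsetPaging(long totalRows, long pageSize) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        long rowsRead = 0;
        for (long offset = 0; offset < totalRows; offset += pageSize) {
            long returned = Math.min(pageSize, totalRows - offset);
            rowsRead += offset + returned; // skipped rows plus returned rows
        }
        return rowsRead;
    }

    /** A single streamed pass reads every row exactly once. */
    static long rowsReadWithStreaming(long totalRows) {
        return totalRows;
    }

    public static void main(String[] args) {
        long totalRows = 1_000_000L; // hypothetical table size
        long pageSize = 1_000L;      // hypothetical buffer size
        System.out.println(rowsReadWithOffsetPaging(totalRows, pageSize)); // 500500000
        System.out.println(rowsReadWithStreaming(totalRows));              // 1000000
    }
}
```

Under this model the work grows quadratically with the table size, which is consistent with late pages being much slower than early ones.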

One solution proposed in the article linked above is to stream the result set.
The drawback is that, while streaming, no other queries may be issued to the database
from the same Hibernate session. In our current configuration, this raises an
exception immediately, because we have disabled lazy loading for:
* LexicalEntry.lemma
* LexicalEntry.listOfComponents
* SubcategorizationFrame.lexemeProperty

Do we really need non-lazy loading of these three attributes or could we reset them
to the default mode (i.e., lazy loading)?
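
For reference, a minimal sketch of what streaming looks like one level below Hibernate, at the JDBC level. It assumes MySQL Connector/J as the driver, which streams rows one at a time for a forward-only, read-only statement whose fetch size is Integer.MIN_VALUE. `StreamingQuery` and `streamFirstColumn` are hypothetical names, not part of UBY.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.function.Consumer;

class StreamingQuery {
    /**
     * Streams the first column of every row to rowHandler and returns the row count.
     * Assumption: the connection comes from MySQL Connector/J; other drivers
     * interpret the fetch size differently.
     */
    static int streamFirstColumn(Connection connection, String sql,
                                 Consumer<String> rowHandler) throws SQLException {
        try (Statement statement = connection.createStatement(
                ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY)) {
            // Connector/J convention: MIN_VALUE asks for row-by-row streaming
            // instead of buffering the whole result set in memory.
            statement.setFetchSize(Integer.MIN_VALUE);
            int rows = 0;
            try (ResultSet resultSet = statement.executeQuery(sql)) {
                while (resultSet.next()) {
                    // No other statement may run on this connection until the
                    // result set has been fully read or closed.
                    rowHandler.accept(resultSet.getString(1));
                    rows++;
                }
            }
            return rows;
        }
    }
}
```

The comment inside the loop is the restriction described above: an eagerly loaded attribute makes Hibernate issue a further query for each row while the stream is still open, so the two settings conflict.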

Original issue reported on code.google.com by chmeyer.de on 2014-10-09 08:28:34

judithek commented 9 years ago
Remove non-lazy loading.

Committed the change (revision 633).

Original issue reported on code.google.com by chmeyer.de on 2014-10-09 13:21:17