Closed by tiborsimko 10 years ago
Originally by Raquel Jimenez Encinar raquel.jimenez.encinar@cern.ch on 2011-12-08
In [95e3d250ee142a73ac64c1be66a7222d330206f3]:
dbquery: add option with_dict to run_sql()
* Enriches run_sql() with a new option called
with_dict that returns directly list-of-dictionaries
instead of tuple-of-tuples. (closes #830)
* Adds 4 regression test cases.
* Changes with_desc parameter type from integer to boolean
in run_sql and run_sql_with_limit in order to
keep coherence with the with_dict parameter.
* Updates miscutil-dbquery.webdoc.
Originally on 2011-12-22
MySQLdb has another type of cursor named DictCursor that returns dictionaries, thus removing the additional step of building the dictionary. It might be worth looking into, as the timings below show it running roughly twice as fast:
In [10]: %time res = test_with_dictcursor("SELECT * FROM bibrec_bib03x LIMIT 1000000")
CPU times: user 4.53 s, sys: 0.02 s, total: 4.56 s
Wall time: 5.00 s
In [11]: %time res = invenio.dbquery.run_sql("SELECT * FROM bibrec_bib03x LIMIT 1000000", with_dict=True)
CPU times: user 10.34 s, sys: 0.14 s, total: 10.48 s
Wall time: 10.90 s
See this blog post for an example.
Originally
0) Prelude and motivation.
run_sql() currently returns tuple-of-tuples. The returned values are usually transformed into named symbols in the business logic, as it would not be very readable to work with the position of entities in the resulting tuple representing rows.
One can transform tuple-of-tuples into list-of-dictionaries from the get-go by building a dict() for each row.
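A minimal sketch of that transformation, assuming a hypothetical collection table with id and name columns. The rows variable stands in for what a run_sql() call might return, so the example runs without a database:

```python
# Stand-in for the tuple-of-tuples a call such as
# run_sql("SELECT id, name FROM collection") might return (assumed data).
rows = ((1, "Articles"), (2, "Books"))

# Column names must be repeated by hand, in the same order as in the query.
keys = ("id", "name")

# The dict() boilerplate: pair each key with the value at the same position.
collections = [dict(zip(keys, row)) for row in rows]

for collection in collections:
    # Values are now reachable by name rather than by position.
    print(collection["id"], collection["name"])
```

The key tuple has to be kept in sync with the SELECT list manually, which is the boilerplate this ticket wants to eliminate.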
This format of SQL results is especially nice for the forthcoming migration to Jinja templates, because one will then be able to use symbols such as collection.name and friends in templates easily.
It is useful to generalise this technique of using SQL results throughout Invenio, but the dict() boilerplate code should then be eliminated.
1) The goal of this ticket is therefore to enrich run_sql() with a new option called, say, with_dict=True that would directly return list-of-dictionaries instead of tuple-of-tuples. This will enable programmers to write one elegant call instead of the currently convoluted conversion by hand.
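One way such an option could work is to read the column names from the DB-API cursor.description attribute. The sketch below is not Invenio's actual implementation: run_sql_sketch is a hypothetical name, sqlite3 is swapped in for MySQLdb so the example is self-contained, and the table contents are made up.

```python
import sqlite3


def run_sql_sketch(connection, sql, param=None, with_dict=False):
    """Run a query and return tuple-of-tuples, or list-of-dictionaries
    when with_dict is true (hypothetical stand-in for run_sql())."""
    cursor = connection.cursor()
    cursor.execute(sql, param or ())
    rows = cursor.fetchall()
    if with_dict:
        # DB-API 2.0: the first item of each description entry is the column name.
        keys = [column[0] for column in cursor.description]
        return [dict(zip(keys, row)) for row in rows]
    return tuple(tuple(row) for row in rows)


# Assumed sample data in an in-memory database.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE collection (id INTEGER, name TEXT)")
connection.executemany(
    "INSERT INTO collection VALUES (?, ?)", [(1, "Articles"), (2, "Books")]
)

query = "SELECT id, name FROM collection ORDER BY id"
plain = run_sql_sketch(connection, query)                  # default, as before
named = run_sql_sketch(connection, query, with_dict=True)  # proposed option
```

With with_dict=True the caller can write named[0]["name"] directly, without listing the column names a second time.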
The with_dict option would be set to False by default, for backwards compatibility, and for use cases where the speed difference may be important. Otherwise, once introduced, the majority of run_sql() callers should probably switch to using it, for better code readability.
Note that the new option with_dict would behave somewhat similarly to how the current with_desc option behaves, but it would return more directly exploitable results.
2) Beware of SQL queries like:
or:
or (even though the following technique is bad style):
when constructing names of keys of the resulting dictionaries representing rows.
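As an illustration of the kind of key-naming pitfall that can arise, here is a small sketch. The column names are assumptions about what a driver might report for a join that selects two id columns plus an aggregate; they are not the queries from this ticket. The helper names are hypothetical too.

```python
from collections import Counter


def build_row_dict(keys, row):
    """Naive conversion: pair reported column names with row values."""
    return dict(zip(keys, row))


def duplicated_keys(keys):
    """Return the column names that occur more than once, sorted."""
    counts = Counter(keys)
    return sorted(key for key, count in counts.items() if count > 1)


# Assumed column names for a query selecting a.id, b.id and COUNT(*).
keys = ("id", "id", "COUNT(*)")
row = (1, 2, 10)

row_dict = build_row_dict(keys, row)  # the first "id" value is silently lost
clashes = duplicated_keys(keys)       # lets the caller detect the clash
```

A dictionary keeps only the last value for a repeated key, so a check like duplicated_keys() (or explicit AS aliases in the query) may be needed before trusting the result.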
3) Extensive regression test cases covering the above examples should naturally be added.
P.S. We can also take inspiration from how the tornado.database DB wrapper behaves in this respect, see [[wiki:Talk/WebFrameworks#a4.6.Tornado]].