datajoint / datajoint-python

Relational data pipelines for the science lab
https://datajoint.com/docs
GNU Lesser General Public License v2.1
169 stars 84 forks source link

unnecessary SQL subquery #349

Open dimitri-yatsenko opened 7 years ago

dimitri-yatsenko commented 7 years ago

When displaying a table in Jupyter, the underlying query contains an extra query that significantly slows down the performance.

For example, previewing the table

edgy.Mesh.Fragment() & 'boss_vset_id=27754605'

produces the following query

SELECT * FROM (
    SELECT `boss_vset_id`,`fragment`,`bound_x_min`,`bound_x_max`,`bound_y_min`,`bound_y_max`,`bound_z_min`,`bound_z_max`,`n_vertices`,`n_triangles`,'=BLOB=' as `vertices`,'=BLOB=' as `triangles` 
        FROM `microns_em`.`_mesh__fragment`) as `_s4` 
    WHERE (boss_vset_id=27754605) LIMIT 8

The extra SELECT * FROM( ... ) is unnecessary and, unfortunately, MySQL cannot optimize it. With tens of millions of tuples, this query takes minutes instead of milliseconds.

guzman-raphael commented 3 years ago

Possibly addressed in PR #754. @ixcat to confirm.