peeyush-tm / django-cube

Automatically exported from code.google.com/p/django-cube
GNU General Public License v3.0

Optimizing measures query #27

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
The database is hit each time a measure is queried. This must be optimized, 
either by improving the database queries or by caching the data.

Original issue reported on code.google.com by seb...@gmail.com on 29 Sep 2010 at 1:50

GoogleCodeExporter commented 9 years ago

Original comment by seb...@gmail.com on 22 Oct 2010 at 11:59

GoogleCodeExporter commented 9 years ago
I've used Django Cube to do some statistics on a web site log table. I find it 
extremely convenient and flexible, but I fear that the sheer number of database 
queries it issues will rule it out very soon. Looking at the SQL statements, 
they seem quite inefficient. Do you have any plans for revising them? GROUP BY 
statements could be used to reduce the number of queries, and that alone would 
be a major speed improvement. If I manage to get some spare time, I'll try my 
hand at a patch: your project is so comfortable that it would be a real shame 
to drop it for performance reasons.
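
To illustrate the GROUP BY point, here is a minimal sketch that uses Python's built-in sqlite3 module rather than django-cube itself. The `log` table, its `country` column and both helper functions are hypothetical names chosen for the example. It only shows that a single grouped query can return the same counts as one query per dimension value.

```python
import sqlite3

# Hypothetical log table; the names are illustrative, not django-cube's schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE log (id INTEGER PRIMARY KEY, country TEXT)")
conn.executemany(
    "INSERT INTO log (country) VALUES (?)",
    [("FR",), ("FR",), ("IT",), ("DE",), ("IT",), ("FR",)],
)


def counts_one_query_per_value(conn):
    """Naive approach: one COUNT query for each distinct dimension value."""
    values = [row[0] for row in conn.execute("SELECT DISTINCT country FROM log")]
    return {
        value: conn.execute(
            "SELECT COUNT(*) FROM log WHERE country = ?", (value,)
        ).fetchone()[0]
        for value in values
    }


def counts_grouped(conn):
    """Grouped approach: a single query computes every count at once."""
    rows = conn.execute("SELECT country, COUNT(*) FROM log GROUP BY country")
    return dict(rows)


naive = counts_one_query_per_value(conn)
grouped = counts_grouped(conn)
```

Both helpers return the same mapping, but the naive one issues one query per distinct value plus one to list the values, while the grouped one issues a single query. In Django's ORM, this kind of grouping is usually expressed with `values()` followed by `annotate()`.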

Original comment by antonio.galea@gmail.com on 31 Jan 2011 at 3:04

GoogleCodeExporter commented 9 years ago
Hi Antonio,

I don't know whether you received my message, so I am re-posting it here:

Hi Antonio!

First of all, thanks for this very positive feedback!

I am off django-cube at the moment, but I plan to come back to it when I start 
a future project that requires it (a project that I will only start once my 
current one is completed, which should be in one or two months). I don't know 
if you have seen it, but there is a Google group for django-cube: 
http://groups.google.com/group/django-cube, and I think that is the place to 
discuss this!
I have indeed planned for a long time to refactor django-cube in order to have 
something that is actually efficient. My first priority for this project was to 
have something that is easy and nice to use. Now that this is done, I have 
very big plans and too little time to accomplish them. The first thing that 
I will do is a complete refactoring. I would like django-cube to provide 
several Cube classes (which won't be cubes anymore, in order to fit better 
with the Django ORM) with the same user interface, but that do different things 
under the hood:
    - one would do the same as now: naive generation of statistics (still needed because it is the simplest, and it can't fail)
    - one would factorize the queries (using GROUP BY, as you suggested)
    - the last one (probably the most efficient and quite simple to implement) would store the measures along with their dimensions in the database (the measures would be computed periodically, by a cron job for example).
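
The third variant can be sketched roughly as follows, again with the built-in sqlite3 module instead of django-cube. The `log` and `measure` tables and the `refresh_measures` and `get_measure` helpers are hypothetical; this is only one possible shape under those assumptions, not the planned implementation.

```python
import sqlite3

# Hypothetical schema: raw log rows plus a table of precomputed measures.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE log (id INTEGER PRIMARY KEY, country TEXT)")
conn.execute(
    "CREATE TABLE measure (dimension TEXT PRIMARY KEY, value INTEGER NOT NULL)"
)
conn.executemany("INSERT INTO log (country) VALUES (?)", [("FR",), ("FR",), ("IT",)])


def refresh_measures(conn):
    """Recompute every measure; meant to be run periodically (e.g. from cron)."""
    with conn:  # commit on success, roll back on error
        conn.execute("DELETE FROM measure")
        conn.execute(
            "INSERT INTO measure (dimension, value) "
            "SELECT country, COUNT(*) FROM log GROUP BY country"
        )


def get_measure(conn, dimension):
    """Read a stored measure without scanning the raw log table."""
    row = conn.execute(
        "SELECT value FROM measure WHERE dimension = ?", (dimension,)
    ).fetchone()
    return row[0] if row else 0


refresh_measures(conn)
before = get_measure(conn, "IT")
conn.execute("INSERT INTO log (country) VALUES ('IT')")
stale = get_measure(conn, "IT")  # unchanged until the next refresh
refresh_measures(conn)
fresh = get_measure(conn, "IT")
```

Reads become a primary-key lookup, at the cost of data that is only as recent as the last refresh.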
The second thing I plan to do is a chart library for Django based on django-cube.

I would of course be very happy to get your help on any of those tasks. If you do wish to help, I can probably find some time to work on it as well. That way we could probably get things done quite quickly.

Cheers,

Sébastien

Original comment by seb...@gmail.com on 2 Feb 2011 at 12:16

GoogleCodeExporter commented 9 years ago
Hello Sébastien,
it is nice to hear that you have big plans for DjangoCube: it is
really a nice addition to the tools I've used for my Django-powered
apps. As for the lack of time to accomplish everything,
no worries: you are in pretty good company there :-)

I hope to find some time for the query factorization (GROUP BY) variant,
since I fear I will need it soon.

Storing the measures in the database is an easy approach, even if it is
probably overkill for many projects. The only drawback I see is that this
way you have no real-time data.

The chart library is a very interesting idea, and here I might help you a
bit more. In the past I've used a wide range of approaches: from
server-side ones like RRDTools, jpGraph and Google Charts, to
client-rendered graphs like OpenFlashCharts, gRaphael and Flot... and
even custom plotting code in PHP/GD, Python/PIL and JS.
A generic Django solution leveraging some JS charting library would
be a nice project in itself, BTW.

Knowing that I won't be duplicating your efforts is a first step;
I'll try to get some work done in the query grouping area, then
report back to you.

Thanks a lot for your ideas, code and collaboration!

Antonio

Original comment by antonio.galea@gmail.com on 2 Feb 2011 at 10:42

GoogleCodeExporter commented 9 years ago

Original comment by antonio.galea@gmail.com on 3 Feb 2011 at 8:45