elpaso / django-dag

Basic portable Directed Acyclic Graph application for Django
66 stars 19 forks source link

query optimization #1

Open jsykora opened 13 years ago

jsykora commented 13 years ago

This isn't a bug or problem per se- but while doing some query optimizations in my project, I realized that the descendants_tree() (and I assume ancestor_tree and ans_set/des_set) command(s) uses a large number of queries to complete. I'm wondering if there is an easy way to cut down the number of queries for the nested structure. I had something in mind like a select_related() query that could at least cut down the number of db hits. For now, the rest of my project is too important to overlook, so it may be a while until I can get back to it and work on a pull request. Thanks for this project!

elpaso commented 13 years ago

I agree, there could be a problem with large datasets and deep trees.

I'm not going to fix it right now, but using a different approach (something like MPTT [1] does) might worth considering.

[1] https://github.com/django-mptt/django-mptt/

jsykora commented 13 years ago

Alright. I'll keep you updated if I come up with any improvements. This project is working great for my needs, thanks for your work.

hyperair commented 11 years ago

http://www.postgresql.org/docs/8.4/static/queries-with.html is an approach which would be very useful for this, and shouldn't be too hard to adapt for. However, it's not going to be portable -- MySQL and Sqlite don't support this query unless I'm mistaken.

I'm about to begin work on this. Given the non-portability of this, I'm thinking of forking this and renaming it (django-cte-dag?) so that django-dag users who need to deal with deep graphs can benefit from Postgresql's recursive queries, while django-dag can continue to serve users of other databases with simpler needs.

It's worth noting that for small to medium-sized graphs you could use transitive closures, and there's a library out there called django-directed-acyclic-graph which does that.

a1Gupta commented 7 years ago

Hi @elpaso, thanks for this package. Solves my use case with slight modifications. I want it to be scalable upto ~100K node and edges. I just read your comment -

I'm not going to fix it right now, but using a different approach (something like MPTT [1] does) might worth considering.

I am curious to know more about your approach.

Also, did anyone else work on optimizing queries?( I'm using MySQL database.)

elpaso commented 7 years ago

@a1Gupta Hi, I'm not actively working on this project anymore, but I'll be happy to review any PR, provided that it applies to general use cases of the library.

a1Gupta commented 7 years ago

@elpaso Hi, I am trying to find an efficient way to do it. I've added a post on reddit for the same. You can also add your ideas there.