pallets-eco / flask-admin

Simple and extensible administrative interface framework for Flask
https://flask-admin.readthedocs.io
BSD 3-Clause "New" or "Revised" License
5.79k stars 1.57k forks source link

Filtering Against List of Relations #501

Closed gordol closed 10 years ago

gordol commented 10 years ago

This isn't so much an issue, as much as a question/concern. What I have now is working great, and I couldn't find any examples of this anywhere, so I thought I'd share here, and possibly get some feedback.

Basically I have Users, and Artists, and Users can follow Artists. I want to be able to filter and only show Artists that are followed.

User model has a list field of references to the Artist model.

class User(db.Document, UserMixin):
...
    following = db.ListField(db.ReferenceField('Artist', reverse_delete_rule = db.PULL))
...

I'm wondering if there's a better way to go about this:

  148 class FilterArtistsFollowed(BooleanEqualFilter):
  149     def apply(self, query, value):
  150         if value == '1':
  151             res = models.User.objects.exec_js("""
  152             function(){ 
  153                 artists = []; 
. 154                 //find all artists who are followed
  155                 db.user.find().forEach(
  156                     function(user){ 
  157                         user.following.forEach(
  158                             function(oid){ 
  159                                 if(artists.indexOf(oid) === -1){ 
  160                                     artists.push(String(oid)); 
  161                                 } 
  162                             }
  163                         ); 
  164                     }
  165                 ); 
  166 
  167                 res = []
  168                 //filter artist results against followed artists
  169                 db.artist.find(query).forEach(
  170                     function(artist){  
  171                         if(artists.indexOf(String(artist['_id'])) > -1){ 
  172                             res.push(artist['_id']); 
  173                         } 
  174                     }
  175                 ); 
  176 
  177                 return res;
  178             } 
  179             """)
  180         else:
  181             res = models.User.objects.exec_js("""
  182             function(){ 
  183                 artists = []; 
  184                 //find all artists who are followed
  185                 db.user.find().forEach(
  186                     function(user){ 
  187                         user.following.forEach(
  188                             function(oid){ 
  189                                 if(artists.indexOf(oid) === -1){ 
  190                                     artists.push(String(oid)); 
  191                                 } 
  192                             }
  193                         ); 
  194                     }
  195                 ); 
  196 
  197                 res = []
  198                 //filter artist results against followed artists
  199                 db.artist.find(query).forEach(
  200                     function(artist){ 
  201                         if(artists.indexOf(String(artist['_id'])) === -1){ 
  202                             res.push(artist['_id']); 
  203                         } 
  204                     }
  205                 ); 
  206 
  207                 return res;
  208             } 
  209             """)
  210 
  211         return query.filter(id__in=res)
  212 
  213     def operation(self):
  214         return gettext('followed')

Any help or suggestions are greatly appreciated! <3

Maybe something with aggregation and/or map_reduce?

mrjoes commented 10 years ago

In relational database it is easy, in MongoDB you can also use two queries:

  1. Get list of all followed artists id arrays
  2. Merge arrays on the server
  3. Use $in query to find artist information

However, this is not scalable at all, but will work exactly how your algorithm. Probably a little bit less of code as well.

map/reduce might work, but MongoDB has certain scalability problems with it.

gordol commented 10 years ago

Yep, it's running that on the Mongo server currently. I had tried to simply concat() but it kept throwing back an empty array for some reason, so instead of digging deeper I elected to just use the indexOf() method. It should only be using two queries already... I believe, well, plus two more for the admin panel to count for pagination and de-reference all the ObjectIDs, but the reverse lookup is only two queries that are both running on the backend. I currently have 3 users, who are together following 281 artists, of 11,807 artists total. I'm developing on a Chromebook locally (MongoDB running locally too) and query time to get all followed artists is 474ms, and query time to get all non-followed artists is 581ms. This query wont' be user-facing, nor blocking... admin only, and backend async workers, so I'm not too-too concerned.

Thanks for your input! I hope this is able to help steer someone else in the right direction as well.

gordol commented 10 years ago

Worst case, later, if I have issues scaling, I'll probably just use pre-save hooks on the database to keep a flat list of followed artists up-to-date on the fly as artists are followed/unfollowed, or even better, I'll just break out the user->artist relation into it's own table/document something like the following, instead of storing the User's followed artists inside the User model.

class Followers(db.Document):
   artist = db.ReferenceField('Artist', reverse_delete_rule = db.CASCADE)
   users = db.ListField(db.ReferenceField('User', reverse_delete_rule = db.PULL))

Which should let us query very efficiently, I'd imagine.

gordol commented 10 years ago

Or I'll just move the Users/Artists into a relational DB and only use Mongo for the data from the various APIs that we're integrating. :) Anyway, thanks again.