bmuller / twistar

Twistar is an object-relational mapper (ORM) for Python that uses the Twisted library to provide asynchronous DB interaction.
http://findingscience.com/twistar

find() variant returning a generator? #52

Open janpascal opened 9 years ago

janpascal commented 9 years ago

Hi,

In my code I sometimes have to go through all objects of a certain type. DBObject.all() appears to instantiate every object up front in one large list. Would it be possible to have all() or find() return a generator, so that code like for object in Object.all(): object.doSomething() would hold only one instance of Object at a time, which could be garbage collected after each iteration of the loop?

Jan-Pascal

bmuller commented 9 years ago

Hey @janpascal - This is made a bit more complicated given the asynchronous nature of the queries. What sort of interface were you thinking of?

janpascal commented 9 years ago

Hi Brian,

I've only been using Twisted for a couple of weeks, so forgive me if I'm underestimating the difficulty here...

Maybe something along the lines of:

@inlineCallbacks
def f():
    ...
    gen = MyObject.all(generator=True)
    o = yield gen.next()
    while o is not None:
        o.doSomething()
        o = yield gen.next()

Would that be possible? gen.next() would return a Deferred that fires with either the next object, or None when there are no more.
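
To make the idea concrete, here is a minimal sketch of such a cursor. BatchCursor and fetch_page(offset, limit) are hypothetical names for illustration only; neither is part of twistar. The sketch uses asyncio from the standard library in place of Twisted so that it runs without extra dependencies. Under Twisted, next() would return a Deferred and the loop would sit inside an @inlineCallbacks function with yield in place of await.

```python
import asyncio


class BatchCursor:
    """Hypothetical cursor: hands out one row at a time, fetching pages lazily."""

    def __init__(self, fetch_page, page_size=2):
        self._fetch_page = fetch_page  # async callable: (offset, limit) -> list
        self._page_size = page_size
        self._offset = 0
        self._buffer = []
        self._exhausted = False

    async def next(self):
        """Return the next row, or None once every row has been consumed."""
        if not self._buffer and not self._exhausted:
            page = await self._fetch_page(self._offset, self._page_size)
            self._offset += len(page)
            # A short page means the table has no further rows.
            self._exhausted = len(page) < self._page_size
            self._buffer = list(page)
        return self._buffer.pop(0) if self._buffer else None


ROWS = ["alice", "bob", "carol"]  # stand-in for a database table


async def fake_fetch_page(offset, limit):
    await asyncio.sleep(0)  # pretend to wait on the database
    return ROWS[offset:offset + limit]


async def collect_names():
    cursor = BatchCursor(fake_fetch_page)
    seen = []
    row = await cursor.next()
    while row is not None:
        seen.append(row.upper())
        row = await cursor.next()
    return seen
```

Only one page of rows is held in memory at a time. Because None marks the end of the results, this shape only works when a row can never itself be None.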

Thanks

Jan-Pascal

bmuller commented 9 years ago

That could work, but it wouldn't be able to function like a true generator. For instance, you wouldn't be able to do:

for o in MyObject.all(generator=True):
    print(o.name)

It could maybe be done as a class method that takes a function that is called with a list of objects at a time (like find_in_batches in ActiveRecord). That would look something like:

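def print_names(olist):
    # Called once per batch with a list of objects.
    for o in olist:
        print(o.name)
    # may return a Deferred

def done(_result):
    # addCallback passes the Deferred's result, so accept (and ignore) it.
    print("Finished finding and printing everything")

MyObject.find_in_batches(print_names).addCallback(done)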

But it would have to be implemented carefully so that it doesn't blow out the call stack when the callback ends up being called a very large number of times.

janpascal commented 9 years ago

I like the find_in_batches pattern. It would suit my needs, and it looks like a clean interface, especially if you could give it an optional batch_size argument.
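
A rough sketch of how that could look with a batch_size argument. This is not twistar's API: find_in_batches here is a free function, and fetch_page(offset, limit) is a hypothetical stand-in for whatever query returns one page of rows. It is written with asyncio from the standard library in place of Twisted so it runs on its own; a Twisted version could use the same loop inside an @inlineCallbacks function.

```python
import asyncio
import inspect


async def find_in_batches(fetch_page, callback, batch_size=100):
    """Call `callback` with successive lists of at most `batch_size` rows.

    Returns the total number of rows processed.
    """
    offset = 0
    while True:  # a plain loop, so stack depth does not grow with batch count
        batch = await fetch_page(offset, batch_size)
        if not batch:
            return offset
        result = callback(batch)
        if inspect.isawaitable(result):  # the callback may itself be async
            await result
        offset += len(batch)
        if len(batch) < batch_size:  # short page: nothing left to fetch
            return offset


ROWS = list(range(7))  # stand-in for a database table


async def fake_fetch_page(offset, limit):
    await asyncio.sleep(0)  # pretend to wait on the database
    return ROWS[offset:offset + limit]


async def demo():
    batches = []
    total = await find_in_batches(fake_fetch_page, batches.append, batch_size=3)
    return total, batches
```

Each batch is fully handled before the next one is fetched, so at most batch_size objects are alive at once. Offset-based paging like this assumes the underlying rows are read in a stable order and are not inserted or deleted while the loop runs.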