hmarr / mongoengine

[Moved to mongoengine/mongoengine]
https://github.com/MongoEngine/mongoengine
MIT License

Performance Issue in ComplexBaseField (v0.5) #323

Closed manasgarg closed 12 years ago

manasgarg commented 12 years ago

I have come across a performance issue in ComplexBaseField. The __get__ method of this field always performs dereferencing, even if the field was just accessed. This imposes a performance penalty on repeated access to the field.
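To illustrate the pattern described above, here is a minimal sketch in plain Python. It is not MongoEngine's code: `CostlyField`, `Match`, and the call counter are hypothetical names, and the descriptor simply redoes its work on every attribute read, which is the behaviour the report attributes to the field.

```python
class CostlyField:
    """Hypothetical descriptor that redoes its work on every read."""

    def __init__(self, name):
        self.name = name
        self.calls = 0  # counts how many times __get__ ran on an instance

    def __get__(self, instance, owner=None):
        if instance is None:
            return self  # accessed on the class: return the descriptor itself
        self.calls += 1
        # Stand-in for the dereferencing pass: rebuild the list each time.
        return list(instance.__dict__.get(self.name, []))

    def __set__(self, instance, value):
        instance.__dict__[self.name] = value


class Match:
    inning_list = CostlyField("inning_list")


match = Match()
match.inning_list = ["first innings", "second innings"]

# Reading the attribute inside the loop pays the cost on every iteration.
for i in range(100):
    _ = match.inning_list[i % 2]
reads_in_loop = Match.inning_list.calls

# Binding the value to a local name first pays the cost once.
innings = match.inning_list
for i in range(100):
    _ = innings[i % 2]
reads_hoisted = Match.inning_list.calls - reads_in_loop
```

Holding the value in a local name is only a caller-side workaround; whether it is safe depends on whether the code needs to see later changes to the field.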

Generally it may not be noticeable, but one of our objects has a significant amount of data in a ListField. Every access to that ListField takes about 30 msec (and it used to be free up to 0.4).

One specific operation in our system jumped from 8 seconds to 65 seconds after an upgrade to 0.5 as a result of this :)

Please let me know if more details are required.

Thanks

rozza commented 12 years ago

Hi @manasgarg, thanks for the report. It is something that has been raised, but if you have a test case illustrating the data, that would help.
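One way such a test case could measure the cost is to time repeated reads of a single attribute. The sketch below uses only the standard library; `time_attribute_reads` and the `Sample` class are hypothetical stand-ins, and a real measurement would pass a loaded document instance instead.

```python
import time


def time_attribute_reads(obj, attr_name, repeats=100):
    """Return the average number of seconds one getattr(obj, attr_name) takes."""
    start = time.perf_counter()
    for _ in range(repeats):
        getattr(obj, attr_name)
    elapsed = time.perf_counter() - start
    return elapsed / repeats


class Sample:
    """Hypothetical stand-in; replace with a loaded document instance."""

    def __init__(self):
        self.inning_list = list(range(1000))


avg = time_attribute_reads(Sample(), "inning_list", repeats=1000)
print(f"average per read: {avg * 1000:.4f} ms")
```

Running the same helper against the same data on both library versions would show whether the per-access cost changed.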

manasgarg commented 12 years ago

Hi @rozza,

The object represents a Cricket Match. Something like:

from mongoengine import (
    Document, EmbeddedDocument, EmbeddedDocumentField,
    IntField, ListField, ObjectIdField,
)

# create_list and the aggregate document classes are not shown here.

class Inning(EmbeddedDocument):
    balls_count = IntField(default=0)
    batting_team_id = ObjectIdField()
    bowling_team_id = ObjectIdField()

    delivery_list = ListField(ObjectIdField(), default=create_list)

    inning_aggregate = EmbeddedDocumentField(InningAggregate, default=InningAggregate())
    batsman_aggregates = ListField(EmbeddedDocumentField(BatsmanAggregate), default=create_list)
    bowler_aggregates = ListField(EmbeddedDocumentField(BowlerAggregate), default=create_list)
    wicket_fall_aggregates = ListField(EmbeddedDocumentField(WicketFall), default=create_list)
    partnership_aggregates = ListField(EmbeddedDocumentField(PartnershipAggregate), default=create_list)
    player_vs_player_aggregates = ListField(EmbeddedDocumentField(PlayerVsPlayerAggregate), default=create_list)
    over_aggregates = ListField(EmbeddedDocumentField(OverAggregate), default=create_list)

    striker_id = ObjectIdField()
    non_striker_id = ObjectIdField()
    curr_bowler_id = ObjectIdField()
    prev_bowler_id = ObjectIdField()

class CricketMatch(Document):
    inning_list = ListField(EmbeddedDocumentField(Inning))
    # Several other fields go here.

Accessing inning_list on CricketMatch is the slow part. As you'll notice, a lot of aggregates are computed and stored in the Inning object. The API call that takes so long is the one that recomputes all the aggregates.

rozza commented 12 years ago

Hi, this should be massively improved, as dereferencing now respects depth limits, so your Innings shouldn't be automatically dereferenced anymore.

This is currently in the dev branch.
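As a rough illustration of what a depth limit on dereferencing means, here is a sketch in plain Python. It is not the dev-branch implementation: `Ref`, `dereference`, and the `lookup` mapping are hypothetical, and depth here is assumed to count how many references have been followed.

```python
class Ref:
    """Hypothetical placeholder for a stored reference to another record."""

    def __init__(self, key):
        self.key = key


def dereference(value, lookup, max_depth=1, depth=0):
    """Resolve Ref objects inside nested lists/dicts, following at most
    max_depth references along any path. `lookup` maps a Ref key to a value."""
    if depth >= max_depth:
        return value  # limit reached: leave anything deeper untouched
    if isinstance(value, Ref):
        # Following a reference uses up one level of depth.
        return dereference(lookup[value.key], lookup, max_depth, depth + 1)
    if isinstance(value, list):
        return [dereference(v, lookup, max_depth, depth) for v in value]
    if isinstance(value, dict):
        return {k: dereference(v, lookup, max_depth, depth) for k, v in value.items()}
    return value


store = {
    "team": {"name": "A", "captain": Ref("player")},
    "player": {"name": "P"},
}
doc = {"teams": [Ref("team")]}

shallow = dereference(doc, store, max_depth=1)  # captain stays a Ref
deep = dereference(doc, store, max_depth=2)     # captain is resolved too
```

With a limit in place, the walk stops early instead of visiting every nested value on each access.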

manasgarg commented 12 years ago

Thanks Ross :)

On Fri, Nov 25, 2011 at 10:08 PM, Ross Lawley <reply@reply.github.com> wrote:

> Hi this should be massively improved as dereferencing now respects depth limits so your Innings shouldnt be automatically dereferenced anymore.
>
> This is currently in the dev branch,

