ilyakard / pacer-titan

Pacer adapter for Titan distributed graph DB
Other
5 stars 2 forks source link

Proposed patch for key indices #2

Closed dstuebe closed 8 years ago

dstuebe commented 8 years ago

Working on a project at swipely we are trying to optimize performance inserting vertices and found that we are getting some pathologic behavior with key indices

Here is some profile results from a small test case - the 3.87 seconds in Java::ComThinkaureliusTitanGraphdbBlueprints::TitanBlueprintsGraph#getIndexedKeys is a real performance killer on larger data sets.

With the Patch:

$ be ruby --profile script/insert_tickets.rb
Profiling enabled; ^C shutdown will now dump profile info
LockJar#lock has been disabled
LockJar#load has been disabled
Inserting using event Ticket
I 151213-13:24:25.878 17452_24074: [CrmKinesis::EventHandlers::Ticket] Starting worker
Opening: development
I 151213-13:24:27.320 17452_24074: [CrmKinesis::EventHandlers::Ticket] Opening graph
I 151213-13:24:27.321 17452_24074: [CrmKinesis::EventHandlers::Ticket] Inserting tickets...
I 151213-13:24:27.322 17452_24074: [CrmKinesis::EventHandlers::Ticket] Inserting tickets for merchant on 2015-11-02
I 151213-13:24:38.940 17452_24074: [CrmKinesis::EventHandlers::Ticket] Inserted 208 tickets
I 151213-13:24:38.941 17452_24074: [CrmKinesis::EventHandlers::Ticket] Committed graph
I 151213-13:24:38.943 17452_24074: [CrmKinesis::EventHandlers::Ticket] MetricsLine - dates: 2015-11-02..2015-11-02; Store: merchant; content: 208 tickets; btime: 11.62; ctime: 0.0; edges: 1705; vertices: 669; edge_properties: 3597; vertex_properties: 4266
tada!

main profile results:
Total time: 19.00

     total        self    children       calls  method
----------------------------------------------------------------
     15.45        0.06       15.39       97203  Array#each
     13.08        0.00       13.08           1  TicketInserter#insert
     13.07        0.00       13.07           1  TicketInserter#handle_messages
     11.62        0.00       11.62           1  CrmKinesis::EventHandlers::Ticket#process
     11.62        0.00       11.62        1706  Pacer::GraphTransactionsMixin.transaction
     11.62        0.00       11.62          18  Range#each_with_time_with_zone
     11.62        0.00       11.62          18  Range#each
     11.62        0.00       11.62           1  CrmKinesis::EventHandlers::Ticket#insert_store_day
     10.21        0.09       10.12       23371  Array#map
      9.19        0.01        9.18         208  CrmKinesis::EventHandlers::Ticket#create_vertices
      5.44        0.05        5.39        3751  Kernel.require
      4.90        0.07        4.83        4385  Pacer::Core::Route.each
      4.55        0.11        4.44       93495  Class#new
      3.58        0.00        3.57        1855  Enumerable.first
      3.18        0.09        3.08      190270  Kernel.send
      2.57        0.00        2.57         162  Kernel.require
      2.55        0.07        2.48       10383  Proc#call
      2.50        0.11        2.39       13806  Java::ComTinkerpopPipes::AbstractPipe#next
      2.42        0.00        2.42           1  CrmKinesis::EventHandlers::Ticket#tickets_for_store_day
      2.39        0.03        2.36       10188  Pacer::Pipes::WrappingPipe#processNextStart
      2.27        0.00        2.27           1  Bundler.setup
      2.13        0.00        2.13          39  Enumerator#each
      2.13        0.00        2.13           1  Enumerable.flat_map
      2.13        0.00        2.13           1  Aws::Resources::Collection#each
      2.13        0.00        2.13           1  Aws::Resources::Operations::HasManyOperation#all_batches
      2.03        0.01        2.02         208  CrmKinesis::EventHandlers::Ticket#find_or_create_ticket_vertex
      1.98        0.09        1.88       25671  Hash#each
      1.62        0.00        1.62           1  Bundler::Runtime#setup
      1.52        0.02        1.50        4593  Pacer::Core::Route.iterator
      1.50        0.02        1.48        1705  Guestly::VertexBuilder::VertexMethods.add_edges_to
      1.50        0.00        1.50           1  Bundler::Environment#requested_specs
      1.50        0.00        1.50           1  Bundler::Definition#requested_specs
      1.50        0.00        1.50           1  Bundler::Definition#specs_for
      1.50        0.01        1.49         614  CrmKinesis::EventHandlers::Ticket#find_or_create_item_vertex
      1.48        0.02        1.46        1705  Pacer::Core::Graph::VerticesRoute.add_edges_to
      1.46        0.00        1.46           2  Aws::Resources::Request#call
      1.44        0.00        1.44           1  TicketInserter#ticket_handler
      1.44        0.00        1.44           1  CrmKinesis::EventHandlers::Ticket#initialize
      1.44        0.00        1.44           1  Guestly.open_graph
      1.44        0.00        1.44           1  Pacer.titan
      1.44        0.00        1.44           1  Pacer::PacerGraph#initialize
      1.44        0.00        1.44           1  Pacer::PacerGraph#reopen
      1.43        0.00        1.43           2  Seahorse::Client::Request#send_request
      1.43        0.00        1.43           2  Seahorse::Client::Plugins::ResponseTarget::Handler#call
      1.42        0.00        1.42           2  Aws::Plugins::ParamConverter::Handler#call
      1.42        0.00        1.42           2  Aws::Plugins::S3SseCpk::Handler#call
      1.42        0.00        1.42           2  Seahorse::Client::Plugins::RaiseResponseErrors::Handler#call
      1.42        0.00        1.42           2  Aws::Plugins::ParamValidator::Handler#call
      1.42        0.00        1.42           2  Seahorse::Client::Plugins::Endpoint::Handler#call
      1.42        0.00        1.42           2  Aws::Plugins::UserAgent::Handler#call

Without the Patch:

$ be ruby --profile script/insert_tickets.rb
Profiling enabled; ^C shutdown will now dump profile info
LockJar#lock has been disabled
LockJar#load has been disabled
Inserting using event Ticket
I 151213-13:14:36.353 16558_24074: [CrmKinesis::EventHandlers::Ticket] Starting worker
Opening: development
I 151213-13:14:37.831 16558_24074: [CrmKinesis::EventHandlers::Ticket] Opening graph
I 151213-13:14:37.832 16558_24074: [CrmKinesis::EventHandlers::Ticket] Inserting tickets...
I 151213-13:14:37.832 16558_24074: [CrmKinesis::EventHandlers::Ticket] Inserting tickets for merchant on 2015-11-02
I 151213-13:14:52.572 16558_24074: [CrmKinesis::EventHandlers::Ticket] Inserted 208 tickets
I 151213-13:14:52.574 16558_24074: [CrmKinesis::EventHandlers::Ticket] Committed graph
I 151213-13:14:52.575 16558_24074: [CrmKinesis::EventHandlers::Ticket] MetricsLine - dates: 2015-11-02..2015-11-02; Store: merchant; content: 208 tickets; btime: 14.741; ctime: 0.0; edges: 1705; vertices: 669; edge_properties: 3597; vertex_properties: 4266
tada!

main profile results:
Total time: 22.34

     total        self    children       calls  method
----------------------------------------------------------------
     18.64        0.07       18.57       97203  Array#each
     16.24        0.00       16.24           1  TicketInserter#insert
     16.22        0.00       16.22           1  TicketInserter#handle_messages
     14.74        0.00       14.74           1  CrmKinesis::EventHandlers::Ticket#process
     14.74        0.00       14.74        1706  Pacer::GraphTransactionsMixin.transaction
     14.74        0.00       14.74          18  Range#each_with_time_with_zone
     14.74        0.00       14.74          18  Range#each
     14.74        0.00       14.74           1  CrmKinesis::EventHandlers::Ticket#insert_store_day
     13.19        0.07       13.12       23371  Array#map
     12.11        0.01       12.10         208  CrmKinesis::EventHandlers::Ticket#create_vertices
      5.60        0.05        5.55        3745  Kernel.require
      4.58        0.06        4.53        4385  Pacer::Core::Route.each
      4.57        0.10        4.46       94732  Class#new
      4.07        0.00        4.07         208  Pacer::Core::Graph::GraphIndexRoute.v
      4.04        0.01        4.04         208  Pacer::Titan::Graph#indexed_route
      3.95        0.00        3.95         624  Pacer::Titan::Graph#key_indices
      3.95        0.01        3.94         624  Pacer::PacerGraph::KeyIndices.key_indices
      3.87        3.87        0.00         624  Java::ComThinkaureliusTitanGraphdbBlueprints::TitanBlueprintsGraph#getIndexedKeys
      3.39        0.01        3.38        1855  Enumerable.first
      3.35        0.09        3.26      190270  Kernel.send
      3.19        0.05        3.14       19497  Array#select
      2.83        0.00        2.83         208  Pacer::Titan::Graph#indexed_properties
      2.63        0.00        2.63           1  CrmKinesis::EventHandlers::Ticket#tickets_for_store_day
      2.59        0.00        2.59         162  Kernel.require
      2.37        0.00        2.37          39  Enumerator#each
      2.37        0.00        2.37           1  Enumerable.flat_map
      2.37        0.00        2.37           1  Aws::Resources::Collection#each
      2.37        0.00        2.37           1  Aws::Resources::Operations::HasManyOperation#all_batches
      2.32        0.10        2.22       13806  Java::ComTinkerpopPipes::AbstractPipe#next
      2.32        0.00        2.32           1  Bundler.setup
      2.22        0.03        2.20       10188  Pacer::Pipes::WrappingPipe#processNextStart
      2.03        0.05        1.99       10382  Proc#call
      1.95        0.09        1.86       25671  Hash#each
      1.94        0.01        1.93         208  CrmKinesis::EventHandlers::Ticket#find_or_create_ticket_vertex
      1.67        0.00        1.67           1  Bundler::Runtime#setup
      1.63        0.00        1.63           2  Aws::Resources::Request#call
      1.63        0.00        1.63           2  Seahorse::Client::Request#send_request
      1.63        0.00        1.63           2  Seahorse::Client::Plugins::ResponseTarget::Handler#call
      1.62        0.00        1.62           2  Aws::Plugins::ParamConverter::Handler#call
      1.62        0.00        1.62           2  Aws::Plugins::S3SseCpk::Handler#call
      1.62        0.00        1.62           2  Seahorse::Client::Plugins::RaiseResponseErrors::Handler#call
      1.62        0.00        1.62           2  Aws::Plugins::ParamValidator::Handler#call
      1.62        0.00        1.62           2  Seahorse::Client::Plugins::Endpoint::Handler#call
      1.62        0.00        1.62           2  Aws::Plugins::UserAgent::Handler#call
      1.62        0.00        1.62           2  Aws::Rest::Handler#call
      1.53        0.00        1.53           1  Bundler::Environment#requested_specs
      1.53        0.00        1.53           1  Bundler::Definition#requested_specs
      1.53        0.00        1.53           1  Bundler::Definition#specs_for
      1.50        0.00        1.50           2  Aws::Plugins::S3BucketDns::Handler#call
      1.50        0.00        1.50           2  Aws::Plugins::S3Expect100Continue::Handler#call

An alternative to patching the default graph would be to inherit and create a MemoizedGraph so that the user can choose which one they want - though I see no need to do so.

The outstanding question is how and when to decide to invalidate the cache of indices?

ilyakard commented 8 years ago

Thanks for that. I guess Titan isn't particularly flexible towards schema changes so we may as well require an application server restart to refresh the indices cache for now.