Closed synfinatic closed 12 years ago
Finally figured out the PG driver issue and was able to do some testing. Seems like it has the same basic problem, but not quite as bad. ~50sec to load 890 records when using the psql tool takes a fraction of a second. The only good news here is that AR actually caches the result so subsequent queries are very fast (it doesn't do that for Oracle).
I'm happy to try more tests to diagnose the root cause if you think of anything.
Thanks, Aaron
PG config, test script and profile:
rtg: adapter: jdbc driver: org.postgresql.Driver url: jdbc:postgresql://xxxxx/rtg encoding: unicode username: xxxx password: xxxxx host: xxxxx
require File.dirname(FILE) + '/../config/environment'
a = Rtg::Router.all puts a.size
exit 0
jruby --profile ./script/test2.rb Profiling enabled; ^C shutdown will now dump profile info 889
main thread profile results: Total time: 72.03
total self children calls method
54.40 0.04 54.37 7819 Class#new
54.11 0.06 54.05 26253 Kernel#send
52.65 0.07 52.58 3047 Proc#call
51.22 0.00 51.22 2300 BasicObject#__send__
50.77 0.12 50.65 577822 Hash#[]
50.42 0.00 50.42 1 ActiveRecord::Base.all
50.42 0.00 50.42 1 ActiveRecord::FinderMethods#all
50.42 0.00 50.42 1 ActiveRecord::Relation#to_a
50.22 0.02 50.20 17480 Hash#default
50.13 0.00 50.13 4 ActiveRecord::ConnectionAdapters::ConnectionPool#with_connection
50.13 0.00 50.13 1 ActiveRecord::Locking::Optimistic::ClassMethods#locking_enabled?
50.13 0.00 50.13 1 ActiveRecord::Base.columns_hash
49.46 0.00 49.46 46 MonitorMixin#mon_synchronize
49.46 0.00 49.46 7 ActiveRecord::ConnectionAdapters::ConnectionPool#connection
49.46 0.00 49.46 2 ActiveRecord::ConnectionAdapters::ConnectionPool#checkout
49.46 0.00 49.46 2 Kernel#loop
49.46 0.00 49.46 1 ActiveRecord::ConnectionAdapters::ConnectionPool#checkout_new_connection
49.46 0.00 49.46 1 ActiveRecord::ConnectionAdapters::ConnectionPool#new_connection
49.46 0.00 49.46 1 ActiveRecord::Base.jdbc_connection
49.46 0.00 49.46 1 ActiveRecord::ConnectionAdapters::JdbcAdapter#initialize
48.90 0.00 48.90 1 ActiveRecord::ConnectionAdapters::JdbcConnection#initialize
48.58 46.30 2.28 1 ActiveRecord::ConnectionAdapters::JdbcConnection#set_native_database_types
21.51 0.18 21.32 4475 Kernel#require
20.05 0.00 20.05 152 Kernel#require
18.22 0.33 17.88 14245 Array#each
8.30 0.03 8.27 999 ActiveSupport::Dependencies#new_constants_in
8.26 0.01 8.25 997 ActiveSupport::Dependencies::Loadable#load_dependency
6.82 0.01 6.81 986 ActiveSupport::Dependencies::Loadable#require
5.84 0.00 5.84 1 Bundler.require
5.84 0.00 5.84 1 Bundler::Runtime#require
5.33 0.00 5.33 2 Bundler.setup
4.62 0.00 4.62 1 Bundler::Runtime#setup
3.94 0.00 3.94 2 Rails::Railtie::Configurable::ClassMethods#method_missing
3.94 0.00 3.94 1 Rails::Application#initialize!
3.94 0.00 3.94 1 Rails::Initializable#run_initializers
3.78 0.00 3.78 66 Rails::Initializable::Initializer#run
3.78 0.01 3.77 68 BasicObject#instance_exec
3.45 0.00 3.45 1 Bundler::Environment#requested_specs
3.45 0.00 3.45 1 Bundler::Definition#requested_specs
3.45 0.00 3.45 1 Bundler::Definition#specs_for
3.44 0.01 3.44 115 Array#reverse_each
3.39 0.00 3.39 1 Bundler::Definition#specs
3.27 0.01 3.26 3 Bundler::Definition#resolve
2.93 0.02 2.91 2269 Kernel#catch
2.44 0.01 2.43 13 Kernel#load
2.35 0.00 2.35 11 ActiveSupport::Dependencies::Loadable#load
2.23 0.00 2.23 1 ActiveRecord::ConnectionAdapters::JdbcTypeConverter#choose_best_types
2.02 0.00 2.02 1 Bundler::Resolver.resolve
2.02 0.00 2.02 1 Bundler::Resolver#start
2.02 0.00 2.02 86 Bundler::Resolver#resolve
I'm a bit unclear. Are you saying ARJDBC is significantly slower than ActiveRecord in MRI, or just that using ActiveRecord (and you happen to be using ARJDBC as your adapter of choice) is slower than you'd expect it to be?
The latter. I haven't compared it vs MRI, but I have against non-ruby JDBC applications such as DbVisualizer, Perl DBI applications and the cli tools so I know it's not the Oracle DB/PG databases.
I could prolly whip a MRI backed script to compare if you felt that was necessary.
Just be aware that a tiny script like this is going to be significantly slower than the performance you might expect, since Java performance relies heavily upon runtime profiling and (re)compilation. For instance, the following script (with some setup elided) shows that successive runs get much faster:
def time start = Time.now yield puts "Time elapsed: #{Time.now - start}" end def run_test time do a = User.all puts a.size end end run_test run_test run_test
1001 Time elapsed: 0.935 1001 Time elapsed: 0.212 1001 Time elapsed: 0.19
Granted, 60+ seconds for less than 1000 records is definitely unexpected, but I can't reproduce that here. Can you share a schema.rb that is exhibiting this sort of poor performance?
I don't have a schema.rb- I'm using AR to interface with existing DB's which weren't originally a part of any rails app. Ncp::Switch is actually just a view so that AR doesn't have to worry about any joins/etc (I'm using the Oracle database as a read-only data source, so this works well).
I'm aware that jruby can be slower initially. I've seen that with my jruby/rails application where I'm querying Cassandra via Hector to build RRD graphs via RRD4J (two java libraries). However, it's multiple orders of magnitude faster then I'm seeing here with JDBC/AR.
For comparison sake, I ran the code in my test.rb (Oracle driver) in a loop, much like you have above (I used Benchmark.measure) and got the following results:
749 Test #1: 95.447000 0.000000 95.447000 ( 95.447000) 749 Test #2: 89.303000 0.000000 89.303000 ( 89.302000) 749 Test #3: 89.563000 0.000000 89.563000 ( 89.563000) 749 Test #4: 89.372000 0.000000 89.372000 ( 89.372000) 749 Test #5: 89.466000 0.000000 89.466000 ( 89.466000)
So yeah, there's an improvement, but still too painful to be useful.
Again, I can only assume that it has something to do with the code in your particular model. It's quite odd to me that set_native_database_types is taking the bulk of the time in the most recent profile you posted. Anything you're able to share about the schema (in any format, SQL is fine) and/or some sample data would be helpful.
Trimmed a little out of the classes for readability and also realized I lied- there are joins, but confirmed that it didn't affect performance.
ncp_database: adapter: jdbc driver: oracle.jdbc.OracleDriver url: jdbc:oracle:thin:@somehost:1521:ncpdev username: xxxxxx password: xxxxxx
class Ncp::Base < ActiveRecord::Base self.abstract_class = true establish_connection LOCAL_CONFIG['ncp_database'] end
module Ncp class Switch < Ncp::Base
set_sequence_name 'ENTITIES_SEQ'
before_create :set_primary_id
private
def set_primary_id
#why? 'switches' is a view; the oracle connection adapter gets confused
#by this and thinks it cannot fetch a primary key sequence value
#instead it tries to use a RETURNING INTO
#prefetching isn't configurable at all, so we just do it via a
#before_create hook
self.id = connection.next_sequence_value(self.class.sequence_name)
end
end end
Trying to figure out a way to dump the schema for the view in a readable format... my oracle skills are weak.
ah, this works: http://synfin.net/misc/ncp_switches.html
Well, I'm still working on this, but no big wins yet. It's obviously related to querying a view rather than a table (if I access an identical schema directly via the table, it returns within a couple seconds), so if you could get some meaningful data via liberal use of set_table_name in ActiveRecord, that may be an work-around for the time being.
So i finally figured out the issue. Since this is querying a view rather then a normal table, AR is trying to figure out the primary key by asking the DB. The answer is nothing which isn't cached and so it generates the same query for each and every record retrieved.
Adding "set_primary_key :id" to my model fixed the issue.
Yikes. Ok. Closing this as an application issue, hopefully this will help others.
I tried testing this also with Postgres, but for some reason couldn't get the driver to connect to a remote server (kept trying to use localhost).
Anyways, here's the profile:
!/usr/bin/env jruby
require File.dirname(FILE) + '/../config/environment'
a = Ncp::Switch.all puts a.size
exit 0
jruby --profile ./script/test.rb Profiling enabled; ^C shutdown will now dump profile info 749
main thread profile results: Total time: 117.61
List of installed gems via 'bundle list':