chicks / sugarcrm

A ruby based REST Client for SugarCRM
MIT License
90 stars 64 forks source link

Eager loading? #52

Open openhealth opened 13 years ago

openhealth commented 13 years ago

Hello,

I was wondering if there was a way to speed up requests? I will give an example...

mysugar::Contacts.all.each do |contact| puts "contact id: #{contact.id}" end

This works fast. Looks like only one request is done. But if I change the puts statement to be this:

puts "account id: #{contact.account_id}"

Then it looks like it does a separate request for each contact that is returned, because account_id is referring to a different model. The result is very slow. I also tried changing the request to be this:

mysugar::Contacts.all do |contact|

But the result was the same. In rails (active record) there is a thing called 'eager loading' which I tried but it didn't seem to make a difference. Is there any way to make the above request faster?

openhealth commented 13 years ago

Actually that is a bad example. The above actually does work fast. Here is a better example:

mysugar::Contacts.all do |contact| contact.accounts.each do |account| puts "account name is #{account.name}" end end

The above will go slow.

chicks commented 13 years ago

No eager loading at this time. We could probably implement it if you wanted, but I haven't had a use for it yet :)

Passing a block to a collection instead of using .each will result in smaller sets of records being fetched at a given time and can save you a ton of memory.

openhealth commented 13 years ago

Implementing that would be awesome :) I guess for the time being I could use the direct API method with get_entry_list...

davidsulc commented 13 years ago

Hi,

You might want to try using blocks, like this:

SugarCRM::Contacts.all{|c|
  c.accounts.each{|a|
    puts "account name is #{a.name}"
  }
}

The syntax is a little peculiar, so I'll clarify it here: in the first block, we're calling a finder so we can directly pass it a block (notice the absence of each). This isn't quite syntactic sugar: each will fetch the entire data set, then iterate on it, whereas if you pass a block directly each queried result set is yielded to the block. Therefore, in the second case, in addition to avoiding timeout errors, you'll start processing the data sooner. (If you want to read more about that, I've written a blog post about it here: http://davidsulc.com/blog/2011/04/03/ruby-gem-for-sugarcrm-the-basics/)

In the second block, we're querying an association, so we can't directly pass a block to it. Hence the use of each.

I'm not entirely sure eager loading will even be possible to implement: the API will only let us specify one module to query, so in order to eager load associations, we'd have to perform the same number of queries. E.g. once to get the contact info (e.g. the id attribute), then another query to load the associated accounts.

One thing we might be able to do is to fetch all the accounts linked to contacts (i.e. WHERE contact_id IN (list of ids)). But this could be really nasty on big data sets...

openhealth commented 13 years ago

Thanks for the reply.

The eager loading should be ok.. if you do something like this: SugarCRM::Contacts.find(:all, :include => :accounts_contacts) do |c| c.each{|a| puts "account name is #{a.name}" end end

Then in the API it just translates to something like this:

get_entry_list( $session_id, 'Accounts', '', '', [], # all account fields [ { 'name' => 'accounts_contacts', 'value' => [ # all of the contact fields ], } ] )

That way you'd get all required data with just one call.

As a side note, I am trying this: accounts = @sugar::connection.get_entry_list( 'Contacts', '', { :link_fields => [ { 'name' => 'accounts_contacts', 'value' => ['id','name', 'account_type','portal_name_c'], } ] } )

But the result set doesn't include the link field set (using sugar pro Version 6.1.3 (Build 5640)). This is probably related to issue #51.

openhealth commented 13 years ago

Sorry for the poor formatting!

openhealth commented 13 years ago

Actually maybe the bug is not related to issue #51, because note 51 says that it's only sugar 6.2 that had the problem, but this particular one is for version 6.1. I can file a new bug report if you want?

My code: accounts = @sugar::connection.get_entry_list( 'Contacts', '', { :link_fields => [ { 'name' => 'accounts_contacts', 'value' => ['id','name', 'account_type', 'portal_name_c'], } ] } )

JSON request:

get_entry_list: Request: { "session": "jjeirthh6b5hekpempojklep51", "module_name": "Contacts", "query": "", "order_by": "", "offset": "", "select_fields": ["assigned_user_id","assistant_phone","phone_home","date_modified","alt_address_postalcode","accept_status_id","team_set_id","picture","description","email1","title","primary_address_city","assigned_user_name","alt_address_state","reports_to_id","email2","report_to_name","primary_address_state","department","cec_bcse_c","assistant","sync_contact","email_and_name1","team_name","accept_status_name","phone_mobile","c_accept_status_fields","account_name","phone_other","alt_address_country","modified_by_name","last_name","designer_c","alt_address_city","account_id","name","do_not_call","date_entered","team_count","m_accept_status_fields","birthdate","swh_installer_c","created_by_name","phone_fax","alt_address_street_2","campaign_name","deleted","salutation","alt_address_street_3","primary_address_street_2","invalid_email","primary_address_street_3","email_opt_out","electrician_c","modified_user_id","created_by","first_name","opportunity_role_fields","team_id","opportunity_role_id","opportunity_role","lead_source","full_name","primary_address_country","primary_address_street","primary_address_postalcode","alt_address_street","phone_work","id","campaign_id"], "link_name_to_fields_array": [{"name":"accounts_contacts","value":["id","name","account_type","portal_name_c"]}], "max_results": "", "deleted": 0 }

Response:

get_entry_list: JSON Response: {"entry_list"=> [{"name_value_list"=> {"primary_address_postalcode"=> {"name"=>"primary_address_postalcode", "value"=>""}, "primary_address_street"=> {"name"=>"primary_address_street", "value"=>""},

"assistant"=>{"name"=>"assistant", "value"=>""}, "picture"=>{"name"=>"picture", "value"=>""}}, "id"=>"763be9fd-060a-65d2-cca5-4da52e11d27b", "module_name"=>"Contacts"}], "next_offset"=>4, "result_count"=>4, "relationship_list"=> [[{"name"=>"accounts_contacts", "records"=>[]}], [{"name"=>"accounts_contacts", "records"=>[]}], [{"name"=>"accounts_contacts", "records"=>[]}], [{"name"=>"accounts_contacts", "records"=>[]}]]}
openhealth commented 13 years ago

Actually it looks like the direct API method is working, I may have my module link field name mixed up (that's the thing with sugar - it's hard to tell what link name it is. Usually the vardefs tell you but not so with accounts <-> contacts). Am doing more testing.

openhealth commented 13 years ago

Ah the direct API method of get_entry_list works slightly different than sugar's get_entry_list. The gem won't return all data in the :link_fields result set (until you try to access it), but sugar's one will. Am not saying that this is right or wrong but just pointing out the difference.

chicks commented 13 years ago

You can query the relationship fields (to figure out what to pass in get_entry) with:

SugarCRM::Contact.new.associations.each {|a| puts a.target.to_s + ": " + a.link_field + " => " + a.proxy_methods.join(', ')}

I'll look into adding the eager loading functionality - I envision it would work something like this:

SugarCRM::Contact.all(:include => [:created_by])

Where :include references one of the proxy methods in the association array.

chicks commented 13 years ago

Yeah, I don't think we parse the contents of the :link_fields result set right now. That would have to be added as part of this. If you feel like getting your hands dirty, take a look at the to_obj method in https://github.com/chicks/sugarcrm/blob/master/lib/sugarcrm/connection/response.rb

That's where we parse the response and try to convert it into an object. You would need to modify that, and also update the finder methods to accept the :include option and just set the appropriate option on get_entry, get_entry_list, get_entries.

openhealth commented 13 years ago

OK no worries. I'll take a look but can't guarantee anything. I had a quick look just now and it seems that the result set is not returning the link fields which means (as you said) get_entr*() needs to be changed to return the linked records as well.

davidsulc commented 13 years ago

Heh. It looks like you're right about the API call (I should've paid more attention).

I've marked this as a feature request, so we don't forget about it (I can't get to this right now). (Chicks, whoever gets started working on it can assign it to himself.)

In the meantime, I'd be glad to assist you if you want to implement it yourself. The codebase can seem a little intimidating when you're new to it, so if you have questions about the code feel free to ask.

chicks commented 13 years ago

Yeah, for the record I'm not trying to con you into making code changes - you just seem knowledgeable enough that it might be an interesting opportunity :)

davidsulc commented 13 years ago

Yup. +1 on the "we're not con men" ;-)

openhealth commented 13 years ago

No probs :) I have am finishing off work for a client that needs to be done by the end of next week (basic functionality anyway) so I will take a look at it after that.

chicks commented 13 years ago

Sweet! Let us know if you need help with the code

openhealth commented 13 years ago

I've changed the code of the to_obj method slightly so that it iterates over the response["relationship_list"] array and assigns it to a "relate" array.

So that means that you can pass the link_fields as per normal and access the data via the module's name.relate array. However you still need to pass in the fields you would like returned to link_array.

Hmmmm actually just re-reading your above post about the proxy methods, that's nice that you can get the proper relationship name for the modules. So I think it would be good to do something like...

contacts = SugarCRM::Contact.all(:include => [:Call])

Which can then look up the name of the relationship and query it, and then the to_obj method can organise the data to be available as contacts.calls (which is an array). The thing is, I don't really have much experience with all this! I think I should rewrite what I did so that the above works, because as it is just now, it's just the link_fields that are returned properly. I'm not sure if I'd have to re-do what I have done if I was to make eager loading work. I will see if I get time to do it. It look me a bit of time to figure out what to do in the first place :)

If you want me to email you the changes then just let me know. Thanks.

chicks commented 13 years ago

The best way is to open a pull request! That way you get credit for the changes and myself or David can merge them super easy.

Also, let me know if you want a quick overview of the request/response cycle. I don't want you spending more time on this than you have to :)

davidsulc commented 13 years ago

Sweet! I definitely second the pull request, it makes merging super easy. If you need an overview, there's help forking a repository here http://help.github.com/fork-a-repo/ and opening a pull request here http://help.github.com/pull-requests/

Based on your work, it shouldn't be too big a leap to implement a syntax SugarCRM::Contact.all(:include => [:Call]) (aside from the problems caused by issue #44).

openhealth commented 13 years ago

Sorry for late reply. Have been super busy! OK I'll see if I can do this today some time.

openhealth commented 13 years ago

I am sorry again for the late answer, I have been away. I've finally created a pull request. I've been so busy lately and will probably continue to be busy so will see if I can continue this, but just not sure when I can do it. Anyway will see how I go.

chicks commented 13 years ago

No worries, we're all busy :)

I'll take a look at your code in the next few days. Thanks for the pull request!