yatish27 / salesforce_bulk_api

A Ruby client for Salesforce's bulk API
MIT License
77 stars 97 forks

Use csv format to download batches. #39

Open thomasdziedzic opened 9 years ago

thomasdziedzic commented 9 years ago

In the job file: https://github.com/yatish27/salesforce_bulk_api/blob/master/lib/salesforce_bulk_api/job.rb#L205

You are using XML to download batches of large data sets. This is not ideal, since Salesforce allows you to specify CSV [0] as the content type.

This would have the following pros:

  1. Simplifying the parsing code
  2. Smaller download size
  3. Less CPU time spent parsing
  4. Less memory used trying to hold the XML

[0] - http://www.salesforce.com/us/developer/docs/api_asynch/ under Bulk Query -> Bulk Query Details
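To illustrate the first point, here is a minimal sketch of what parsing a CSV batch result could look like with Ruby's standard `csv` library. The helper name `parse_csv_batch` and the sample rows are made up for illustration and are not part of the gem; the sketch only assumes that the result body is CSV text with a header row.

```ruby
require 'csv'

# Hypothetical helper (not part of salesforce_bulk_api): converts the CSV
# body of a batch result into an array of hashes keyed by column name.
def parse_csv_batch(body)
  CSV.parse(body, headers: true).map(&:to_h)
end

# Made-up sample data standing in for a downloaded batch result.
sample = <<~BATCH
  "Id","Name"
  "001xx0000000001","Acme"
  "001xx0000000002","Globex"
BATCH

records = parse_csv_batch(sample)
records.each { |record| puts "#{record['Id']}: #{record['Name']}" }
```

For very large batches, iterating row by row with `CSV.new(io, headers: true).each` instead of parsing the whole body at once would probably keep memory use lower still.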

yatish27 commented 9 years ago

I guess it would also simplify the parsing.

hoffmanilya commented 9 years ago

The performance gains from switching to CSV are substantial. In some simple benchmarking I found that querying 25k sObjects takes ~83% less time (14s vs. 86s) and consumes ~95% less memory (29MB vs. 554MB).
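The percentages follow directly from the raw measurements quoted above. A small sketch of the arithmetic, where `percent_reduction` is just an illustrative helper name:

```ruby
# Illustrative helper: percentage by which `after` is smaller than `before`.
def percent_reduction(before, after)
  ((before - after) / before.to_f * 100).round(1)
end

time_saved   = percent_reduction(86, 14)   # seconds, XML vs. CSV
memory_saved = percent_reduction(554, 29)  # megabytes, XML vs. CSV

puts "time: #{time_saved}% less, memory: #{memory_saved}% less"
```

This gives roughly 83.7% less time and 94.8% less memory, consistent with the rounded figures in the comment.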

yatish27 commented 9 years ago

Can you raise a PR for CSV?

On Sat, Aug 15, 2015 at 5:44 PM, Ilya Hoffman notifications@github.com wrote:

The performance gains from switching to CSV are substantial. In some simple benchmarking I found that querying 25k sObjects takes ~83% less time (14s vs. 86s) and consumes ~95% less memory (29MB vs. 554MB).