Closed 0cv closed 1 year ago
@0cv Thank you for your contribution!!
The automatic recursive query call is a good idea.
But I think this change has a disadvantage in memory usage, because query or query_all hold the entire collection's data in memory.
I'd like us to be able to choose on rustforce between the memory-efficient way (manual recursive calls) and the easy way (automatic).
What are your thoughts on that?
@tzmfreedom I don't have such a use case; I use this library (or similar ones) for ETL tools or data loading. Memory is not a concern to me, but keeping the logic simple and clear is.
If you really want to retrieve records granularly, one batch at a time, you could just use the current get method and then call nextRecordsUrl whenever it fits, in a similar way to what this PR is doing, but adding custom logic between each iteration.
That being said, at the moment querying "select id from account" will return a maximum of 2000 records, and that's definitely a bug, which this PR is addressing.
This PR has an issue though: batches are retrieved in the wrong order. I can push a commit fixing this.
Thanks @0cv and @tzmfreedom for working on this.
I'll chime in with a data point that I'd like the direct access to the batches. Coming from other uses of the Salesforce API I'm expecting to need to handle it anyway. And I'd rather a library like this not eagerly make an unbounded number of serial requests before reporting any status. In particular, I don't want a spurious network error in a later batch to prevent me from making progress on earlier batches.
In simple cases I can see it being helpful, so I support the helper being available as an option.
FWIW this is the patch I've been using myself:
index 7a4056b..11a630f 100644
--- a/src/client.rs
+++ b/src/client.rs
@@ -242,6 +242,18 @@ impl Client {
         }
     }
 
+    /// Fetch additional records from a previous query
+    pub async fn query_more<T: DeserializeOwned>(&self, next_records_url: &str) -> Result<QueryResponse<T>, Error> {
+        let url = format!("{}{}", self.instance_url.as_ref().unwrap(), next_records_url);
+        let res = self.get(url, vec![]).await?;
+
+        if res.status().is_success() {
+            Ok(res.json().await?)
+        } else {
+            Err(Error::ErrorResponses(res.json().await?))
+        }
+    }
+
     /// Find records using SOSL
     pub async fn search(&self, query: &str) -> Result<SearchResponse, Error> {
         let query_url = format!("{}/search/", self.base_path());
diff --git a/src/response.rs b/src/response.rs
index 83d3767..a39764f 100644
--- a/src/response.rs
+++ b/src/response.rs
@@ -8,6 +8,7 @@ use std::collections::HashMap;
 pub struct QueryResponse<T> {
     pub total_size: i32,
     pub done: bool,
+    pub next_records_url: Option<String>,
     pub records: Vec<T>,
 }
If you want to count the number of records of your query, you can do something like select count(id) from account. It will return one batch. And if you want to get the first X records of your query, you can use a limit, e.g. select id from account limit 200. In any case, this library currently implements an implicit limit, which to me is unexpected behaviour.
Closing this issue as there is no progress on it.
Typically, when more than 2000 records are returned, Salesforce will provide a nextRecordsUrl parameter that shall be used to retrieve the next batch.
The query and query_all functions now recursively query until Salesforce reports that it's done. https://developer.salesforce.com/docs/atlas.en-us.api_rest.meta/api_rest/dome_query.htm
query and query_all have been refactored into single-line functions calling query_with, which uses either the query or queryAll parameter. async-recursion has been added to the crates, which hides the implementation details (Box<dyn Future> etc.). However, alongside DeserializeOwned, the Send trait must now be added when querying to ensure thread safety, so this is a breaking change, albeit a mostly transparent one.