libcpr / cpr

C++ Requests: Curl for People, a spiritual port of Python Requests.
https://docs.libcpr.org/
Other
6.59k stars 937 forks source link

MultiAsync #864

Closed rule-of-3 closed 1 year ago

rule-of-3 commented 1 year ago

Here we aim to create an extension of the cpr API that introduces multiple, asynchronous, cancellable requests, an asynchronous analogue to the existing MultiGet etc. part of the API. Internally, it is implemented using cpr's ThreadPool infrastructure. The creation of the appropriate tests is also in progress. Aims to close issue #257 , and the development follows the oultine discussed in the issue's comments.

COM8 commented 1 year ago

Thanks for working on this PR!

Roughly during the next week I will have some time to look into it a bit closer and give you detailed feedback on the general way of solving this issue.

COM8 commented 1 year ago

Some general feedback:

GloriousWizard commented 1 year ago

From looking at your tests it seem that MultiAsyncGet supports tuples and not vectors which is very limiting, unless I'm mistaken. Let's say 50 URLs were created during runtime and inserted into a std::vector<cpr::Url>, how would I pass the vector to the function?

rule-of-3 commented 1 year ago
* All publicly exposed functions start with a upper case letter. Private ones with a lower case one. I know this is not consequently done throughout the lib right now, but I'm planing a PR to fix this.

Thanks for the heads up, I was a bit perprexed regarding the naming conventions used! There are some AsyncWrapper member functions that do not follow this, and this will be corrected.

* Method calls have to be camel case, not separated with an `_`.

Concerning the functions here , these are the names and signatures used for member functions of std::future, as seen here. Changing their names would make AsyncWrapper not compatible with std::future and break compatibility with previous versions of cpr. There can be aliases to these methods, like AsyncWrapper::WaitFor( ... ) in case someone would want to follow a consistent naming pattern, I suppose? Removing these methods would cause, for example, this and many other callback tests to fail.

* You define `cpr_pf_arg_t` as `double` in case libcurl < `0x072000` is used. Is this intended, since `CURL_OFF_T` is of type `int64_t `?

So, this requires some explanation, which I didn't provide in comments or commit messages, and that's my bad. Basically, this aims to work as a replacement for this preprocessor branching here, which is repeated here, for example. As I understand it, the progress function gets doubles as arguments for versions prior to the one indicated, and curl_off_t aka int64_t afterwards. Using a type alias cpr_pf_arg_t we can avoid having two declarations for the function signature and improve readability, I think. Besides, I suspect that there should also be the same conditional signature here, and the error hasn't occurred because the tests don't compile with the old version of libcurl. With the type alias, this can be corrected with minimal effort, if it's indeed a problem. I'm planning to add tests for the functionality of AsyncWrapper now, and maybe something more regarding cancellation.

rule-of-3 commented 1 year ago

From looking at your tests it seem that MultiAsyncGet supports tuples and not vectors which is very limiting, unless I'm mistaken. Let's say 50 URLs were created during runtime and inserted into a std::vector<cpr::Url>, how would I pass the vector to the function?

The extensive usage of tuples is contingent on a number of factors:

You could however, say, "why can't I give a runtime-generated std::vector<std::tuple<cpr::Url>> containing my urls as argument to the MultiOps?", which comes to the second point:

So, to Get N cpr::Urls generated at runtime, I'd personally go this way:

int main() {
  std::vector<cpr::Url> v{getUrlsFromDatabase()};
  std::vector<cpr::AsyncResponse>resps{};
  std::transform(v.cbegin(), v.cend(), std::back_inserter(resps), cpr::GetAsync<cpr::Url>);
}

The MultiOps, async or not, require compile-time awareness of the argument types and counts, the way it is now.

PS. If you know that there are going to be exactly N Urls, or you want to divide them into batches, you can do this via MultiOps, too. I'm working on how that would look like, and will post it here when I got it. It won't look super elegant, though, I suspect.

COM8 commented 1 year ago

Concerning the functions here , these are the names and signatures used for member functions of std::future, as seen here. Changing their names would make AsyncWrapper not compatible with std::future and break compatibility with previous versions of cpr. There can be aliases to these methods, like AsyncWrapper::WaitFor( ... ) in case someone would want to follow a consistent naming pattern, I suppose? Removing these methods would cause, for example, this and many other callback tests to fail.

In this case we keep the _ naming convention for obvious reasons for functions related to std. Thanks for pointing it out!

So, this requires some explanation, which I didn't provide in comments or commit messages, and that's my bad. Basically, this aims to work as a replacement for this preprocessor branching here, which is repeated here, for example. As I understand it, the progress function gets doubles as arguments for versions prior to the one indicated, and curl_off_t aka int64_t afterwards. Using a type alias cpr_pf_arg_t we can avoid having two declarations for the function signature and improve readability, I think. Besides, I suspect that there should also be the same conditional signature here, and the error hasn't occurred because the tests don't compile with the old version of libcurl. With the type alias, this can be corrected with minimal effort, if it's indeed a problem. I'm planning to add tests for the functionality of AsyncWrapper now, and maybe something more regarding cancellation.

Aha, sounds reasonable!

rule-of-3 commented 1 year ago

The bug causing test failure has been reproduced and, I believe, fixed: The reason was that in some tests from the MultiAsyncCancelTests set, the worker thread would continue executing the request shortly after the test was finished, and thus try to access mutexes or condition variables within the the observer_fn after their destructor was called (dangling references). I am altering the tests to ensure this scenario does not occur, and will commit after ensuring everything works as intended.