async callback with highest performance?

bethebest0622 commented 7 months ago

Description

I want to handle HTTP GET requests with the async callback method.

Ideally, sending all the requests should cost about the latency of a single GET request, because in the best case they would be executed in parallel.

But the test code below seems to take a serialized amount of time, equal to sending the requests one by one.

Could you help with this? How can I achieve the best performance with blocking?

Example/How to Reproduce

#include <cpr/cpr.h>
#include <sys/time.h>  // gettimeofday, timeval
#include <unistd.h>    // sleep
#include <iostream>
#include <memory>
#include <string>
using namespace cpr;
using namespace std;

const std::string url = "https://api.coinex.com/v1/market/detail?market=BTCUSDT";

// Start one asynchronous GET on a fresh session and print its latency in ms.
void execute(const std::string & u, size_t i) {
  std::shared_ptr<cpr::Session> session = std::make_shared<cpr::Session>();
  session->SetUrl(cpr::Url(u));
  timeval t;
  gettimeofday(&t, NULL);  // timestamp taken when the request is queued
  auto future_text = session->GetCallback([i, t](Response r) {
    timeval t1;
    gettimeofday(&t1, NULL);
    cout << "request #" << i << " return" << " cost " << (t1.tv_sec - t.tv_sec) * 1e3 + (t1.tv_usec - t.tv_usec) / 1e3 << endl;
  });
}

int main(int argc, char** argv) {
  for (size_t i = 0; i < 100; ++i) {
    execute(url, i);
  }
  // Keep the process alive while the callbacks run.
  while (1) sleep(1);
}

The first request returned in 145 ms, but the last took 7800 ms.

And one more try, this time with a single shared session:

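#include <cpr/cpr.h>
#include <sys/time.h>  // gettimeofday, timeval
#include <unistd.h>    // sleep
#include <iostream>
#include <memory>
#include <mutex>
#include <string>
#include <vector>
using namespace cpr;
using namespace std;

std::mutex mut_;
const std::string url = "https://api.coinex.com/v1/market/detail?market=BTCUSDT";
// One session is shared by all 100 calls. A Session wraps a single curl easy
// handle, and libcurl does not allow one handle to be used by several threads
// at the same time, so these transfers cannot safely overlap.
std::shared_ptr<cpr::Session> session = std::make_shared<cpr::Session>();
std::vector<int> lats;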
void execute(const std::string & u, size_t i) {
  timeval t;  //  t2;
  gettimeofday(&t, NULL);
  // auto future_text = session->GetCallback([i, t](Response r) {
  session->GetCallback([i, t](Response r) {
    std::lock_guard<std::mutex> lg(mut_);
    timeval t1; 
    gettimeofday(&t1, NULL);
    cout << "request #" << i << " return with" << r.text << " cost " << (t1.tv_sec - t.tv_sec) * 1e3 + (t1.tv_usec - t.tv_usec) / 1e3 << endl;
    lats[i] = (t1.tv_sec - t.tv_sec) * 1e3 + (t1.tv_usec - t.tv_usec) / 1e3;
  }); 
}

int main(int argc, char** argv) {
  lats.resize(100);
  session->SetUrl(cpr::Url(url));
  for (size_t i = 0; i < 100; ++i) execute(url, i); 
  sleep(10);
  for (size_t i = 0; i < 100; ++i) {
    cout << lats[i] << endl;
  }
}

Possible Fix

The right interface function to call, or the correct settings to use.

Where did you get it from?

GitHub (branch e.g. master)

Additional Context/Your Environment

OS: Centos Stream 8
Version: libcurl 8.2.1
gcc 13

carlwang99 commented 4 months ago

I looked at the CPR code and found that it uses a thread pool and poll implementation, which seems to occupy one thread per request. The thread pool has a minimum thread count of 1 and a maximum thread count equal to the number of CPU cores. If the CPU has 'n' cores, then the requests appear to be executed in groups of 'n'.
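
To make the "groups of n" behaviour concrete, here is a small back-of-the-envelope model. It is not cpr code: `wave_of` and `modelled_completion_ms` are hypothetical helpers, and the model assumes every request takes the same latency, all requests are queued at the same moment, and each worker thread is blocked by exactly one request at a time.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical model, not part of cpr: `workers` threads each handle one
// blocking request at a time, and every request takes the same latency.

// 1-based "wave" in which the request with 0-based index `index` runs.
std::size_t wave_of(std::size_t index, std::size_t workers) {
    assert(workers > 0);
    return index / workers + 1;
}

// Modelled time in ms from queueing to completion, assuming all requests
// are queued at t = 0 and are picked up in submission order.
double modelled_completion_ms(std::size_t index, std::size_t workers,
                              double latency_ms) {
    return static_cast<double>(wave_of(index, workers)) * latency_ms;
}
```

With the 145 ms latency reported above and 100 requests, `modelled_completion_ms(99, 8, 145.0)` gives 1885 ms for the last request on 8 workers, `modelled_completion_ms(99, 2, 145.0)` gives 7250 ms on 2 workers, and only with 100 workers does the last request finish after a single latency of 145 ms. The core count of the reporter's machine is not stated, so this only illustrates why the total time grows with the number of waves rather than staying at one request latency.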