minio / minio-cpp

MinIO C++ Client SDK for Amazon S3 Compatible Cloud Storage
https://minio-cpp.min.io/
Apache License 2.0
136 stars 56 forks source link

How do I set the timeout for minio-cpp? #125

Closed adahiro closed 7 months ago

adahiro commented 7 months ago

When I use the UploadObject(args) function provided by minio-cpp to upload files, if there is a network failure, the program will be blocked and there is no opportunity to exit and correct the error. How do I set the timeout for minio-cpp? It seems that there are no settings related to timeout and network issues in minio-cpp.

balamurugana commented 7 months ago

As minio-cpp uses libcurl which doesn't provide a proper timeout mechanism like individual read/write timeouts, it is recommended to use progressfunc to implement timeout as per your requirement.

You could also enable Debug to trace what's going on in the network for know exact issue.

kobalicek commented 7 months ago

@balamurugana

Would it make sense to incorporate this functionality?

https://curl.se/libcurl/c/CURLOPT_TIMEOUT.html

In libcurlpp this is implemented via Timeout trait:

typedef curlpp::OptionTrait<long, CURLOPT_TIMEOUT> Timeout;

I believe that CURLOPT_TIMEOUT would allow us to have the timeout supported. Any objections?

balamurugana commented 7 months ago

@kobalicek CURLOPT_TIMEOUT is overall timeout for the entire HTTP request. Practically it won't work.

Demonslay335 commented 7 months ago

@balamurugana

As minio-cpp uses libcurl which doesn't provide a proper timeout mechanism like individual read/write timeouts, it is recommended to use progressfunc to implement timeout as per your requirement.

Could you give an example on cancelling the upload from that callback? I actually see two issues:

  1. Client::UploadObject never transfers the UploadObjectArgs::processfunc into the PutObjectArgs : https://github.com/minio/minio-cpp/blob/0eb2a19df1e07a2b5c7f04e0c5c6049566fc2a12/src/client.cc#L782-L796

  2. When processfunc would be actually called further down the chain, it has no return or mechanism to override the CURL_PROGRESSFUNC_CONTINUE that is returned to curlpp: https://github.com/minio/minio-cpp/blob/0eb2a19df1e07a2b5c7f04e0c5c6049566fc2a12/src/http.cc#L405-L420

Assuming 1 is fixed, are we supposed to just throw from the callback? Or does 2 also need to be addressed to make the callback return/influence the constant?

balamurugana commented 7 months ago

@Demonslay335 (1) and (2) are bugs and will be fixed by https://github.com/minio/minio-cpp/pull/128

Once https://github.com/minio/minio-cpp/pull/128 is merged, you could return false to abort the upload/download operation.

adahiro commented 7 months ago

`UploadObjectResponse Client::UploadObject(UploadObjectArgs args) { if (error::Error err = args.Validate()) { return UploadObjectResponse(err); }

std::ifstream file; file.exceptions(std::ifstream::failbit | std::ifstream::badbit); try { file.open(args.filename); } catch (std::system_error& err) { return error::make( "unable to open file " + args.filename + "; " + err.code().message()); }

PutObjectArgs po_args(file, args.object_size, 0); po_args.extra_headers = std::move(args.extra_headers); po_args.extra_query_params = std::move(args.extra_query_params); po_args.bucket = std::move(args.bucket); po_args.region = std::move(args.region); po_args.object = std::move(args.object); po_args.headers = std::move(args.headers); po_args.user_metadata = std::move(args.user_metadata); po_args.sse = std::move(args.sse); po_args.tags = std::move(args.tags); po_args.retention = std::move(args.retention); po_args.legal_hold = std::move(args.legal_hold); po_args.content_type = std::move(args.content_type); po_args.progressfunc = std::move(args.progressfunc);

PutObjectResponse resp = PutObject(std::move(po_args)); file.close(); return UploadObjectResponse(resp); }` Hi, it seems that this function is also missing the line po_args.progress_userdata = std::move(args.progress_userdata);. There is an error when calling it.

adahiro commented 7 months ago

Based on my testing, the progressfunc is not called when it's unable to connect to the MinIO server.It's possible that I still don't understand how to use minio-cpp correctly. I tested with the commit 0C3F92E, but it doesn't seem to solve the problem. In the MinIO Java client, after a timeout, we can catch an exception, which allows us to cache the file to be uploaded and handle subsequent business and fault processing, such as delayed sending. However, in minio-cpp, if the network is unstable and we lose connection to the MinIO server, the program will be in a blocked state, and progressfunc will not be called. The client will stop working, and we don't have a convenient way to handle this kind of network exception. Under these circumstances, minio-cpp is not usable unless we can guarantee 100% that the network between the client and server is functioning normally. In other words, an anomaly in the network may cause all clients using minio-cpp to fail and stop working.

kobalicek commented 7 months ago

I think this needs changes of the implementation - using multiple-handles API instead of the easy API that is currently used. I would be happier if we used curl directly without the C++ layer that just adds a lot of overhead anyway.

I'm relatively new to this project so I cannot promise I can resolve this issue quickly, but I understand the problem and I think that blocking to infinity is just not a good design and that it should be fixed.

balamurugana commented 7 months ago

Based on my testing, the progressfunc is not called when it's unable to connect to the MinIO server.It's possible that I still don't understand how to use minio-cpp correctly. I tested with the commit 0C3F92E, but it doesn't seem to solve the problem. In the MinIO Java client, after a timeout, we can catch an exception, which allows us to cache the file to be uploaded and handle subsequent business and fault processing, such as delayed sending. However, in minio-cpp, if the network is unstable and we lose connection to the MinIO server, the program will be in a blocked state, and progressfunc will not be called. The client will stop working, and we don't have a convenient way to handle this kind of network exception. Under these circumstances, minio-cpp is not usable unless we can guarantee 100% that the network between the client and server is functioning normally. In other words, an anomaly in the network may cause all clients using minio-cpp to fail and stop working.

@adahiro I couldn't understand what kind of network issue you have between client and server. Please enable Debug and share the output.

kobalicek commented 7 months ago

I understand the problem - progress func is for monitoring progress, but if there is no progress it won't be called. So if you are stuck, waiting for bytes that are not coming, the thread is blocked for a long time / infinity and your progress func is never actually called.

balamurugana commented 7 months ago

I understand the problem - progress func is for monitoring progress, but if there is no progress it won't be called. So if you are stuck, waiting for bytes that are not coming, the thread is blocked for a long time / infinity and your progress func is never actually called.

@kobalicek Have you checked this by enabling Debug and observed the behavior? As per https://curl.se/libcurl/c/CURLOPT_TIMEOUT.html, progress function is recommended approach to have custom timeout logic.

Demonslay335 commented 7 months ago

As per https://curl.se/libcurl/c/CURLOPT_TIMEOUT.html, progress function is recommended approach to have custom timeout logic.

The issue described does make sense, given this text for the option.

While data is being transferred it is invoked frequently, and during slow periods like when nothing is being transferred it can slow down to about one call per second.

In that same vein, it's actually been deprecated as of 7.32.0, but I'm not sure which version this project actually gets linked against. Although, we all know curl never really kills features, so it's maybe a moot point. 😅 The docs for the "new" CURLOPT_XFERINFOFUNCTION are just copy/pasted from CURLOPT_PROGRESSFUNCTION anyways, so it likely is subject to the same problem.

Perhaps also exposing CURLOPT_TIMEOUT and the like would still help in conjunction to at least avoid running to infinity, which I believe is the worst-case scenario a user would want to avoid.

balamurugana commented 7 months ago

@Demonslay335 Firstly I would need to know the exact problem by investigating Debug output. The issue you are facing is mostly unrelated to the timeout.

adahiro commented 7 months ago

@Demonslay335 Firstly I would need to know the exact problem by investigating output. The issue you are facing is mostly unrelated to the timeout.Debug

This issue is relatively easy to reproduce:

Write a client test case code, set the server address and related parameters. Here we only write something related to the problem, for example: args.progressfunc = [](minio::http::ProgressFunctionArgs args) -> bool{// print something } minio::s3::UploadObjectResponse resp = client.UploadObject(args);

Start the minio server and run the above code; the file uploads successfully.

Shut down the minio server or disable the network on the machine where the client code is running (for example, by unplugging the network cable to simulate an unstable network situation). The program will be perpetually blocked at the line “client.UploadObject(args)”, and args.progressfunc will not be called, resulting in no output.

In actual tests, “args.progressfunc” is only called when the client successfully connects to the minio server.

In real-world network environments, especially with multiple clients, I cannot guarantee that my network will always be healthy. If there is a network anomaly during file upload and it is not possible to connect to the minio server, client.UploadObject(args); and other code will block my thread and will not return any information, leaving me no way to correct the issue. The client's business will become inoperable unless I restart the process.

adahiro commented 7 months ago

Additionally, I have observed that only a limited number of methods, such as "get_object", "put_object", "upload_object", and so on, support "args.progressfunc". Methods like "list_buckets" may still cause thread blocking. My understanding is that args.progressfunc is designed to track upload and download progress. Currently, minio-cpp does not offer a way to handle network exceptions, which means using minio-cpp assumes that the network and the minio server are always 100% operational. Unlike minio-cpp, minio-go and minio-java can detect when the network is down or the server is not running, and they do not block under such circumstances.

balamurugana commented 7 months ago

I tried to reproduce the issue but minio-cpp worked as expected

#include <miniocpp/client.h>

int main() {
  // Create S3 base URL.
  minio::s3::BaseUrl base_url("192.168.124.140:9000");
  base_url.https = false;

  // Create credential provider.
  minio::creds::StaticProvider provider("minioadmin", "minioadmin");

  // Create S3 client.
  minio::s3::Client client(base_url, &provider);
  client.Debug(true);

  // Create upload object arguments.
  minio::s3::UploadObjectArgs args;
  args.bucket = "my-bucket";
  args.object = "my-object";
  args.filename = "my-object.csv";

  // Call upload object.
  minio::s3::UploadObjectResponse resp = client.UploadObject(args);

  // Handle response.
  if (resp) {
    std::cout << "my-object.csv is successfully uploaded to my-object"
              << std::endl;
  } else {
    std::cout << "unable to upload object; " << resp.Error().String()
              << std::endl;
  }

  return 0;
}
  1. When server and client network are in up state

[root@master minio-cpp]# ip addr show eth0 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000 link/ether 52:54:00:de:e4:bb brd ff:ff:ff:ff:ff:ff altname enp1s0 inet 192.168.124.116/24 brd 192.168.124.255 scope global noprefixroute eth0 valid_lft forever preferred_lft forever

[root@master minio-cpp]# ./build/UploadObject

  1. When client network is down
    
    [root@master minio-cpp]# ip addr show eth0
    2: eth0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc fq_codel state DOWN group default qlen 1000
    link/ether 52:54:00:de:e4:bb brd ff:ff:ff:ff:ff:ff
    altname enp1s0
    inet 192.168.124.116/24 brd 192.168.124.255 scope global noprefixroute eth0
       valid_lft forever preferred_lft forever

[root@master minio-cpp]# ./build/UploadObject

  1. When server network is down, but client network is up
    
    [root@master minio-cpp]# ip addr show eth0
    2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 52:54:00:de:e4:bb brd ff:ff:ff:ff:ff:ff
    altname enp1s0
    inet 192.168.124.116/24 brd 192.168.124.255 scope global noprefixroute eth0
       valid_lft forever preferred_lft forever

[root@master minio-cpp]# ./build/UploadObject

I am closing this issue now. Please reopen with appropriate information along with Debug output.

fwosar commented 7 months ago

The problem with your test is that you created scenarios where the failure is controlled (no route to the host). A better test would be to drop the packets when trying to establish the connection, which is what would happen if a firewall blocked the connection. Then you run into the aforementioned timeout issues @adahiro mentioned.

fwosar commented 7 months ago

Just checked out the libcurl source code and unless you specify a CURLOPT_CONNECTTIMEOUT value, it will default to 300 seconds. That means, trying to connect to a server that has a firewall interfering with the connection will block your application for 5 minutes. That's pretty bad. I would suggest setting a much lower value by default (15 to 30 seconds) or exposing this setting to the consumer of the library. Keep in mind that CURLOPT_CONNECTTIMEOUT only applies to the protocol negotiation phase, which is usually the TCP/IP handshake as well as any transport encryption negotiation. It does not apply to the actual transfer of data once the connection is negotiated.

lxily commented 4 months ago

I tried to reproduce the issue but minio-cpp worked as expected

#include <miniocpp/client.h>

int main() {
  // Create S3 base URL.
  minio::s3::BaseUrl base_url("192.168.124.140:9000");
  base_url.https = false;

  // Create credential provider.
  minio::creds::StaticProvider provider("minioadmin", "minioadmin");

  // Create S3 client.
  minio::s3::Client client(base_url, &provider);
  client.Debug(true);

  // Create upload object arguments.
  minio::s3::UploadObjectArgs args;
  args.bucket = "my-bucket";
  args.object = "my-object";
  args.filename = "my-object.csv";

  // Call upload object.
  minio::s3::UploadObjectResponse resp = client.UploadObject(args);

  // Handle response.
  if (resp) {
    std::cout << "my-object.csv is successfully uploaded to my-object"
              << std::endl;
  } else {
    std::cout << "unable to upload object; " << resp.Error().String()
              << std::endl;
  }

  return 0;
}
  1. When server and client network are in up state

[root@master minio-cpp]# ip addr show eth0
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 52:54:00:de:e4:bb brd ff:ff:ff:ff:ff:ff
    altname enp1s0
    inet 192.168.124.116/24 brd 192.168.124.255 scope global noprefixroute eth0
       valid_lft forever preferred_lft forever

[root@master minio-cpp]# ./build/UploadObject 
* !!! WARNING !!!
* This is a debug build of libcurl, do not use in production.
* STATE: INIT => CONNECT handle 0x2b057d8; line 1948
* Added connection 0. The cache now contains 1 members
* STATE: CONNECT => CONNECTING handle 0x2b057d8; line 2001
*   Trying 192.168.124.140:9000...
* Connected to 192.168.124.140 (192.168.124.140) port 9000
* STATE: CONNECTING => PROTOCONNECT handle 0x2b057d8; line 2109
* STATE: PROTOCONNECT => DO handle 0x2b057d8; line 2139
> GET /my-bucket?location= HTTP/1.1
Host: 192.168.124.140:9000
Accept: */*
Authorization: AWS4-HMAC-SHA256 Credential=minioadmin/20240402/us-east-1/s3/aws4_request, SignedHeaders=host;x-amz-content-sha256;x-amz-date, Signature=28e6e205daf4fe668126427d6a3f5f8d132edb747584ede3a40ee6c1d8c33fe8
User-Agent: MinIO (linux; x86_64) minio-cpp/MINIO_CPP_MAJOR_VERSION.MINIO_CPP_MINOR_VERSION.MINIO_CPP_PATCH_VERSION
x-amz-content-sha256: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
x-amz-date: 20240402T132940Z

* STATE: DO => DID handle 0x2b057d8; line 2233
* STATE: DID => PERFORMING handle 0x2b057d8; line 2351
* HTTP 1.1 or later with persistent connection
< HTTP/1.1 200 OK
< Accept-Ranges: bytes
< Content-Length: 128
< Content-Type: application/xml
< Server: MinIO
< Strict-Transport-Security: max-age=31536000; includeSubDomains
< Vary: Origin
< Vary: Accept-Encoding
< X-Amz-Id-2: dd9025bab4ad464b049177c95eb6ebf374d3b3fd1af9251148b658df7ac2e3e8
< X-Amz-Request-Id: 17C279AA3E998546
< X-Content-Type-Options: nosniff
< X-Xss-Protection: 1; mode=block
< Date: Tue, 02 Apr 2024 13:29:40 GMT
< 
* STATE: PERFORMING => DONE handle 0x2b057d8; line 2550
* multi_done[DONE]: status: 0 prem: 0 done: 0
* Connection #0 to host 192.168.124.140 left intact
* Expire cleared
* !!! WARNING !!!
* This is a debug build of libcurl, do not use in production.
* STATE: INIT => CONNECT handle 0x2b06f28; line 1948
* Added connection 0. The cache now contains 1 members
* STATE: CONNECT => CONNECTING handle 0x2b06f28; line 2001
*   Trying 192.168.124.140:9000...
* Connected to 192.168.124.140 (192.168.124.140) port 9000
* STATE: CONNECTING => PROTOCONNECT handle 0x2b06f28; line 2109
* STATE: PROTOCONNECT => DO handle 0x2b06f28; line 2139
> PUT /my-bucket/my-object HTTP/1.1
Host: 192.168.124.140:9000
Accept: */*
Authorization: AWS4-HMAC-SHA256 Credential=minioadmin/20240402/us-east-1/s3/aws4_request, SignedHeaders=content-length;content-type;host;x-amz-content-sha256;x-amz-date, Signature=322ba18641029be53b11fc0443760330f99cf4f1274fc801ef135d08dc8062cc
Content-Length: 15
Content-Type: application/octet-stream
User-Agent: MinIO (linux; x86_64) minio-cpp/MINIO_CPP_MAJOR_VERSION.MINIO_CPP_MINOR_VERSION.MINIO_CPP_PATCH_VERSION
x-amz-content-sha256: db8ae25f84f1b0b384c08289997b778745b1dda0126d42d35aed9fc07a00fa5b
x-amz-date: 20240402T132940Z

* STATE: DO => DID handle 0x2b06f28; line 2233
* STATE: DID => PERFORMING handle 0x2b06f28; line 2351
* We are completely uploaded and fine
* HTTP 1.1 or later with persistent connection
< HTTP/1.1 200 OK
< Accept-Ranges: bytes
< Content-Length: 0
< ETag: "21a8156d2669225f97395dfc226cb10b"
< Server: MinIO
< Strict-Transport-Security: max-age=31536000; includeSubDomains
< Vary: Origin
< Vary: Accept-Encoding
< X-Amz-Id-2: dd9025bab4ad464b049177c95eb6ebf374d3b3fd1af9251148b658df7ac2e3e8
< X-Amz-Request-Id: 17C279AA3EAE902F
< X-Content-Type-Options: nosniff
< X-Xss-Protection: 1; mode=block
< Date: Tue, 02 Apr 2024 13:29:40 GMT
< 
* STATE: PERFORMING => DONE handle 0x2b06f28; line 2550
* multi_done[DONE]: status: 0 prem: 0 done: 0
* Connection #0 to host 192.168.124.140 left intact
* Expire cleared
my-object.csv is successfully uploaded to my-object
  1. When client network is down
[root@master minio-cpp]# ip addr show eth0
2: eth0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc fq_codel state DOWN group default qlen 1000
    link/ether 52:54:00:de:e4:bb brd ff:ff:ff:ff:ff:ff
    altname enp1s0
    inet 192.168.124.116/24 brd 192.168.124.255 scope global noprefixroute eth0
       valid_lft forever preferred_lft forever

[root@master minio-cpp]# ./build/UploadObject 
* !!! WARNING !!!
* This is a debug build of libcurl, do not use in production.
* STATE: INIT => CONNECT handle 0x158b7d8; line 1948
* Added connection 0. The cache now contains 1 members
* STATE: CONNECT => CONNECTING handle 0x158b7d8; line 2001
*   Trying 192.168.124.140:9000...
* connect to 192.168.124.140 port 9000 failed: No route to host
* Failed to connect to 192.168.124.140 port 9000 after 3076 ms: Couldn't connect to server
* multi_done[CONNECTING]: status: 7 prem: 1 done: 0
* multi_done, not reusing connection=0, forbid=0, close=0, premature=1, conn_multiplex=0
* The cache now contains 0 members
* Curl_disconnect(conn #0, dead=1)
* Closing connection
* Expire cleared
unable to upload object; server failed with HTTP status code 0
  1. When server network is down, but client network is up
[root@master minio-cpp]# ip addr show eth0
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 52:54:00:de:e4:bb brd ff:ff:ff:ff:ff:ff
    altname enp1s0
    inet 192.168.124.116/24 brd 192.168.124.255 scope global noprefixroute eth0
       valid_lft forever preferred_lft forever

[root@master minio-cpp]# ./build/UploadObject 
* !!! WARNING !!!
* This is a debug build of libcurl, do not use in production.
* STATE: INIT => CONNECT handle 0x1f1c7d8; line 1948
* Added connection 0. The cache now contains 1 members
* STATE: CONNECT => CONNECTING handle 0x1f1c7d8; line 2001
*   Trying 192.168.124.140:9000...
* connect to 192.168.124.140 port 9000 failed: No route to host
* Failed to connect to 192.168.124.140 port 9000 after 3096 ms: Couldn't connect to server
* multi_done[CONNECTING]: status: 7 prem: 1 done: 0
* multi_done, not reusing connection=0, forbid=0, close=0, premature=1, conn_multiplex=0
* The cache now contains 0 members
* Curl_disconnect(conn #0, dead=1)
* Closing connection
* Expire cleared
unable to upload object; server failed with HTTP status code 0
[root@master minio-cpp]# 

I am closing this issue now. Please reopen with appropriate information along with Debug output.

Hello, I just copied your code and ran it on Windows, keeping the same URL of "192.168.124.140:9000" (of course, I didn't run the MinIo server and couldn't access it), and then the information and Debug output was only one line "* trying 192.168.124.140:9000..." and waiting forever for blocking. Could this be because of the difference between Windows and Linux?

balamurugana commented 4 months ago

@lxily Maybe. Please try curl tool or use libcurl library with demo code.