kubernetes-client / c

Official C client library for Kubernetes
Apache License 2.0
146 stars 46 forks source link

Considering multi-thread environment #34

Closed clearday4 closed 3 years ago

clearday4 commented 3 years ago

curl_global_cleanup() and curl_global_init() can crash the CURL handle of other threads when making a API request from multi-thread respectively. In the multi-thread environment, you need to do curl_easy_cleanup() instead of curl_global_cleanup(), and curl_global_init() also should be done only the first time. curl_easy_init() will also execute if curl_global_init() was not done, so curl_global_init() seems not to have to be done. please consider about it.

ityuhui commented 3 years ago

Thank you for your catching and suggestion. I will take a look.

ityuhui commented 3 years ago

I think libcurl does not suggest using curl_easy_init to call curl_global_init in multi-thread environment.

In https://curl.haxx.se/libcurl/c/curl_easy_init.html

If you did not already call curl_global_init, curl_easy_init does it automatically. This may be lethal in multi-threaded cases, since curl_global_init is not thread-safe, and it may result in resource problems because there is no corresponding cleanup. You are strongly advised to not allow this automatic behaviour, by calling curl_global_init yourself properly.

ityuhui commented 3 years ago

In https://curl.haxx.se/libcurl/c/libcurl.html

libcurl requires curl_global_init and curl_global_cleanup must be called when no other thread in the program is running. (even though the thread does not use libcurl)

In kubernetes-client/c

curl_global_init is called by apiClient_create_with_base_path curl_global_cleanup is called by apiClient_free curl_easy_init and curl_easy_cleanup are called by apiClient_invoke

So user's code should call apiClient_create_with_base_path and apiClient_free in the main thread of program.

brendandburns commented 3 years ago

Perhaps we should add a static boolean check so that those functions only get called once?

ityuhui commented 3 years ago

Hi @brendandburns

The 2 functions apiClient_create_with_base_path(callcurl_global_init) and apiClient_free(callcurl_global_cleanu) should not get called when there are more than 2 threads in the program. This cannot be guarded by a static boolean check.

apiClient_invoke/curl_easy_init/curl_easy_cleanup executes without such restriction.

User can create ( and delete ) the apiClient in main thread and then create worker threads, pass the pointer of apiClient to the worker threads to operate kubernetes resources in multi-thread program.

clearday4 commented 3 years ago

I think libcurl does not suggest using curl_easy_init to call curl_global_init in multi-thread environment.

In https://curl.haxx.se/libcurl/c/curl_easy_init.html

If you did not already call curl_global_init, curl_easy_init does it automatically. This may be lethal in multi-threaded cases, since curl_global_init is not thread-safe, and it may result in resource problems because there is no corresponding cleanup. You are strongly advised to not allow this automatic behaviour, by calling curl_global_init yourself properly.

Yes. but anyway, libcurl does a boolean check if curl_global_init was done or not.

200 static CURLcode global_init(long flags, bool memoryfuncs) 201 { 202 if(initialized++) 203 return CURLE_OK;

343 struct Curl_easy curl_easy_init(void) 344 { 345 CURLcode result; 346 struct Curl_easy data; 347 348 / Make sure we inited the global SSL stuff / 349 if(!initialized) { 350 result = curl_global_init(CURL_GLOBAL_DEFAULT); 351 if(result) { 352 / something in the global init failed, return nothing / 353 DEBUGF(fprintf(stderr, "Error: curl_global_init failed\n")); 354 return NULL; 355 } 356 } .....

clearday4 commented 3 years ago

Hi @brendandburns

The 2 functions apiClient_create_with_base_path(callcurl_global_init) and apiClient_free(callcurl_global_cleanu) should not get called when there are more than 2 threads in the program. This cannot be guarded by a static boolean check.

apiClient_invoke/curl_easy_init/curl_easy_cleanup executes without such restriction.

User can create ( and delete ) the apiClient in main thread and then create worker threads, pass the pointer of apiClient to the worker threads to operate kubernetes resources in multi-thread program.

I think curl_global_init () does a boolean check on its own. so this part seems to be okay. The problem is that curl crash is definitely happening in multi-thread when calling curl_global_cleanup. Therefore, in my opinion, you should put CURL * handle in apiClient so that apiClient_free can call curl_easy_cleanup. and create another shutdown function then call curl_global_cleanup.

ityuhui commented 3 years ago

Does the below workflow not meet your requirement ?

Step 1: When program starts up, calls apiClient_create_with_base_path to create an apiClient_t client Step 2: Create one or more worker threads, pass the apiClient_t client to them for apiClient_invoke to use, if you have more than 2 worker threads, you may need a lock/mutex to control the access of apiClient_t client Step 3: After all worker threads end, main thread calls apiClient_free to free up apiClient_t client

clearday4 commented 3 years ago

Does the below workflow not meet your requirement ?

Step 1: When program starts up, calls apiClient_create_with_base_path to create an apiClient_t client Step 2: Create one or more worker threads, pass the apiClient_t client to them for apiClient_invoke to use, if you have more than 2 worker threads, you may need a lock/mutex to control the access of apiClient_t client Step 3: After all worker threads end, main thread calls apiClient_free to free up apiClient_t client

I understood that your suggestion is to share the apiClient structure in multiple API Requests. However, considering the watch function, it seems better to allocate the apiClient structure to each API Request respectively. (Considering the case that some applications watch the service and pod respectively and at the same time query the hpa) Which is better, sharing the apiClient between threads or allocating one apiClient for one API Requests?

so I think Step 2: Create one or more worker threads, allocate the apiClient_t client to them for apiClient_invoke to use Step 3: After all worker threads end, main thread calls apiClient_free(curl_easy_cleanup) to free up apiClient_t client Step 4: After application finish to use client library, call apiClient_shutdown(curl_global_cleanup) to cleanup the resources

ityuhui commented 3 years ago

Your consideration is right.

The only one apiClient/CURL handler cannot support a thread watch and the other thread operates k8s resource at the same time because the apiClient may be locked.

but curl_easy_cleanup is not needed to call by apiClient_free, because it is called by apiClient_invoke

I suggest adding 2 functions:

void apiClient_setupGlobalEnv()
{
    curl_global_init(CURL_GLOBAL_ALL);
}

void apiClient_unsetupGlobalEnv()
{
    curl_global_cleanup();
}

and remove curl_global_init(CURL_GLOBAL_ALL) from apiClient_create_with_base_path remove curl_global_cleanup(); from apiClient_free

The new workflow is:

apiClient_setupGlobalEnv (must be called in main thread without other thread ) apiClient_create_with_base_path (can be called in main thread or worker thread) (optional)apiClient_create_with_base_path (If we want another api client) ... apiClient_free ( same with apiClient_create_with_base_path) (optional)apiClient_free ( free the other api client) apiClient_unsetupGlobalEnv (must be called in main thread without other thread)

clearday4 commented 3 years ago

Yes, it exactly meets my requirements. Thank you for your investigation.