What is the proper way to make a large data thread safe

envoyproxy / envoy

Cloud-native high-performance edge/middle/service proxy

https://www.envoyproxy.io

Apache License 2.0


What is the proper way to make a large data thread safe #36586

Open YvesZHI opened 1 week ago

YvesZHI commented 1 week ago

I'm trying to develop my custom http filter.

In the HTTP filter, there is a static std::map variable. When an HTTP request passes through the filter, the map is read or written. I know that Envoy's architecture has several workers (threads), so I think my HTTP filter looks like this:

workera: reqa1, reqa2, reqa3 ----\
                                 map
workerb: reqb1, reqb2, reqb3 ----/

As I understand it, each worker is like a queue: requests in one queue are processed one by one, so there is no race condition within a worker. However, two workers can race with each other. So I don't need a lock between reqa1 and reqa2, but I do need one between reqa1 and reqb1.

If I'm right about the mechanism, could someone tell me the proper way to use a lock in the source code of my HTTP filter? For now, all I can do is initialize a single mutex that every request locks and unlocks, which seems like a very bad idea.

tyxia commented 1 week ago

You can use a standard mutex etc. to guard against the race condition. For example, see https://github.com/envoyproxy/envoy/blob/main/source/extensions/common/dynamic_forward_proxy/dns_cache_impl.h#L177-L178

Out of curiosity, why do you need a static variable, though?
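
The mutex-guarded approach suggested above could be sketched as follows. This is a minimal illustration using only the C++ standard library, not code from the linked Envoy file; the class name `UserInfoStore` and its methods are hypothetical, and `uint64_t` is assumed for the key type.

```cpp
#include <cstdint>
#include <map>
#include <mutex>
#include <string>

// Hypothetical wrapper (not an Envoy API): every access to the map
// goes through one mutex, so worker threads never touch it concurrently.
class UserInfoStore {
public:
  // Copies the stored temp key for `uid` into `out`; returns false if absent.
  bool find(uint64_t uid, std::string& out) const {
    std::lock_guard<std::mutex> lock(mutex_);
    auto it = user_infos_.find(uid);
    if (it == user_infos_.end()) {
      return false;
    }
    out = it->second;
    return true;
  }

  // Inserts `key` only if `uid` has no entry yet; returns the key now stored.
  std::string findOrInsert(uint64_t uid, const std::string& key) {
    std::lock_guard<std::mutex> lock(mutex_);
    return user_infos_.emplace(uid, key).first->second;
  }

private:
  mutable std::mutex mutex_;                    // guards user_infos_
  std::map<uint64_t, std::string> user_infos_;  // uid -> temp key
};
```

Returning copies instead of references keeps callers from holding pointers into the map after the lock is released. The critical sections are deliberately short, because a worker thread blocked on a lock cannot serve its other connections in the meantime.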

YvesZHI commented 1 week ago

@tyxia It seems that the mutex in the example is just an ordinary mutex, nothing special. I'm a bit confused now.

If this is how race conditions are handled in an Envoy HTTP filter, does it mean that reqa1 and reqa2 may race too, even though the two requests are assigned to the same worker? Aren't the requests on the same worker processed one after another, so that a mutex is NOT necessary?

YvesZHI commented 1 week ago

@tyxia Why I need a static variable:

The static variable is a std::map that holds each client's uid and temp key: std::map<uint64, std::string> userInfos.

When a client sends a request to the filter, the filter looks up the client's temp key in the map and uses it for validation; if no entry is found, a new element is inserted into the map.

In short, whenever a client request arrives, the static map is read and/or written.

BTW, I know about the JWT filter and the RBAC filter, but neither of them can do my custom validation, which is why I need to develop my own HTTP filter.
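
Since the map described here is read on every request and written only when a uid is missing, one possible refinement over a single exclusive mutex is a reader-writer lock. The sketch below assumes C++17 (`std::shared_mutex`) and `uint64_t` keys; the class name `SharedUserInfos` and its methods are hypothetical and not part of Envoy.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <mutex>
#include <shared_mutex>
#include <string>

// Hypothetical read-mostly store: lookups take a shared lock and can run
// in parallel; an insert takes an exclusive lock.
class SharedUserInfos {
public:
  // Returns the temp key stored for `uid`, inserting `fresh_key` if none exists.
  std::string lookupOrInsert(uint64_t uid, const std::string& fresh_key) {
    {
      std::shared_lock<std::shared_mutex> read_lock(mutex_);
      auto it = user_infos_.find(uid);
      if (it != user_infos_.end()) {
        return it->second;  // common path: readers do not block each other
      }
    }
    std::unique_lock<std::shared_mutex> write_lock(mutex_);
    // Another thread may have inserted between the two locks; try_emplace
    // leaves an existing entry untouched and returns it.
    return user_infos_.try_emplace(uid, fresh_key).first->second;
  }

  std::size_t size() const {
    std::shared_lock<std::shared_mutex> read_lock(mutex_);
    return user_infos_.size();
  }

private:
  mutable std::shared_mutex mutex_;             // guards user_infos_
  std::map<uint64_t, std::string> user_infos_;  // uid -> temp key
};
```

Whether this actually beats a plain mutex depends on the read/write ratio and on contention, so it is worth measuring before adopting it.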

tyxia commented 1 week ago

In general, HTTP filter data doesn't need a mutex/lock. Even if it is shared between filter instances, for example via TLS (thread-local storage), it is still thread local (not shared between threads). A mutex can be needed in some cases, for example when an object is created on the main thread and shared across worker threads.

You can read more about Envoy's threading model here: https://blog.envoyproxy.io/envoy-threading-model-a8d44b922310
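
To illustrate the thread-local point, here is a small sketch that uses the C++ `thread_local` keyword as a stand-in. It is not Envoy's own thread-local storage API, and the helper names `localUserInfos` and `entriesSeenByOtherThread` are hypothetical. Each thread gets its own map, so no lock is needed, but an entry inserted on one thread is invisible to the others.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <thread>

// Hypothetical helper: returns the calling thread's private map.
std::map<uint64_t, std::string>& localUserInfos() {
  thread_local std::map<uint64_t, std::string> user_infos;
  return user_infos;
}

// Starts a second thread and reports how many entries that thread sees
// in its own copy of the map.
std::size_t entriesSeenByOtherThread() {
  std::size_t count = 0;
  std::thread other([&count] { count = localUserInfos().size(); });
  other.join();  // join() makes the write to `count` visible here
  return count;
}
```

If the calling thread first stores an entry with `localUserInfos()[42] = "temp-key";`, then `entriesSeenByOtherThread()` still returns 0. A per-thread map therefore only fits the use case above if it is acceptable for a client's temp key to be known only to the worker that first saw it; otherwise the data has to be shared and guarded.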