Open bjosv opened 1 year ago
Thanks for raising the issue. In general, it would be nice to avoid native platform-specific API calls within the SDK, but feel free to raise a PR to discuss further. Another option could be to optionally pass a thread-factory handle through BatchSpanProcessorOptions and let it create the thread with the required affinity.
Probably I was not very clear about the thread-factory/thread-pool approach. In its simplest form, for illustration, it could be something like the code below; the application can then pass a custom thread-pool implementation by inheriting from the ThreadPool class. In the real world it would be a bit more complex, as we may have to manage synchronization, results from threads, etc.
```cpp
#include <functional>
#include <memory>
#include <thread>
#include <vector>

class ThreadPool
{
public:
  // Virtual destructor so derived pools are destroyed correctly
  // through a ThreadPool pointer.
  virtual ~ThreadPool() = default;
  virtual void Schedule(std::function<void(void)> func) = 0;
};

class DefaultThreadPool : public ThreadPool
{
public:
  void Schedule(std::function<void(void)> func) override
  {
    threads.push_back(std::thread(std::move(func)));
  }

  ~DefaultThreadPool() override
  {
    for (auto &thread : threads)
    {
      if (thread.joinable())
      {
        thread.join();
      }
    }
  }

private:
  std::vector<std::thread> threads;
};

struct BatchSpanProcessorOptions
{
  std::unique_ptr<ThreadPool> pool = std::unique_ptr<DefaultThreadPool>(new DefaultThreadPool);
};

class BatchSpanProcessor
{
public:
  BatchSpanProcessor(const BatchSpanProcessorOptions &options)
  {
    options.pool->Schedule([this]() { this->DoBackgroundWork(); });
  }

  void DoBackgroundWork() {}
};
```
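With that interface, the application could plug in its own pool that pins scheduled threads, which would cover the affinity use case from this issue. A minimal Linux-only sketch (`PinnedThreadPool` is a hypothetical name; a real implementation would likely accept a full CPU mask and check return values):

```cpp
#include <functional>
#include <pthread.h>  // pthread_setaffinity_np (glibc, _GNU_SOURCE)
#include <sched.h>    // cpu_set_t, CPU_ZERO, CPU_SET
#include <thread>
#include <vector>

// Same interface as the sketch above.
class ThreadPool
{
public:
  virtual ~ThreadPool() = default;
  virtual void Schedule(std::function<void(void)> func) = 0;
};

// Hypothetical pool that pins every scheduled thread to one CPU.
class PinnedThreadPool : public ThreadPool
{
public:
  explicit PinnedThreadPool(int cpu) : cpu_(cpu) {}

  void Schedule(std::function<void(void)> func) override
  {
    threads_.push_back(std::thread(std::move(func)));
    // Pin the freshly started thread. A production version would
    // check the return value and support a full cpu_set_t mask.
    cpu_set_t mask;
    CPU_ZERO(&mask);
    CPU_SET(cpu_, &mask);
    pthread_setaffinity_np(threads_.back().native_handle(), sizeof(mask), &mask);
  }

  ~PinnedThreadPool() override
  {
    for (auto &thread : threads_)
    {
      if (thread.joinable())
      {
        thread.join();
      }
    }
  }

private:
  int cpu_;
  std::vector<std::thread> threads_;
};
```

The application would then set e.g. `options.pool = std::unique_ptr<ThreadPool>(new PinnedThreadPool(0));`, and the batch processor itself stays free of platform-specific API calls.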
This is a more generalized solution that supports applications which have their own thread pool, while also addressing the current issue. But it would still be good to have a draft PR with the affinity changes in the current implementation, to discuss and compare further.
Thanks for the example. This seems like a much better approach than what I had in mind. I will do some experiments with it and hopefully prepare a draft PR for further discussion.
This issue was marked as stale due to lack of activity.
This needs to be supported.
Is your feature request related to a problem?
We would like to use the batch processor in a system where other system parts require low latency and high throughput. Today this is obtained by running the processes in a strict fashion and pinning threads to CPUs. When using the batch processor, it starts a worker thread which runs on any CPU, which gives degraded performance.

Describe the solution you'd like
We would preferably like to use the provided batch processor. One proposal is to add a CPU-affinity-mask option to BatchSpanProcessorOptions. When this option is configured, we get the native handle via std::thread::native_handle on worker_thread_ and then set the mask via pthread_setaffinity_np(). This could possibly be a Linux-only feature, but Windows has something similar via SetProcessAffinityMask().

Describe alternatives you've considered
SimpleSpanProcessor without a thread, but we believe it might block. Async IO seems not yet available.

Would a PR for a feature like this possibly be accepted? Are there other options that I haven't yet considered?