alibaba / PhotonLibOS

Probably the fastest coroutine lib in the world!
https://PhotonLibOS.github.io
Apache License 2.0

Thread-per-core Architecture #382

Open kosav7r opened 5 months ago

kosav7r commented 5 months ago

Hi folks,

I'm in the process of building a storage system and evaluating PhotonLib. Some architectural decisions on the system:

  1. Thread per core, no context switching is desired
  2. Share nothing; each thread allocates its own resources, including memory and network (sockets). The goal is to eliminate synchronization and improve cache efficiency, which is very important.
  3. Interrupt Affinity

Do you have any recommendations on thread-to-thread communication without sharing any memory?

The coroutine-based approach is new to me. Can I achieve my basic needs with PhotonLib? If so, are there any examples?

Thanks!

kosav7r commented 5 months ago

Kindly pinging :)

beef9999 commented 5 months ago

Yes, Photon coroutines can satisfy your requirements. Every resource is located in a single thread and shared among coroutines. You can read the documents for more details.

lihuiba commented 5 months ago

A Linux thread is referred to as a vCPU in Photon. Each vCPU has a dedicated scheduler for coroutines (Photon's threads) and a dedicated instance of an event engine (e.g. epoll or io_uring). Their execution is basically independent of each other, unless you conduct inter-vCPU task coordination or migration.

lihuiba commented 5 months ago

You can realize interrupt affinity the same way you would for other applications, e.g. by pinning the interrupt handler and the corresponding Photon vCPU (Linux thread) to the same physical CPU core.
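The thread-pinning half of that advice can be sketched as follows. This is a minimal illustration assuming Linux and glibc, not Photon code: `allowed_cores` and `pin_current_thread_to_core` are hypothetical helper names, and only the standard `sched_getaffinity` / `sched_setaffinity` calls are used.

```cpp
#include <sched.h>
#include <vector>

// Hypothetical helper: list the cores the calling thread may currently run on.
std::vector<int> allowed_cores() {
    cpu_set_t set;
    CPU_ZERO(&set);
    std::vector<int> cores;
    // pid 0 means "the calling thread".
    if (sched_getaffinity(0, sizeof(set), &set) != 0) return cores;
    for (int i = 0; i < CPU_SETSIZE; ++i) {
        if (CPU_ISSET(i, &set)) cores.push_back(i);
    }
    return cores;
}

// Hypothetical helper: restrict the calling thread to a single core.
// Returns true on success.
bool pin_current_thread_to_core(int core_id) {
    if (core_id < 0 || core_id >= CPU_SETSIZE) return false;
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(core_id, &set);
    return sched_setaffinity(0, sizeof(set), &set) == 0;
}
```

Each vCPU thread would call `pin_current_thread_to_core` once at startup. The interrupt side is configured outside the program, typically by writing a CPU mask to `/proc/irq/<irq>/smp_affinity` for the relevant NIC or SSD interrupt.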

kosav7r commented 5 months ago

What would you recommend for separating CPU-bound and IO-bound jobs in the programming model? As far as I know, coroutines are for IO-bound jobs.

beef9999 commented 5 months ago

You can use the migrate API to move your CPU-bound tasks to a specific vCPU. It's lightweight.

kosav7r commented 4 months ago

What would you recommend for Interprocessor communication if shared memory is absolutely no?

beef9999 commented 4 months ago

Is it a Photon related issue?

kosav7r commented 4 months ago

I already appreciate your answers, so I apologize if it sounds unrelated. I am evaluating and comparing Photon with Seastar, and trying to map approaches in Seastar to Photon.

For example, in Seastar it is mostly done by passing a lambda to a neighbor vCPU. I was wondering what you think is the best approach for communication between vCPUs.

beef9999 commented 4 months ago

A Photon thread (coroutine) is essentially a function. A lambda is the same.

The underlying implementation of thread migration is an eventfd notification plus a task queue.

Besides, Photon also has an MPMC queue to transmit functions, encapsulated as the so-called WorkPool.
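To make the "eventfd notification plus task queue" idea concrete, here is a minimal sketch assuming Linux. It is not Photon's implementation: `TaskMailbox`, `post`, and `run_pending` are hypothetical names, and a plain mutex-protected deque stands in for whatever queue Photon actually uses.

```cpp
#include <sys/eventfd.h>
#include <unistd.h>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <functional>
#include <mutex>
#include <utility>

// Hypothetical per-thread mailbox: other threads post functions into it,
// the owning thread is woken through an eventfd and runs them.
class TaskMailbox {
public:
    TaskMailbox() : efd_(eventfd(0, EFD_CLOEXEC)) {}
    ~TaskMailbox() { if (efd_ >= 0) close(efd_); }
    TaskMailbox(const TaskMailbox&) = delete;
    TaskMailbox& operator=(const TaskMailbox&) = delete;

    // Called from any thread: enqueue the task, then bump the eventfd counter.
    void post(std::function<void()> task) {
        {
            std::lock_guard<std::mutex> lock(mu_);
            tasks_.push_back(std::move(task));
        }
        std::uint64_t one = 1;
        ssize_t n = write(efd_, &one, sizeof(one));
        (void)n;  // A failed wakeup is ignored in this sketch.
    }

    // Called from the owning thread: block until notified, then run every
    // task queued so far. Returns the number of tasks executed.
    std::size_t run_pending() {
        std::uint64_t count = 0;
        if (read(efd_, &count, sizeof(count)) != sizeof(count)) return 0;
        std::deque<std::function<void()>> batch;
        {
            std::lock_guard<std::mutex> lock(mu_);
            batch.swap(tasks_);
        }
        for (auto& task : batch) task();
        return batch.size();
    }

    // The fd can be registered with epoll instead of blocking in read().
    int fd() const { return efd_; }

private:
    int efd_;
    std::mutex mu_;
    std::deque<std::function<void()>> tasks_;
};
```

In a thread-per-core design, each thread would own one mailbox and add `fd()` to its event loop, so a neighbor thread can hand it a lambda much like the Seastar pattern mentioned above.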

loongs-zhang commented 4 months ago

I've searched the code. How about using sched_setaffinity (Linux) / thread_policy_set (macOS) to bind a vCPU to a single CPU core? @beef9999

lihuiba commented 4 months ago

@loongs-zhang Are you suggesting that we bind vCPU by default?

lihuiba commented 4 months ago

What would you recommend for Interprocessor communication if shared memory is absolutely no?

Multi-process without sharing memory? How about UNIX domain socket?
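As a rough sketch of that suggestion, the following passes one short message between a parent and a forked child over a UNIX domain socket pair, with no shared memory involved. It assumes a POSIX system; `request_over_unix_socket` and the `ack:` reply format are made up for illustration, and a real protocol would need message framing.

```cpp
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <string>

// Hypothetical helper: send msg to a child process over a UNIX domain
// socket pair and return the child's reply ("ack:" + msg).
// Assumes msg is shorter than 256 bytes.
std::string request_over_unix_socket(const std::string& msg) {
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0) return "";

    pid_t pid = fork();
    if (pid < 0) {
        close(sv[0]);
        close(sv[1]);
        return "";
    }

    if (pid == 0) {
        // Child: read one request, answer it, and exit.
        close(sv[0]);
        char buf[256];
        ssize_t n = recv(sv[1], buf, sizeof(buf), 0);
        if (n > 0) {
            std::string reply = "ack:" + std::string(buf, static_cast<size_t>(n));
            send(sv[1], reply.data(), reply.size(), 0);
        }
        close(sv[1]);
        _exit(0);
    }

    // Parent: send the request, signal end of input, wait for the reply.
    close(sv[1]);
    std::string reply;
    if (send(sv[0], msg.data(), msg.size(), 0) == static_cast<ssize_t>(msg.size())) {
        shutdown(sv[0], SHUT_WR);
        char buf[512];
        ssize_t n = recv(sv[0], buf, sizeof(buf), 0);
        if (n > 0) reply.assign(buf, static_cast<size_t>(n));
    }
    close(sv[0]);
    waitpid(pid, nullptr, 0);
    return reply;
}
```

Every message is copied through the kernel, which is the price of sharing nothing between the processes.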

loongs-zhang commented 4 months ago

@loongs-zhang Are you suggesting that we bind vCPU by default?

yes

loongs-zhang commented 4 months ago

What would you recommend for Interprocessor communication if shared memory is absolutely no? Multi-process without sharing memory? How about UNIX domain socket?

How about deep cloning and sharing?

lihuiba commented 4 months ago

@loongs-zhang Are you suggesting that we bind vCPU by default?

yes

As different apps require different binding configurations, it's difficult for us to do it by default. For example, a typical scenario is a file/storage server. We may need to consider the IRQ handlers of the NICs and SSDs, and our service threads (vCPUs). The best binding configuration should minimize CPU switching along the execution path.

lihuiba commented 4 months ago

How about deep cloning and sharing?

I'm not sure whether cloning is feasible, as it may imply sharing in the first place, and @kosav7r said it was "absolutely no".

lihuiba commented 4 months ago

What would you recommend to separate CPU and IO-bound jobs in the programming models? As far as I know, coroutines are for IO-bound jobs.

Photon has a built-in WorkPool to deal with various kinds of background jobs. For IO-bound ones, you can initialize the worker vCPUs to enable coroutines and event engines. For CPU-bound ones, you can simply use kernel threads without initializing Photon.

BTW, the jobs are efficiently passed to workers through a lock-free shared-memory ring queue.
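For readers unfamiliar with lock-free ring queues, here is a minimal sketch of the general technique. It is a single-producer/single-consumer ring, which is simpler than the MPMC queue mentioned earlier in the thread, and it is not Photon's code: `SpscRing`, `try_push`, and `try_pop` are hypothetical names.

```cpp
#include <atomic>
#include <cstddef>
#include <utility>

// Hypothetical bounded ring queue: exactly one thread may push and exactly
// one other thread may pop. No locks; only two atomic counters.
template <typename T, std::size_t Capacity>
class SpscRing {
    static_assert(Capacity > 0 && (Capacity & (Capacity - 1)) == 0,
                  "Capacity must be a power of two");

public:
    // Producer side. Returns false if the ring is full.
    bool try_push(T value) {
        std::size_t tail = tail_.load(std::memory_order_relaxed);
        std::size_t head = head_.load(std::memory_order_acquire);
        if (tail - head == Capacity) return false;
        slots_[tail & (Capacity - 1)] = std::move(value);
        // Release: publish the slot contents before the new tail.
        tail_.store(tail + 1, std::memory_order_release);
        return true;
    }

    // Consumer side. Returns false if the ring is empty.
    bool try_pop(T& out) {
        std::size_t head = head_.load(std::memory_order_relaxed);
        std::size_t tail = tail_.load(std::memory_order_acquire);
        if (head == tail) return false;
        out = std::move(slots_[head & (Capacity - 1)]);
        // Release: the slot may be reused only after it has been read.
        head_.store(head + 1, std::memory_order_release);
        return true;
    }

private:
    T slots_[Capacity];
    std::atomic<std::size_t> head_{0};  // next index to pop
    std::atomic<std::size_t> tail_{0};  // next index to push
};
```

Supporting multiple producers or consumers requires extra coordination per slot (for example sequence numbers updated with compare-and-swap), which this sketch deliberately leaves out.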