nim-lang / RFCs

A repository for your Nim proposals.

Redesign threads interface for Nim v2 #401

Open planetis-m opened 3 years ago

planetis-m commented 3 years ago

I would like to start a discussion about how the next version of Nim will handle one of the fundamentals of multithreading: the Thread object and its API. Currently its behavior is closely tied to the legacy GC mode. Breaking the interface is unavoidable to properly support ARC/ORC (see the Isolated data RFC), and it is also beneficial, since we have the opportunity to improve the interface and add more sugar. The points that need to be addressed include, but are not limited to, the following:

Opinions?

References:

konsumlamm commented 3 years ago
  • Passing data as sink Isolated[T]?

Yes, please. For what it's worth, Rust also basically does this via move semantics.
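To make the idea concrete, here is a minimal sketch of handing data over as `sink Isolated[T]` using `std/isolation`. The `consume` proc is a hypothetical stand-in for whatever a redesigned thread API would accept; only `isolate` and `extract` come from the standard library.

```nim
import std/isolation

# Hypothetical consumer: takes ownership of an isolated payload.
proc consume(payload: sink Isolated[seq[int]]): int =
  var owned = payload          # last use of `payload`, so this is a move
  let data = owned.extract()   # unwrap the isolated value
  for x in data:
    result += x

# `isolate` only compiles if the expression has no outside aliases.
let total = consume(isolate(@[1, 2, 3]))
```

Because `Isolated[T]` cannot be copied, the caller gives up the data at the call site, which is the property a thread boundary needs.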

  • Syntactic sugar for thread creation.

I'm not sure this is needed, one can just pass a lambda to createThread already:

import std/sugar

var t: Thread[void]
t.createThread(() => echo "doing work")
t.joinThread()  # wait for the thread before the program exits

But I wouldn't complain about some more sugar, something like:

let t = thread:
  echo "doing work"

One thing I really dislike about the current API though is that createThread takes a var Thread instead of just returning a Thread. I tried to add a wrapper for that, but ran into https://github.com/nim-lang/Nim/issues/17136 (a template would probably work, but this made me really unconfident about threads in Nim, in general, so I just ignored them). I would hope that something like this gets fixed by a redesign, but I'm not sure what causes it in the first place.

More references:

Araq commented 3 years ago

IMO the mistake in Thread[T] is the T generic parameter. A thread should be the most bare-bones low-level wrapper over Windows/Posix that we can create, and the callback it takes should take a single pointer parameter instead. This does not need syntax sugar at all; it's a low-level API. You should use spawn instead, and spawn should be based on std/tasks.
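A rough sketch of what the pointer-only callback style looks like, approximated with today's API by instantiating `Thread[pointer]`. This is an illustration under that assumption, not the proposed interface; `worker` and `value` are made-up names.

```nim
# Compile with --threads:on on Nim 1.x (it is the default in Nim 2).

proc worker(arg: pointer) {.thread.} =
  # The callback receives an untyped pointer and casts it back itself.
  let counter = cast[ptr int](arg)
  counter[] += 1

var value = 41
var t: Thread[pointer]
createThread(t, worker, cast[pointer](addr value))
joinThread(t)   # after this, the worker's write to `value` is visible
```

The caller is responsible for keeping the pointed-to data alive until the thread has been joined, which is why this style belongs in a low-level layer.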

threadDestructionHandlers should be removed.

  proc pinToCpu*[Arg](t: var Thread[Arg]; cpu: Natural) =
    {.hint: "cannot change Genode thread CPU affinity after initialization".}

is a good indicator that CPU pinning should be part of createThread directly.

Varriount commented 3 years ago

One characteristic with spawn is that it does not support the idea of more than one thread pool, which I could see being a challenge for certain use-cases.

Araq commented 3 years ago

Well spawn could take the threadpool as an argument and for "structured concurrency" that would be required anyway. Or you use moduleA.spawn vs moduleB.spawn. Pretty simple problem.
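The API shape being suggested could look roughly like the sketch below. `SimplePool`, its `spawn` template and `sync` are hypothetical, and the "pool" simply runs its tasks on the calling thread; the only real building block is `toTask`/`invoke` from `std/tasks` (assuming Nim 2 with ARC/ORC). The point is only that each pool is an explicit, independent argument.

```nim
import std/tasks

type SimplePool = object   # hypothetical; not a real thread pool
  pending: seq[Task]

# `spawn` takes the pool as its first argument and packages the call.
template spawn(pool: var SimplePool; call: untyped) =
  pool.pending.add toTask(call)

# Runs everything queued on this pool (here: sequentially, on the caller).
proc sync(pool: var SimplePool) =
  for i in 0 ..< pool.pending.len:
    pool.pending[i].invoke()
  pool.pending.setLen(0)

var total = 0
proc bump(n: int) = inc total, n

var poolA, poolB: SimplePool
poolA.spawn(bump(1))
poolB.spawn(bump(10))
poolA.sync()                 # only poolA's task has run at this point
let afterA = total
poolB.sync()
let afterB = total
```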

Clonkk commented 3 years ago

Just my 2 cents on the threading story.

Recurring pain points with threads that I've encountered:

With std/threadpool, I found FlowVar less practical than Future. For reference, there's an excellent threadpool implementation based on Future: https://github.com/yglukhov/asyncthreadpool/

Also missing, though only loosely related, is the ability to generate OpenMP parallel for loops and, notably, to collapse nested for loops.

planetis-m commented 3 years ago

Another useful define that's missing and gets duplicated is CacheLineSize, similar to http://www.hellenico.gr/cpp/w/cpp/thread/hardware_destructive_interference_size.html If someone can point me to where those constants are defined (I can't find them), I can try making a PR.
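For context, this is the kind of code that ends up redefining such a constant. The value 64 is an assumption (typical for x86-64, but not universal), and `PaddedCounter` is a made-up type; the sketch only pads each element to a full line and does not by itself align the array.

```nim
const CacheLineSize = 64   # assumption: not a stdlib constant, varies by CPU

type
  PaddedCounter = object
    value: int
    # Fill the rest of the line so neighbours do not share it.
    pad: array[CacheLineSize - sizeof(int), byte]

var counters: array[4, PaddedCounter]

# Distance in bytes between two adjacent elements.
let stride = cast[int](addr counters[1]) - cast[int](addr counters[0])
```

With per-thread counters laid out like this, writes from different threads land on different cache lines, which avoids false sharing.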

mratsim commented 2 years ago
  proc pinToCpu*[Arg](t: var Thread[Arg]; cpu: Natural) =
    {.hint: "cannot change Genode thread CPU affinity after initialization".}

is a good indicator that CPU pinning should be part of createThread directly.

I have removed pinning for taskpools, there are too many cases and complexities:

Due to all these reasons, I wouldn't attempt to do CPU pinning in the standard library.

Cancel and/or pause a thread execution is also very practical

That should be the responsibility of the event loop running on that thread.

Pausing/cancellation is preemptive multithreading and is kernel domain (well, you can use signals like the Java garbage collector ...).

Pthread doesn't expose a suspending API anyway (https://pubs.opengroup.org/onlinepubs/7908799/xsh/pthread.h.html) and its cancellation is cooperative: https://pubs.opengroup.org/onlinepubs/7908799/xsh/pthread_cancel.html

So devs should embrace cooperative scheduling and have multithreaded functions communicate by channels if synchronization is needed. This would also make cancellation points explicit, which would make it much easier to understand what cleanup is needed. (A cancellation channel can just be a ptr Atomic[bool].)
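A minimal sketch of that cooperative style, using a `ptr Atomic[bool]` as the cancellation channel. The names `worker`, `cancelFlag` and `finished` are illustrative only.

```nim
import std/atomics

var finished: Atomic[bool]   # set by the worker once it has cleaned up

proc worker(cancel: ptr Atomic[bool]) {.thread.} =
  var iterations = 0
  # Explicit cancellation point: the flag is polled once per unit of work.
  while not cancel[].load():
    inc iterations           # stand-in for one unit of real work
  # Cleanup happens here, at a point the worker itself chose.
  finished.store(true)

var cancelFlag: Atomic[bool]
var t: Thread[ptr Atomic[bool]]
createThread(t, worker, addr cancelFlag)
cancelFlag.store(true)       # request cancellation
joinThread(t)                # returns once the worker has seen the flag
```

The worker is never interrupted in the middle of a step, so whatever invariants it maintains hold at the moment it exits.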

Clonkk commented 2 years ago

Pausing/cancellation is preemptive multithreading and is kernel domain (well, you can use signals like the Java garbage collector ...).

Well, pausing should be done in an event loop based around condition variables, except that the ones in std/locks are currently very basic compared to their C++ equivalents, so they should be improved to improve multithreading. The ability to pause a thread can be interpreted as having the tools in the stdlib to implement event loops without too much friction; you could even imagine having some simple event loops exposed, both as examples and to simplify trivial use cases.
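As a rough illustration of pausing built on a condition variable, here is a sketch using only `std/locks`. The `Gate` type and `setPaused` are hypothetical helpers, and the worker only checks the gate between steps, so this is cooperative pausing rather than suspension.

```nim
import std/locks

type Gate = object   # hypothetical pause gate
  lock: Lock
  cond: Cond
  paused: bool

var gate: Gate
var stepsDone: int   # protected by gate.lock

proc worker() {.thread.} =
  for step in 1 .. 3:
    withLock gate.lock:
      while gate.paused:             # loop guards against spurious wakeups
        wait(gate.cond, gate.lock)
      inc stepsDone

proc setPaused(value: bool) =
  withLock gate.lock:
    gate.paused = value
  if not value:
    signal(gate.cond)                # wake the worker if it is waiting

initLock gate.lock
initCond gate.cond
setPaused(true)                      # worker blocks at its first step
var t: Thread[void]
createThread(t, worker)
setPaused(false)                     # resume
joinThread(t)
let completedSteps = stepsDone
deinitCond gate.cond
deinitLock gate.lock
```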

For cancelling, I just think having the stdlib Thread equivalent of pthread_cleanup_push, pthread_cancel, and pthread_setcancelstate, without needing to call posix functions, is enough.

Araq commented 2 years ago

I have removed pinning for taskpools, there are too many cases and complexities: ...

The API could always ignore the request if the underlying platform doesn't support it. But it doesn't seem to be worth it, it seems the idea didn't age too well.

mratsim commented 2 years ago

Pausing/cancellation is preemptive multithreading and is kernel domain (well, you can use signals like the Java garbage collector ...).

Well, pausing should be done in an event loop based around condition variables, except that the ones in std/locks are currently very basic compared to their C++ equivalents, so they should be improved to improve multithreading. The ability to pause a thread can be interpreted as having the tools in the stdlib to implement event loops without too much friction; you could even imagine having some simple event loops exposed, both as examples and to simplify trivial use cases.

After https://github.com/nim-lang/Nim/pull/17711/files, it only lacks waiting with timeout which can be done in a PR.

My main problem when writing a runtime is the lack of a barrier, so that after threads are started I can make sure they are all synchronized before they wreak havoc.

My main issue is that barriers are an optional pthread API and macOS doesn't provide them ...
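Where the platform has no native barrier, one can be approximated on top of a lock and a condition variable. The sketch below is a hypothetical single-use barrier (`SimpleBarrier`, `arriveAndWait` are made-up names) and assumes a Nim version whose `std/locks` provides `broadcast`.

```nim
import std/[locks, atomics]

const NumThreads = 4

type SimpleBarrier = object   # hypothetical, single-use
  lock: Lock
  cond: Cond
  expected, arrived: int

var barrier: SimpleBarrier
var started: Atomic[int]      # threads that reached the barrier
var tooEarly: Atomic[int]     # threads released before everyone arrived

proc arriveAndWait(b: var SimpleBarrier) =
  withLock b.lock:
    inc b.arrived
    if b.arrived == b.expected:
      broadcast(b.cond)               # last arrival releases everyone
    else:
      while b.arrived < b.expected:   # loop guards against spurious wakeups
        wait(b.cond, b.lock)

proc worker() {.thread.} =
  discard started.fetchAdd(1)
  arriveAndWait(barrier)
  # Past the barrier, every thread must already have checked in.
  if started.load() != NumThreads:
    discard tooEarly.fetchAdd(1)

initLock barrier.lock
initCond barrier.cond
barrier.expected = NumThreads

var threads: array[NumThreads, Thread[void]]
for i in 0 ..< NumThreads: createThread(threads[i], worker)
for i in 0 ..< NumThreads: joinThread(threads[i])
```

A reusable barrier would additionally need a generation counter so that a fast thread cannot re-enter before slow ones have left.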

For cancelling, I just think having the stdlib Thread equivalent of pthread_cleanup_push, pthread_cancel, and pthread_setcancelstate, without needing to call posix functions, is enough.

I wouldn't expose them because I don't see a use case where there isn't a better existing alternative.

For instance the doc says:

A thread's cancellation type, determined by pthread_setcanceltype(3), may be either asynchronous or deferred (the default for new threads). Asynchronous cancelability means that the thread can be canceled at any time (usually immediately, but the system does not guarantee this). Deferred cancelability means that cancellation will be delayed until the thread next calls a function that is a cancellation point. A list of functions that are or may be cancellation points is provided in pthreads(7).

The pthread doc

Cancellation points

POSIX.1 specifies that certain functions must, and certain other functions may, be cancellation points. If a thread is cancelable, its cancelability type is deferred, and a cancellation request is pending for the thread, then the thread is canceled when it calls a function that is a cancellation point.

  The following functions are required to be cancellation points by
  POSIX.1-2001 and/or POSIX.1-2008:

      accept()
      aio_suspend()
      clock_nanosleep()
      close()
      connect()
      creat()
      fcntl() F_SETLKW
      fdatasync()
      fsync()
      getmsg()
      getpmsg()
      lockf() F_LOCK
      mq_receive()
      mq_send()
      mq_timedreceive()
      mq_timedsend()
      msgrcv()
      msgsnd()
      msync()
      nanosleep()
      open()
      openat() [Added in POSIX.1-2008]
      pause()
      poll()
      pread()
      pselect()
      pthread_cond_timedwait()
      pthread_cond_wait()
      pthread_join()
      pthread_testcancel()
      ...

So all those cancellation points, besides the condition variables, are related to IO procedures.

It should be noted that even if an application is not using asynchronous cancellation, that calling a function from the above list from an asynchronous signal handler may cause the equivalent of asynchronous cancellation. The underlying user code may not expect asynchronous cancellation and the state of the user data may become inconsistent. Therefore signals should be used with caution when entering a region of deferred cancellation.

In particular, once a thread is cancelled:

Cancellation is a huge problem even when a language controls everything, see:

I wouldn't add pthread cancellation before runtime writers figure out their cancellation strategy. And I have no idea about potential Windows and Mac specifics.

mratsim commented 2 years ago

pthread_cancel didn't work on Windows, at least as of a decade ago: http://blog.ezyang.com/2010/09/pthread-cancel-on-window/