Architectural provisions for multiple threads

triska commented 4 years ago

Rust provides facilities for running multiple threads in parallel:

https://doc.rust-lang.org/1.30.0/book/second-edition/ch16-01-threads.html

Would it be possible to provide a Prolog predicate such as thread_create(Goal, ID) that runs Goal in its own thread, which can be waited for with thread_join(ID)?

If this is not possible at the moment, could you please take this into account in the architecture design of Scryer Prolog, to implement it in the future?

Especially web servers would benefit tremendously from being able to handle each client request in its dedicated thread.

Considering how such a predicate thread_create/2 could be implemented in the current code base may be an interesting issue for Rust programmers.
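As a rough illustration of the requested interface, here is a minimal Rust sketch using only `std::thread`. The `ThreadRegistry` type and the closure-based "goal" are assumptions made for this example; they are not part of the Scryer Prolog code base, where a goal would be a Prolog term executed by a machine rather than a Rust closure.

```rust
use std::collections::HashMap;
use std::thread::{self, JoinHandle};

/// Hypothetical registry that maps thread IDs to join handles.
pub struct ThreadRegistry {
    next_id: u64,
    handles: HashMap<u64, JoinHandle<bool>>,
}

impl ThreadRegistry {
    pub fn new() -> Self {
        ThreadRegistry { next_id: 0, handles: HashMap::new() }
    }

    /// Rough analogue of thread_create(Goal, ID): run `goal` in its own
    /// OS thread and return an ID that can later be joined.
    pub fn thread_create<F>(&mut self, goal: F) -> u64
    where
        F: FnOnce() -> bool + Send + 'static,
    {
        let id = self.next_id;
        self.next_id += 1;
        self.handles.insert(id, thread::spawn(goal));
        id
    }

    /// Rough analogue of thread_join(ID): wait for the thread to finish.
    /// Returns None for an unknown (or already joined) ID, and
    /// Some(false) if the goal failed or the thread panicked.
    pub fn thread_join(&mut self, id: u64) -> Option<bool> {
        let handle = self.handles.remove(&id)?;
        Some(handle.join().unwrap_or(false))
    }
}
```

In this sketch a goal simply succeeds or fails; bindings, exceptions and exit status, as covered by thread proposals for Prolog, are deliberately left out.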

pmoura commented 4 years ago

See also the following standardization proposal for threads in Prolog:

threads.pdf

mthom commented 4 years ago

It will be a good challenge to implement a concurrent WAM that doesn't exact a cost to single-threaded programs. I'm not sure where to start. I have plenty of work ahead of me before I get near threads, though.

triska commented 4 years ago

One architecture that suggests itself is to equip each thread with its own dedicated WAM interpreter.

As a user who is interested in this feature, my expectation is not that the implementation of threads has no overhead. Rather, it would already be great if performance is sufficient to create interesting applications such as web servers. Even a very rudimentary implementation of threads would already suffice for interesting experiments for instance with Torbjörn Lager's Web Prolog.

It is completely understandable that other features have priority at the moment. I only hope that eventual multi-threading is taken into account in architectural decisions.
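The one-interpreter-per-thread idea could look roughly like the following Rust sketch. The `Machine` struct here is a toy stand-in invented for this example (a real WAM would hold a heap, stack, trail and registers), and the "goal" is just a number to sum up to. The point is only that each thread owns its machine outright, so no locking is needed and single-threaded execution pays nothing.

```rust
use std::thread;

/// Toy stand-in for a WAM instance (hypothetical, for illustration only).
#[derive(Default)]
pub struct Machine {
    heap: Vec<i64>,
}

impl Machine {
    /// Toy "goal": push 1..=n onto this machine's private heap and sum it.
    pub fn run(&mut self, n: i64) -> i64 {
        self.heap.extend(1..=n);
        self.heap.iter().sum()
    }
}

/// Run every goal on its own thread, each with its own dedicated Machine.
/// No mutable state is shared between threads.
pub fn run_isolated(goals: Vec<i64>) -> Vec<i64> {
    let handles: Vec<_> = goals
        .into_iter()
        .map(|n| {
            thread::spawn(move || {
                // The machine is created inside the thread and never leaves it.
                let mut machine = Machine::default();
                machine.run(n)
            })
        })
        .collect();

    // Join in creation order so results line up with the input goals.
    handles
        .into_iter()
        .map(|h| h.join().expect("machine thread panicked"))
        .collect()
}
```

What this sketch does not address is anything shared between machines, such as a common database or open streams.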

matt2xu commented 4 years ago

On a side note, as far as Web back-end programming is concerned, (OS) threads are not really considered useful anymore, since most frameworks use asynchronous I/O instead (Node.js of course, Vert.x in the Java world, Play in Scala, etc.). Erlang was a notable precursor, supporting lightweight "threads" long before that (even though it took them a while to add support for multi-processor systems).

triska commented 4 years ago

Yes, to clarify: there does not necessarily need to be any correspondence to OS-level threads at all.

For writing web servers in Prolog, all that matters is that there is a mechanism in place by which a goal that serves a client can be spawned off so that the main process can continue waiting for further connections, and dispatching them to their own dedicated handlers that run in parallel.

aarroyoc commented 3 years ago

A Tokio task is probably what we're looking for. It has its downsides: we have to specify the number of OS threads it can use at startup, and we must use Tokio primitives rather than the Rust std ones. But it should be better in every other respect.

triska commented 1 year ago

It seems #1980 is already a great step in this direction, making it possible to run multiple machines concurrently? Is there a way that this functionality can be sensibly exposed to Prolog programs themselves, @Skgland?

A key use case where this would be desirable: Have an HTTP server listen for connections, and as soon as a connection arrives, start a new thread that handles the new connection (requests and responses) for as long as it takes, while the main program goes back to waiting for new connections, and launches more threads when more requests arrive.
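The accept-and-dispatch pattern described here can be sketched in Rust with `std::net` and one OS thread per connection. The function names `serve` and `handle_client`, the fixed response, and the `max_connections` limit (added only so the loop terminates) are assumptions of this example; in Scryer the handler would be a Prolog goal running on its own machine.

```rust
use std::io::{Read, Write};
use std::net::{TcpListener, TcpStream};
use std::thread;

/// Hypothetical per-connection handler: read the request and send a
/// fixed response. The stream is closed when it goes out of scope.
fn handle_client(mut stream: TcpStream) -> std::io::Result<()> {
    let mut buf = [0u8; 1024];
    let _ = stream.read(&mut buf)?;
    stream.write_all(
        b"HTTP/1.1 200 OK\r\nContent-Length: 2\r\nConnection: close\r\n\r\nok",
    )?;
    Ok(())
}

/// Accept up to `max_connections` clients. Each one is handed to its own
/// thread, and the loop goes straight back to waiting for the next client.
pub fn serve(listener: TcpListener, max_connections: usize) {
    for stream in listener.incoming().take(max_connections) {
        match stream {
            Ok(stream) => {
                thread::spawn(move || {
                    // A failing client must not take down the accept loop.
                    let _ = handle_client(stream);
                });
            }
            Err(e) => eprintln!("accept failed: {}", e),
        }
    }
}
```

A caller would bind a listener, for example with `TcpListener::bind("127.0.0.1:8080")`, and pass it to `serve`.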

Skgland commented 1 year ago

I think having a thread version of approximately this:

spawn_machine(G) :- format(string(CMD), 'scryer -g ~W', [G, [quoted(true)]]), os:shell(CMD).

With basically completely independent MachineState should be relatively simple.

An actually useful implementation for your key use case is a completely different matter, and I don't even have an idea where to begin. It would need an interface like the one in the proposal mentioned by @pmoura https://github.com/mthom/scryer-prolog/issues/546#issuecomment-632140433, with a shared Database, shared Resources (like Network Connections and File Streams), Message Queues and so forth. I can't think of a lower-level API with which creating such a "sibling" Machine would be feasible.

The first variant might still be worth having for cases where one wants a secondary, independent machine. The proposed thread API does not appear to allow that, though it might be possible to extend the thread options to signal the creation of an "estranged" Machine that would not share the Database etc. That might be useful for running tests in another thread on an independent machine, so that they cannot pollute the machine running the test harness. To make it useful, one would probably need a way to at least provide stdin, capture stdout/stderr and query the result of the machine. This could be feasible, as each thread would hold only one end of each channel, effectively not sharing resources.
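The "estranged" machine with channel-based I/O could be sketched in Rust as follows. Everything here is hypothetical: `EstrangedMachine`, its fields, and the toy body that upper-cases each input line stand in for a real independent Machine. The sketch only shows the ownership structure, where the parent holds one end of each channel and the worker thread holds the other.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};
use std::thread::{self, JoinHandle};

/// Hypothetical handle to an independent machine running in its own thread.
pub struct EstrangedMachine {
    /// Parent's end of the machine's stdin.
    pub stdin: Sender<String>,
    /// Parent's end of the machine's stdout.
    pub stdout: Receiver<String>,
    handle: JoinHandle<bool>,
}

impl EstrangedMachine {
    pub fn spawn() -> Self {
        let (stdin_tx, stdin_rx) = channel::<String>();
        let (stdout_tx, stdout_rx) = channel::<String>();

        let handle = thread::spawn(move || {
            // Toy stand-in for running a query: echo every input line in
            // upper case. The "result" is true if at least one line was seen.
            let mut lines = 0;
            for line in stdin_rx {
                if stdout_tx.send(line.to_uppercase()).is_err() {
                    return false;
                }
                lines += 1;
            }
            lines > 0
        });

        EstrangedMachine { stdin: stdin_tx, stdout: stdout_rx, handle }
    }

    /// Close stdin, wait for the machine to finish, and return its result
    /// together with all output that has not been read yet.
    pub fn finish(self) -> (bool, Vec<String>) {
        let EstrangedMachine { stdin, stdout, handle } = self;
        drop(stdin); // signals end of input to the worker
        let ok = handle.join().unwrap_or(false);
        (ok, stdout.iter().collect())
    }
}
```

A test harness could then send input through `stdin`, call `finish`, and inspect the result and captured output without the worker ever touching the harness's own state.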
