Closed stugol closed 8 years ago
Threads aren't supported in Crystal, and there is no plan to give direct access to them anymore. A nicer solution, like a multithreaded event loop will eventually happen (see other issues about parallelism).
Yet, Crystal is capable to run concurrent code, in an event loop. Just:
spawn do
do_something
end
spawn do
do_something_else
end
And you'll have 2 concurrent piece of code running.
How do I wait for the spawned process(es) to finish before continuing with my main program?
You use channels.
ch = Channel(Nil).new
spawn do
do_something
ch.send(nil)
end
spawn do
do_something_else
ch.send(nil)
end
2.times { ch.receive }
This concurrency model is very similar to that of Go, but it's still a work in progress. If in doubt, you can search how to do it in Go and then try to do the same in Crystal, though everything isn't yet supported here.
I see. Fair enough.
However, as I understand it, spawn
uses Fibers - meaning only one "thread" executes at a time. If one of the "concurrent" pieces of code blocks execution, the entire program will grind to a halt. This isn't how I would define "concurrency".
In the future Fibers might be scheduled in different threads, or you might have manual control of threads. But it's still a work in progress.
Alright. So how can I fix this? https://play.crystal-lang.org/#/r/osy
Is there a non-blocking (or at least sleep
ing) version of accept
?
accept
is non-blocking, as well as any net IO. When there's a wait on an IO the runtime scheduler gives control to a new fiber, until that IO is available. The code you showed should work fine (I just tried it), but code like that won't work on play.crystal-lang.org.
It blocks the main program execution. I never get "2" printed.
sleep(1000)
waits 1000 seconds
Oh. Well, I guess that might be something to do with it... ;)
Note that concurrency isn't parallelism. Concurrency is achieved by fibers, but yes, fibers won't run in parallel (but that will come).
Code inside a spawn
seems to be unable to use variables declared outside the block....?
@stugol It should work. Do you have an example where it doesn't?
Also, how can I terminate a spawn from outside? Before you say "pass it a message via a channel", can I just point out that client = TCPServer.accept; client.read_line
seems to be blocking.
I did have an example, but I changed it to use a channel; and now I can't reproduce it...
@stugol It is really a bad idea to terminate Fibers (or Threads, if we had them).
For example, you know, that Ruby's timeout
stdlib spawns new thread with your code and terminates it from the main thread after timeout passed if it is still alive?
What happens is that you have 3 or 4 different states these threads can be in. Resulting in possible race conditions, broken database connections, production downtime (I dealt once with that, actually).
That is why other languages disallow voluntary termination of the Thread. Instead, they provide a way to interrupt thread, which effectively only sets special flag on thread, that it is politely asked to be interrupted. And it is a job of a thread in question to periodically check this flag and finish itself (by raise
-ing or break
/return
-ing). In such languages, usually all IO and stdlib is written in a way that is aware of this behavior and acts correctly, when thread is asked to terminate. That is why you usually don't have to deal with it, unless you are implementor of low-level library.
follow up to my previous comment:
I am quite sure, that we don't have such a functionality here in Crystal. And I think it is a good idea to have it once we start working on concurrency and parallelism with more focus.
AFAIK, Go's take on this is to use select { case ... }
construct to listen on multiple IO and channels simultaneously, thus providing a way to simultaneously block on IO and wait for message on channel, thus providing an elegant way to finish a goroutine, when asked. I am not sure what happens to the IO operation in question after that, though.
I think that true parallelism would be a big selling feature for Crystal. It's something that's definitely lacking from Ruby and other languages.
@mikejholly The last time this was brought up, I think someone mentioned that parallelism will be added eventually, but I don't think it will expose threads (a la C/C++) since those will be just an implementation detail for how parallelism works; it will probably be abstracted away, in a similar way how the spawn
or parallel
(this just uses spawn
Fiber
s, btw) macros do now.
@omninonsense Personally, I think Threads should be available BUT the nice abstractions like spawn
and parallel
should be preferred. I think Clojure does this well.
So if the only way to achieve something (listening on a port, reading user input etc) happens to be a blocking call - or long-running loop - my entire program breaks? That doesn't sound useful.
@stugol, If you do a blocking call inside of a Fiber, all other Fibers will continue their execution normally, because scheduler will put blocked Fiber to sleep until blocking call is completed.
Best Regards, Oleksii Fedorov, Sr Ruby, Clojure, Crystal, Golang Developer, Microservices Backend Engineer, +49 15757 486 476
On Thu, Dec 24, 2015 at 8:16 PM, stugol notifications@github.com wrote:
So if the only way to achieve something (listening on a port, reading user input etc) happens to be a blocking call - or long-running loop - my entire program breaks? That doesn't sound useful.
— Reply to this email directly or view it on GitHub https://github.com/manastech/crystal/issues/1967#issuecomment-167150982.
No, that doesn't make sense. Consider this pseudocode:
def GetInput
do
read a character from the keyboard
until [user has pressed enter]
end
Fiber 1 {
GetInput()
}
Fiber 2 {
[some other long-running task]
}
Granted, spinwaiting isn't a very good way to go about things, but the point is, GetInput
is a blocking call. Surely the second fiber can't proceed until the user presses the enter key? Fiber-based concurrency generally requires polite sleep()
ing from all participants; and if one fiber misbehaves, the entire program is up the swanny.
Windows 3.1 used to work like that. All modern OSs use preemptive multitasking, and there's a good reason for it.
Sure, fibers are really awesome. But there are cases where they're not appropriate; and in such cases, it would appear that Crystal doesn't offer any alternatives at all. I'm actually considering switching to Google Go for this project; simply because I anticipate a lot of concurrency, and don't want to be utterly boned when I eventually come across a blocking call I cannot avoid using.
read a character from the keyboard
- is a non-blocking IO, at least it should be. Therefore it will trigger proper re-scheduling for other fibers.Fiber
=> Goroutine
.@stugol I might be wrong, but it looks like you are mixing 2 different concepts here: concurrency and parallelism.
They are connected, but very different. What you see here in Crystal (and in Go) is pure implementation of concurrent model (called CSP), and it is actually complete.
Threads - is just one way to do parallelism. Very poor way at that.
In Go, parallelism is achieved by telling go program to run over N cores instead of one. In that case, sometimes some goroutines will actually run on different cores at the same time. This type of parallelism will not change your coding style at all, and your program should not be different, if it is run on 1 core or N. This has one limitation - max value of N is determined by amount of CPU threads (usually cores * 2).
From Go blog on the topic: http://blog.golang.org/concurrency-is-not-parallelism - "Concurrency is not parallelism"
Parallelism make sense only for number crunch tasks not IO operations. In go if you really need do parallel math you shouldn't run multiple copies, but limiting this computations with buffered channels where channel capacity = number of CPU cores.
"Goroutines are multiplexed as needed onto system threads. When a goroutine executes a blocking system call, no other goroutine is blocked."
The same cannot be said for Crystal, as I understand it.
If I need to perform parallel number crunching - or if, for some reason, an IO call is blocking and there's nothing I can do about it - how can I solve this in Crystal?
As far as such rare case goes, I suppose you will have to wait until the language designers come up with solution for that. They are still in progress of designing concurrency and parallelism at the moment as far as I know.
I would expect that it will be similar to go's approach, i.e. ability to parallelize fibers on cores and buffered channels with size of number of cores. Though it is my speculation, assumption.
If you need something right now, I would recommend actually using Go. In my opinion concurrency model in Go is very good one.
BTW to my knowledge it is discouraged to try to contribute to this part of the language at the moment, before there is high-level vision for this.
Alternative solution to this problem is to create standalone library for threads or for parallelizing of fibers. Probably it will use some C library. It is a nice idea, since in the progress one would learn a lot of internals and overall how it works and should work.
Von meinem iPhone gesendet
Am 25.12.2015 um 3:47 PM schrieb stugol notifications@github.com:
"Goroutines are multiplexed as needed onto system threads. When a goroutine executes a blocking system call, no other goroutine is blocked."
The same cannot be said for Crystal, as I understand it.
If I need to perform parallel number crunching - or if, for some reason, an IO call is blocking and there's nothing I can do about it - how can I solve this in Crystal?
— Reply to this email directly or view it on GitHub.
Goroutines are multiplexed as needed onto system threads. When a goroutine executes a blocking system call, no other goroutine is blocked
I believe such system calls are executed on separate thread(s). So they will still not block other fibers. Am I correct or not @asterite?
Von meinem iPhone gesendet
Am 25.12.2015 um 3:47 PM schrieb stugol notifications@github.com:
Goroutines are multiplexed as needed onto system threads. When a goroutine executes a blocking system call, no other goroutine is blocked
Is it possible, using Crystal, to make API calls to create threads? For example (pseudocode):
def threaded_fn
...long-running blocking operations...
end
thread = WinAPI.CreateThread(address-of(threaded_fn))
Obviously I wouldn't expect you guys to necessarily be experts on the Windows API, it's just an example. In C, I can call CreateThread (or the Linux equivalent) and it returns a handle. I would need to obtain the address of threaded_fn
in a manner compatible with C. Can this be done?
In essence, how do I make a C-style function pointer to a Crystal function? For example, CreateThread expects a function pointer with the signature unsigned long __stdcall (*fn)(void*)
. So presumably I would need def threaded_fn(param)
and a way to make a function pointer from it. I'm a bit hazy on how to do that.
I don't mind using Crystal's built-in concurrency support; but I need the option to use real threads if the need arises.
Hm. Re-reading the documentation, it seems that I simply specify - when importing the C call - that it takes a callback. So:
@[Link("kernel32")]
lib Kernel32
fun CreateThread(attrs : SECURITY_ATTRIBUTES*, stacksize : SIZE_T, callback : void* -> UInt32, param : void*, creation_flags : UInt32, thread_id : UInt32*)
end
def threaded_fn(param : void*)
end
thread_handle = Kernel32.CreateThread(nil, 0, threaded_fn, nil, 0, nil);
Or something to that effect?
(API documentation: https://msdn.microsoft.com/en-us/library/windows/desktop/ms682453(v=vs.85).aspx)
In which case, what happens when I deploy the program on a non-Windows platform? If I simply refrain from calling Kernel32
functions (and call others instead, obviously), will the lib
declaration simply be ignored? Or will an error be raised because kernel32.dll
doesn't exist?
I'm trying to write cross-platform code.
Does Crystal actually work on Windows right now? I would think you'd have to just use pthreads or something...
What's a pthread?
EDIT: Oh, a Linux thing. Hm. Well, Linux-only is okay for the moment. Will my plan work, do you think?
Actually, it's a Posix thing...
Isn't that kinda the same thing? ;)
No; Posix also works on OSX, BSD, and other Unix-y systems.
I've managed to achieve threading using API calls. It wasn't easy, but it works; and I can pass in arbitrary data. Example:
thread1 = threaded {
DoWork1()
}
thread2 = threaded(27) { |n|
DoWork2(n)
}
thread1.join
thread2.join
The magic:
@[Link(ldflags: "-pthread")]
lib ThreadLib
type Thread_t = UInt64
type Thread_attr_t = Int64
fun pthread_create(lpThread_id : Thread_t*, attr : Thread_attr_t*, callback : Void* -> Void*, arg : Void*) : Int64
fun pthread_join(thread_id : Thread_t, value_ptr : Void**) : Int64
end
class ThreadException < Exception; end
def runcode(arg, &block); yield arg; end
def runcode(&block); yield; end
macro threaded(&block)
%success = ThreadLib.pthread_create(out %thread_id, nil, ->(%param : Void*) {
runcode {{block}}
return Pointer(Void).null
}, Pointer(Void).null)
raise ThreadException.new("Could not create thread") unless %success == 0
ThreadMgr.new %thread_id
end
macro threaded(param, &block)
%success = ThreadLib.pthread_create(out %thread_id, nil, ->(%param) {
runcode (%param as typeof({{param}})*).value {{block}}
return Pointer(Void).null
}, pointerof({{param.id}}) as Void*)
raise ThreadException.new("Could not create thread") unless %success == 0
ThreadMgr.new %thread_id
end
class ThreadMgr
def initialize(thread_id)
@thread_id = thread_id
end
def join
ThreadLib.pthread_join(@thread_id, nil)
end
end
Unfortunately, loads of built-in Crystal functions are not thread-safe - indeed, they don't even work in native threads. For example, any attempt to puts
to the console runs the risk of a could not get the current fiber
exception from IO::FileDescriptor#wait_writable
...
I guess this solution will work for computation and such; but large parts of Crystal will simply break if I try to thread them. The question is: Should I persevere with Crystal on this project, or switch to Go instead? I'm trying to write a backup server, and it has to be reliable. And more to the point, I don't want to get halfway finished and suddenly find myself unable to proceed without proper parallelism support.
For example, when a client connects, there will be a massive amount of checksumming and other math-related stuff happening; and I don't want to block any other work. How would I write a long-running mathematically-intensive computation to play nicely with other fibers? Throw a bunch of sleep(0)
s in there?
You could just do thr = Thread.new { ... }
and have a thread running. It works, but its unsupported, and the thread will crash immediately on any IO calls, because the event loop can't be found for the current thread. Actually, even a sleep 1
will crash the thread.
Again: coroutines will be multiplexed across a pool of threads (or something like that), so a CPU bound coroutine won't block other coroutines until it's done, but this is not possible, now.
I think it is a bit early to use Crystal for something that "has to be reliable", especially if you see such a big blocker on the road already. Yes, some companies are actually using Crystal in production already, basically, they acknowledged all risks, that this involves and decided that for a certain parts of their product it is fine to use non-production-ready language. (They do it, because they are sure, that they can rewrite one of these parts in the matter of hours if the need arises).
So my advice here would be - use Go.
If you want to use Crystal right now - try splitting the server in multiple independent communicating parts: one or two in Go, where you care about parallelization, everything else in Crystal; communicating through a socket or something else.
Best Regards, Oleksii Fedorov, Sr Ruby, Clojure, Crystal, Golang Developer, Microservices Backend Engineer, +49 15757 486 476
On Fri, Dec 25, 2015 at 9:33 PM, stugol notifications@github.com wrote:
I've managed to achieve threading using API calls. It wasn't easy, but it works; and I can pass in arbitrary data. Example:
thread1 = threaded { DoWork1() } thread2 = threaded(27) { |n| DoWork2(n) } thread1.join thread2.join
The magic:
@[Link(ldflags: "-pthread")] lib ThreadLib type Thread_t = UInt64 type Thread_attr_t = Int64 fun pthread_create(lpThread_id : Threadt, attr : Thread_attrt, callback : Void* -> Void, arg : Void) : Int64 fun pthread_join(thread_id : Thread_t, value_ptr : Void**) : Int64 end
class ThreadException < Exception; end
def runcode(arg, &block); yield arg; end def runcode(&block); yield; end macro threaded(&block) %success = ThreadLib.pthread_create(out %threadid, nil, ->(%param : Void) { runcode {{block}} return Pointer(Void).null }, Pointer(Void).null) raise ThreadException.new("Could not create thread") unless %success == 0 ThreadMgr.new %thread_id end macro threaded(param, &block) %success = ThreadLib.pthread_create(out %threadid, nil, ->(%param) { runcode (%param as typeof({{param}})).value {{block}} return Pointer(Void).null }, pointerof({{param.id}}) as Void*) raise ThreadException.new("Could not create thread") unless %success == 0 ThreadMgr.new %thread_id end
class ThreadMgr def initialize(thread_id) @thread_id = thread_id end def join ThreadLib.pthread_join(@thread_id, nil) end end
Unfortunately, loads of built-in Crystal functions are not thread-safe - indeed, they don't even work in native threads. For example, any attempt to puts to the console runs the risk of a could not get the current fiber exception from IO::FileDescriptor#wait_writable...
I guess this solution will work for computation and such; but large parts of Crystal will simply break if I try to thread them. The question is: Should I persevere with Crystal on this project, or switch to Go instead? I'm trying to write a backup server, and it has to be reliable. And more to the point, I don't want to get halfway finished and suddenly find myself unable to proceed without proper parallelism support.
For example, when a client connects, there will be a massive amount of checksumming and other math-related stuff happening; and I don't want to block any other work. How would I write a long-running mathematically-intensive computation to play nicely with other fibers? Throw a bunch of sleep(0)s in there?
— Reply to this email directly or view it on GitHub https://github.com/manastech/crystal/issues/1967#issuecomment-167261460.
You can't compute in threads, because you can't communicate back the result, or even just ask the question, since IO, Channel, etc are broken in threads.
A solution to your problem, now, could be to fork { ... }
and talk through an IO.pipe
or a UNIXSocket for example.
@ysbaddaden: Regarding Thread.new
, presumably I would have to use the sleep
kernel function, rather than the Crystal one. But in any event, there would be quite a large number of things I couldn't do - IO, for example, or thread synchronisation. So maybe it's just not feasible to use threads at this time, even thought it can actually be done.
fork
might be the answer, but I'm a bit uncertain how it works. Does it literally create two processes, just as if I ran the program twice to begin with? I guess I'd create the IO.pipe
first, and both processes would then have access to it because it was created before the fork took place?
I don't have much experience with pipes - other than the type you use in a terminal, that is. Do you have an example of usage?
@waterlink: To write the program in two different languages, presumably I would have to produce two separate executables, run them both, and somehow have them talk to each other? That seems highly inefficient, not to mention a colossal pain in the arse. If one part could load the other part as a DLL/SO or something, then maybe it would be a more sane plan; but afaik this is not possible.
@ysbaddaden: Wouldn't fork
open a can of worms though? I'd end up with potentially half a dozen copies of the executable running. If any of them crashed or were terminated, it would impact the master program which would continue waiting for messages from the dead process. Also, by definition, these forks would be running blocking code (else I could just use fibers), so communicating with them (or terminating them) in a timely manner would be difficult.
Then again, I guess I'd have similar problems with threads. If it's "bad form" to simply kill a thread, and instead I should ask it politely to terminate, then blocking calls will bite me in the arse there as well.
I guess the most important question is, then: How do I ensure that a fiber plays nice with others?
For example (pseudocode):
fiber {
while [condition]
[some complex maths]
end
}
Do I just stick a sleep(0)
call in the loop? Or is there a better way to avoid blocking other fibers?
@stugol sleep(0)
in your example should be the best bet.
On 2 languages, you are totally right, that is why I told If you want to use Crystal right now
, meaning like "badly want to use it and go out of your way to do something complex". On the other hand it could be a very interesting experience :)
You may Scheduler.yield
(I think), but this may be internal/private API, I'm not sure.
@waterlink having looked into it further, I'm afraid Go isn't going to do. It lacks overloading AND generics, meaning it's essentially useless. You need at least one of those (IMO) to get anything done at all. Opinions aside, it completely violates the DRY principle, as can be seen here.
It also lacks nullable variables, extension methods and a ternary operator. You can't even use an if
as a ternary:
result = if a then b else c // error!
Pish!
@stugol Haha, that is just wonderful how you discovered it so quickly. I can only agree. Though it does have good concurrency model; honestly, concurrency model is the only thing I like in Go. So if you need very fast, concurrent, parallelizable simple application, then Go is pretty decent choice. If you anticipate a bigger codebase size, or more rapid change of the codebase, it is not the best choice. You can try Scala with Akka actors (or Akka streams) in this case, I suppose. Or you could, again, make 2 applications, one that is critical to be fast and parallelizable (in any language of your choice that allows to make this job done, not necessary the language of your liking) and other part is for everything else in a language you like; making them communicate correctly is a good exercise :)
How do I create a thread in Crystal? I have code that needs to run in parallel. The source file for the Thread class says not to use it; but the alternative is not clear. Also, I am uncertain if the alternatives will actually be helpful in this case. All I want to do is normal multithreading. How is this done using Crystal?