This issue reminds me of @mrocklin's question on StackOverflow: http://stackoverflow.com/questions/12415316/dead-simple-fork-join-concurrency-in-clojure
which is similar to your second example. What use cases aren't covered by this variant?
This is an interesting example which may help show some of the edges of what MacroPy can and cannot do.
I'm not that familiar with Clojure, but if that is what I think it is, it's basically exactly what I'm proposing here, except...
The second bit is important; on the JVM, you can simply close over lexical scope and pass the closure to other threads to munch on before returning the result. In Python, because of the GIL, that's not possible, so we'll have to multiprocess. That would mean we'd probably want to explicitly delimit which parts are closed over (the u% markers above) and so need to be pickled, transmitted, and unpickled, and which parts are taken from the destination process's scope (everything else).
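For concreteness, here's roughly the plumbing such a macro would have to generate, hand-expanded with nothing but the stdlib (forked_body and the captured values are stand-ins for whatever the macro would actually emit):

```python
import pickle
from multiprocessing import Pool

value1, value2 = 6, 7  # the locals a u% marker would capture

def forked_body(payload):
    # Runs in the worker process: unpickle the captured values,
    # then evaluate the forked expression against them.
    a, b = pickle.loads(payload)
    return a * b

if __name__ == "__main__":
    payload = pickle.dumps((value1, value2))  # pickled before transmission
    with Pool(processes=1) as pool:
        print(pool.apply_async(forked_body, (payload,)).get())  # 42
```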
Other than that, it's basically the same idea. You can do it in Scala, you can do it in Clojure; I'd love to be able to do it in Python.
Ah, there are actually two cases of interest: multiprocessing (forks) and threading. Example macros handling both would be of interest, I'd expect.
I didn't actually consider "actual" forking to be a real possibility; I probably should have made that clear =D. I meant task/fiber/goroutine/etc. style forking, implemented using Erlang-style M:N scheduling to run on a thread-pool or process-pool. OS threads and processes are waaaaay too heavyweight to go around forking them!
I'm still doubtful of the utility of multithreading with the GIL; most of the cases where the GIL isn't a problem (e.g. blocking web requests) are probably better handled with asynchrony, since there's no sense keeping a bunch of threads blocked if you can help it.
Of course, all that stuff may be a premature optimization, since if we wanted maximum CPU utilization we probably wouldn't be using Python anyway.
Multithreading in Python can be very useful and leads to quite simple code in certain circumstances -- much simpler than event-driven / async, IMO. The most obvious example is where you have multiple things which have to happen concurrently but which depend very little on each other, or not at all -- for example, when you need to finish some computation that is needed "later", but the main program decides when to pick it up (rather than handling a "done" event). Clojure's futures are an expression of this common idiom. The really bad case with the GIL is where one thread is CPU-bound and one I/O-bound -- a common enough case, but by no means the only relevant one.
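For what it's worth, the stdlib's concurrent.futures expresses exactly that "pick it up later" idiom; a minimal sketch:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def slow_io_task():
    time.sleep(1)  # stands in for a blocking request; the GIL is released here
    return 42

with ThreadPoolExecutor() as pool:
    future = pool.submit(slow_io_task)  # starts running concurrently
    # ... main program carries on, and decides when to collect the result ...
    print(future.result())  # blocks only here, much like deref-ing a Clojure future
```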
Processes are indeed heavyweight; threads are super-lightweight in comparison, by about a factor of 100 (see http://eigenhombre.com/2013/04/19/processes-vs-threads-for-testing/, unfortunately down at the moment due to problems with GitHub Pages).
In any case, if I get some time I might try to take a stab at writing a macro similar to the one in the SO question under discussion, if only to learn the limits of what MacroPy can do -- great work, by the way!!!
I meant async in the implementation, not the interface! E.g. instead of forking 10 threads, send 10 tasks off onto our thread pool and wait for them to come back. Tasks would be 100 times lighter than threads, which are 100 times lighter than processes; ideally we wouldn't be forking any kernel resources and all of this would be user-land scheduling.
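A sketch of that shape using a stdlib pool (a real M:N scheduler would be lighter still, but the interface is the same):

```python
from concurrent.futures import ThreadPoolExecutor

def task(i):
    return i * i

# Ten logical tasks multiplexed onto four workers: the "fork" is a
# user-land scheduling decision, not ten new kernel threads.
with ThreadPoolExecutor(max_workers=4) as pool:
    print(list(pool.map(task, range(10))))
```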
I think that for shared-memory thread-pool fork-join you don't need macros; all you need is a nice syntax for thunks (which you can easily send to other threads, because they all live in the same memory space). For basic multithreaded evaluation of expressions, a normal function to which you pass f% quick lambdas should be sufficient.
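E.g. something like this should work already, assuming MacroPy is activated first as its docs require (macro-using code has to live in a module imported after import macropy.activate):

```python
# In a module imported after `import macropy.activate`:
from concurrent.futures import ThreadPoolExecutor
from macropy.quick_lambda import macros, f, _

with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(f%(_ + 1), [1, 2, 3]))  # f%(_ + 1) is a quick lambda
    print(results)  # [2, 3, 4]
```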
Don't let me discourage you from giving it a shot though; it would be great if someone other than the two of us could give MacroPy a spin and tell us what you think =)
I'm going to close this issue; after two weeks of inactivity, I don't expect someone to step up and present a solution any time soon, and I personally am not that excited about it anymore (the multiprocessing module is pretty good!) so I probably won't spend the time to take a shot at it. Feel free to give it a shot anyway, but I'd like to clean up the issue tracker =)
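For reference, the no-macros version of the basic fork-join case with that module:

```python
from multiprocessing import Pool

def work(n):
    return n * n  # CPU-bound work runs in separate processes, dodging the GIL

if __name__ == "__main__":
    with Pool() as pool:
        print(pool.map(work, range(10)))
```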
Although it is possible to grab the bytecode of a function and send it to a separate process to be run in parallel, macros would let us parallelize tasks on a sub-function level. Exactly how they are run (process pool, thread pool, remote machines) is completely abstracted away. Another syntax that looks nicer but only works for expressions may be:
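A sketch (fork% and expensive_step are placeholder names, not anything real):

```python
# fork% would evaluate its expression on a worker and hand back
# a future-like handle to join on later.
total = fork%(expensive_step(x) + expensive_step(y))
```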
Macros could also give us a concise way of emulating "lexical scoping", by delimiting which parts of the forked code are to be evaluated locally:
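One sketch consistent with the description below (fork% and combine are again placeholders):

```python
# u% marks sub-expressions to evaluate in the *local* scope; their
# values are shipped to the forked task along with the code.
result = fork%(combine(u%value1, u%value2))
```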
In this case, the u% syntax indicates that value1 and value2 are to be evaluated in the local scope and their values sent over to the forked task, rather than sending over the names value1 or value2, which may not exist in the scope where the forked task is executing. There are probably a bunch of other clever things you could do with macros, along with a bunch of pitfalls to do with concurrency/parallelism, neither of which has been thought through.