simonlast closed 12 years ago
Thanks for the info. I've sent a request to Franz for 9.0 Express Edition (if it exists yet).
What platform is this? And could you tell me the test it failed on? It should print FOO-TEST for some FOO.
It turns out that Allegro 8.2 had been broken. It's now fixed -- perhaps 9.0 is also fixed?
It seems to work a bit better, but the process hangs during the final benchmark, PMATRIX-MUL. Here's a trace:
Thread 9 (process 62284):
Thread 8 (process 62284):
Thread 7 (process 62284):
Thread 6 (process 62284):
Thread 5 (process 62284):
Thread 4 (process 62284):
Thread 3 (process 62284):
Thread 2 (process 62284):
Thread 1 (process 62284):
It also hangs during the COGNATE-HANDLER-TEST, with a similar trace.
Based on the trace you posted, it appears that OS threads are not enabled (the threads are all on the same process). If so, the PMATRIX-MUL benchmark is probably not hanging, but merely taking a long time.
On my copy of Allegro 8.2 Express Edition, which does not have OS threads, the benchmark does eventually finish after about 20 minutes. This is OK because defpun serves no purpose without true multiprocessing available.
I suspect the slowdown is caused by the work-stealing loop (sync in Cilk), which can briefly spin in certain situations. However, for Allegro's green (non-OS) threads the spinning is not brief; it takes over the whole process for long periods of time, effectively pausing the computation.
Perhaps for Lisps without SMP, defpun should expand to defun and (inside defpun) plet to let. Though I wonder if non-SMP Lisps would be used with lparallel in the first place.
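The fallback proposed above could look roughly like the following. This is a minimal sketch, not lparallel's actual implementation. The macro name serial-defpun is hypothetical, and the sketch assumes the function body has no docstring or declarations, since those would end up inside the macrolet form.

```lisp
;; Hypothetical serial fallback for Lisps without SMP (an assumption,
;; not code from lparallel): the function is defined with plain DEFUN,
;; and any PLET inside the body is rewritten to an ordinary LET.
(defmacro serial-defpun (name lambda-list &body body)
  `(defun ,name ,lambda-list
     (macrolet ((plet (bindings &body plet-body)
                  ;; Evaluate the bindings sequentially instead of in parallel.
                  `(let ,bindings ,@plet-body)))
       ,@body)))

;; Usage: a recursive Fibonacci written in the style of a defpun function.
(serial-defpun fib (n)
  (if (< n 2)
      n
      (plet ((a (fib (- n 1)))
             (b (fib (- n 2))))
        (+ a b))))

(fib 10) ; => 55
```

Because everything runs on the calling thread, the work-stealing loop is never entered, so there is nothing to spin on with green threads.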
COGNATE-HANDLER-TEST may be unrelated. Allegro 8.2 has no problem with (loop (lparallel-test::cognate-handler-test)). I'm waiting for Franz to reply to my 9.0 request.
I think OS threads must be enabled, because initially the program uses more than one processor. During the PMATRIX-MUL benchmark, however, processor usage drops to 0, so I don't think it is simply taking a long time.
Maybe there is some explanation for why your trace shows waiting threads attached to the same process. What does (find :os-threads *features*) say?
In any case, I can't do much without being able to run 9.0. (Franz hasn't responded to my request yet.) Who knows, maybe bordeaux-threads needs updating for 9.0.
(find :os-threads *features*) says :OS-THREADS
With the following patch for bordeaux-threads, Allegro 9.0beta SMP successfully passes all lparallel tests. I'll submit it once I get confirmation from Franz.
diff --git a/src/impl-allegro.lisp b/src/impl-allegro.lisp
index 144ee98..102200c 100644
--- a/src/impl-allegro.lisp
+++ b/src/impl-allegro.lisp
@@ -41,12 +41,12 @@ Distributed under the MIT license (see LICENSE file)
 (defun condition-wait (condition-variable lock)
   (release-lock lock)
-  (mp:process-wait "wait for message" #'mp:gate-open-p condition-variable)
-  (acquire-lock lock)
-  (mp:close-gate condition-variable))
+  (unwind-protect
+       (mp:get-semaphore condition-variable)
+    (acquire-lock lock)))

 (defun condition-notify (condition-variable)
-  (mp:open-gate condition-variable))
+  (mp:put-semaphore condition-variable))

 (defun thread-yield ()
   (mp:process-allow-schedule))
I added that patch to bordeaux-threads and cloned the latest lparallel, but I am again getting the same error as before, on both the benchmarks and the tests: Debug: Attempt to do an array operation on 0 which is not an array.
What is your lisp-implementation-version? Tests and benchmarks run fine under
9.0.pre-final.18 [Linux (x86) *SMP*] (Jun 6, 2012 8:36)
9.0.beta.21 [64-bit Mac OS X (Intel) *SMP*] (Apr 12, 2012 16:53)
Perhaps this version is too old?
Getting the latest can't hurt. Unfortunately I don't have a Mac that can run Allegro 9.0, at least currently.
I don't know if this is related to your problem, but there is an Allegro 9.0 bug affecting bordeaux-threads which is currently unresolved. It is mentioned at the end of http://lists.common-lisp.net/pipermail/bordeaux-threads-devel/2012-June/000204.html
After applying the patches in the link, see if (loop (5am:debug! 'bordeaux-threads-test::stress-test)) hangs or produces an error. If you would rather not wait for Franz to fix it, you could try removing the *thread-results* hash in impl-allegro.lisp in bordeaux-threads.
I could upgrade my Mac to 10.6, but I would prefer knowing beforehand that a problem still exists. If the very latest Allegro 9.0 for Mac still fails with the above patch and the following patch, then I'll know.
diff --git a/src/impl-allegro.lisp b/src/impl-allegro.lisp
index d9ea53b..0432690 100644
--- a/src/impl-allegro.lisp
+++ b/src/impl-allegro.lisp
@@ -55,18 +55,8 @@ Distributed under the MIT license (see LICENSE file)
 (defun start-multiprocessing ()
   (mp:start-scheduler))

-(defvar *thread-results* (make-hash-table :weak-keys t))
-
-(defvar *thread-join-lock* (make-lock "Bordeaux threads join lock"))
-
 (defun %make-thread (function name)
-  (mp:process-run-function
-   name
-   (lambda ()
-     (let ((result (funcall function)))
-       (with-lock-held (*thread-join-lock*)
-         (setf (gethash (current-thread) *thread-results*)
-               result))))))
+  (mp:process-run-function name function))

 (defun current-thread ()
   mp:*current-process*)
@@ -102,10 +92,6 @@ Distributed under the MIT license (see LICENSE file)
 (defun join-thread (thread)
   (mp:process-wait (format nil "Waiting for thread ~A to complete" thread)
                    (complement #'mp:process-alive-p)
-                   thread)
-  (with-lock-held (*thread-join-lock*)
-    (prog1
-        (gethash thread *thread-results*)
-        (remhash thread *thread-results*))))
+                   thread))

 (mark-supported)
With those 3 patches, I still get the same errors. I'm going to try to get the latest ACL soon.
Any word on this? I am currently unable to test 9.0 SMP because my beta license has expired. The latest bordeaux-threads in the repository uses built-in condition variables, which should be more robust.
I'll forward this to someone who could test it. I no longer have Allegro Lisp.
Hi lmj, Simon was an intern we had investigating SMP support in ACL 9.0 over the summer. Using the versions of lparallel and bordeaux-threads on github, all lparallel tests pass on x64 ACL 9.0 on Linux and OSX Mountain Lion. This wasn't the case when Simon was investigating lparallel. The versions installed via quicklisp still segfault when running the lparallel test suite. I've also used lparallel in a few places experimentally, and everything has worked well using the code on github.
Thanks for your great work, I really like the library! Andrew
Thanks for the update. I had planned on contacting Franz once quicklisp got the new bordeaux-threads (less hassle for them), but it looks resolved now.
I'm trying to run the tests and benchmarks in Allegro 9.0, but I am getting several errors.
For the tests: Debug: Attempt to do an array operation on 0 which is not an array.
For the benchmarks: Error: No methods applicable for generic function # with args (NIL) of classes (NULL)