Closed: mmcdon20 closed this pull request 2 years ago
@mmcdon20 I'll test your branch on the 5950X before continuing the review of this PR.
@mmcdon20 Looks like we're not out of the woods yet. I've done some good old `stdout.writeln` debugging, and it looks like the hang occurs at the `await Future.delayed`.
I've decorated the `main` function in PrimeDartParallel.dart as follows:
```dart
Future<void> main() async {
  final processors = Platform.numberOfProcessors;
  final pool = await WorkerPool.init(numberOfWorkers: processors);
  stdout.writeln('After WorkerPool.init');
  final timer = Stopwatch()..start();
  pool.broadcast(const Start(work: work));
  stdout.writeln('After pool.broadcast Start');
  await Future.delayed(const Duration(seconds: 5));
  stdout.writeln('After Future.delayed');
  pool.broadcast(const Stop());
  stdout.writeln('After pool.broadcast Stop');
  timer.stop();
  stdout.writeln('After timer.stop');
  final duration = timer.elapsedMicroseconds / 1000000;
  final passes = await pool.passes();
  stdout.writeln(
      'eagerestwolf&mmcdon20_8bit_par;$passes;$duration;$processors;algorithm=base,faithful=yes,bits=8');
}
```
If I start a timed Docker run of only PrimeDartParallel and kill the main PrimeDartParallel process after about a minute, this is the output I get:
```
$ time docker run dart1
After WorkerPool.init
After pool.broadcast Start

real	0m58.492s
user	0m0.014s
sys	0m0.028s
```
As you can see, the last line written to stdout is the one before the `await Future.delayed`.
That is very surprising.
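For context, one way an `await Future.delayed` can appear to hang in Dart is if the main isolate's event loop is starved by synchronous work, so the timer callback never gets a chance to run. This is only an illustration of that general mechanism, not a claim about the actual cause here; the busy-wait loop below is purely a stand-in for blocking work:

```dart
import 'dart:async';

Future<void> main() async {
  print('Start');
  // Schedule a 1-second delay, then hog the isolate with synchronous work.
  // The timer cannot fire while the event loop is blocked.
  final delayed =
      Future.delayed(const Duration(seconds: 1), () => print('Delay fired'));
  final sw = Stopwatch()..start();
  while (sw.elapsedMilliseconds < 3000) {} // busy-wait: blocks the event loop
  print('Loop done after ${sw.elapsedMilliseconds} ms');
  await delayed; // only now can the overdue timer callback run
}
```

Running this prints 'Start', then 'Loop done ...' after about 3 seconds, and only then 'Delay fired', even though the delay was nominally 1 second.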
Does it hang if the program is just this?
```dart
Future<void> main() async {
  print('Start');
  await Future.delayed(const Duration(seconds: 5));
  print('Stop');
}
```
That does work:
```
$ time docker run dart1
Start
Stop

real	0m5.462s
user	0m0.020s
sys	0m0.020s
```
@rbergen I've just pushed a rewrite of the parallel runners; maybe this implementation will work better.
This yields the following result:
```
$ time docker run dart1
eagerestwolf&mmcdon20_8bit_par;66345;5.010128;32;algorithm=base,faithful=yes,bits=8

real	0m10.460s
user	0m0.020s
sys	0m0.021s
```
In other words: the PrimeDartParallel.dart implementation does finish executing now. What I find interesting is that its actual run time is about double the 5 seconds it reports.
I changed the way the timing is measured. Each isolate runner now times itself and sends back its own elapsed time, and I print the maximum of those times in the output. The discrepancy might be due to the additional time it takes to spawn the isolates and combine the results.
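The per-isolate timing scheme described above can be sketched as follows. This is a simplified illustration, not the PR's actual code: it assumes Dart 2.19's `Isolate.run` for brevity, and `timedWork` is a hypothetical stand-in for one runner's benchmark loop:

```dart
import 'dart:isolate';

// Stand-in for one worker's benchmark loop: the isolate times itself
// and reports its own elapsed time in microseconds.
int timedWork() {
  final sw = Stopwatch()..start();
  var x = 0;
  for (var i = 0; i < 10000000; i++) {
    x += i; // dummy work in place of sieve passes
  }
  return sw.elapsedMicroseconds;
}

Future<void> main() async {
  const workers = 4;
  // Spawn the workers and collect each one's self-reported time.
  final times = await Future.wait(
      List.generate(workers, (_) => Isolate.run(timedWork)));
  // Report the maximum of the per-isolate times, as described above.
  final maxMicros = times.reduce((a, b) => a > b ? a : b);
  print('max worker time: ${maxMicros / 1000000} s');
}
```

Because each worker only measures its own loop, isolate spawn and result-collection overhead in `main` is excluded from the reported duration, which would explain a wall-clock time longer than the reported one.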
I've just performed a benchmark run of the updated solution:
```
$ make DIRECTORY=PrimeDart
make[1]: Entering directory '/home/rutger/Primes/tools/node_modules/node-uname/build'
  CC(target) Release/obj.target/uname/uname.o
  SOLINK_MODULE(target) Release/obj.target/uname.node
  COPY Release/uname.node
make[1]: Leaving directory '/home/rutger/Primes/tools/node_modules/node-uname/build'
added 231 packages in 2.224s
info: Unconfined mode: false
info: Detected architecture: amd64
info: [PrimeDart][solution_1] Building...
info: [PrimeDart][solution_1] Running...

Single-threaded
┌───────┬────────────────┬──────────┬────────────────────────────┬────────┬──────────┬─────────┬───────────┬──────────┬──────┬───────────────┐
│ Index │ Implementation │ Solution │ Label                      │ Passes │ Duration │ Threads │ Algorithm │ Faithful │ Bits │ Passes/Second │
├───────┼────────────────┼──────────┼────────────────────────────┼────────┼──────────┼─────────┼───────────┼──────────┼──────┼───────────────┤
│ 1     │ dart           │ 1        │ eagerestwolf&mmcdon20_8bit │ 6678   │ 5.00050  │ 1       │ base      │ yes      │ 8    │ 1335.46645    │
│ 2     │ dart           │ 1        │ eagerestwolf&mmcdon20_1bit │ 5757   │ 5.00033  │ 1       │ base      │ yes      │ 1    │ 1151.32332    │
└───────┴────────────────┴──────────┴────────────────────────────┴────────┴──────────┴─────────┴───────────┴──────────┴──────┴───────────────┘

Multi-threaded
┌───────┬────────────────┬──────────┬────────────────────────────────┬────────┬──────────┬─────────┬───────────┬──────────┬──────┬───────────────┐
│ Index │ Implementation │ Solution │ Label                          │ Passes │ Duration │ Threads │ Algorithm │ Faithful │ Bits │ Passes/Second │
├───────┼────────────────┼──────────┼────────────────────────────────┼────────┼──────────┼─────────┼───────────┼──────────┼──────┼───────────────┤
│ 1     │ dart           │ 1        │ eagerestwolf&mmcdon20_1bit_par │ 144800 │ 5.00162  │ 32      │ base      │ yes      │ 1    │ 904.70615     │
│ 2     │ dart           │ 1        │ eagerestwolf&mmcdon20_8bit_par │ 66589  │ 5.00639  │ 32      │ base      │ yes      │ 8    │ 415.65038     │
└───────┴────────────────┴──────────┴────────────────────────────────┴────────┴──────────┴─────────┴───────────┴──────────┴──────┴───────────────┘
```
Considering that the solution does finish, I am ok with re-enabling the parallel implementations in the benchmark. Feel free to update the Dockerfile accordingly.
@rbergen I have re-enabled the parallel runners in the Dockerfile.
Looks good to me. I'll try to keep an eye on the first full benchmark run with these updates in, tomorrow.
Description
Possible solution to #809. Dart 2.15 made changes to how isolates work; this updates the code to use the new `Isolate.exit` call.

Contributing requirements
- `drag-race` is selected as the target branch.
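The `Isolate.exit` mechanism the description refers to can be sketched as below. This is a minimal standalone example, not the PR's actual worker code; `worker` and its payload are hypothetical:

```dart
import 'dart:isolate';

// Worker entry point: compute a result and hand it to the parent via
// Isolate.exit, which transfers the final message instead of copying it.
void worker(SendPort resultPort) {
  final result = List.generate(5, (i) => i * i);
  Isolate.exit(resultPort, result); // terminates this isolate
}

Future<void> main() async {
  final port = ReceivePort();
  await Isolate.spawn(worker, port.sendPort);
  // The first message arrives when the worker calls Isolate.exit.
  final result = await port.first;
  print(result); // [0, 1, 4, 9, 16]
}
```

Added in Dart 2.15, `Isolate.exit` lets a short-lived worker isolate return its result to the spawning isolate cheaply, since the final message is passed by reference rather than deep-copied.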