PlummersSoftwareLLC / Primes

Prime Number Projects in C#/C++/Python
https://plummerssoftwarellc.github.io/PrimeView/

dart: updated exit strategy in worker pool #811

Closed mmcdon20 closed 2 years ago

mmcdon20 commented 2 years ago

Description

Possible solution to #809. Dart 2.15 changed how isolates work, so this PR updates the code to use the new Isolate.exit function.
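For readers unfamiliar with it, here is a minimal sketch of the Isolate.exit pattern that became available in Dart 2.15. It is not the code from this PR: the `sumBelow` worker, the `runWorker` helper, and the list-shaped message are hypothetical names chosen for illustration only.

```dart
import 'dart:isolate';

// Hypothetical worker (not from the PR): sums the integers below `limit`
// and hands the result back by terminating itself with Isolate.exit.
void sumBelow(List<Object> args) {
  final replyTo = args[0] as SendPort;
  final limit = args[1] as int;
  var total = 0;
  for (var i = 0; i < limit; i++) {
    total += i;
  }
  // Sends `total` as the final message and ends this isolate.
  Isolate.exit(replyTo, total);
}

// Hypothetical helper: spawns one worker and waits for its single reply.
Future<int> runWorker(int limit) async {
  final port = ReceivePort();
  await Isolate.spawn(sumBelow, [port.sendPort, limit]);
  // Taking `first` also closes the port once the reply has arrived.
  return await port.first as int;
}

Future<void> main() async {
  print(await runWorker(1000)); // prints 499500
}
```

Because the worker terminates as part of sending its result, the main isolate does not need a separate "stop" message or a kill call for it.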


rbergen commented 2 years ago

@mmcdon20 I'll test your branch on the 5950X before continuing the review of this PR.

rbergen commented 2 years ago

@mmcdon20 Looks like we're not out of the woods yet. I've done some good old stdout.writeln debugging, and it looks like the hang occurs at the await Future.delayed.

I've decorated the main function in PrimeDartParallel.dart as follows:

Future<void> main() async {
  final processors = Platform.numberOfProcessors;
  final pool = await WorkerPool.init(numberOfWorkers: processors);
  stdout.writeln('After WorkerPool.init');
  final timer = Stopwatch()..start();

  pool.broadcast(const Start(work: work));
  stdout.writeln('After pool.broadcast Start');
  await Future.delayed(const Duration(seconds: 5));
  stdout.writeln('After Future.delayed');
  pool.broadcast(const Stop());
  stdout.writeln('After pool.broadcast Stop');

  timer.stop();

  stdout.writeln('After timer.stop');
  final duration = timer.elapsedMicroseconds / 1000000;
  final passes = await pool.passes();
  stdout.writeln(
      'eagerestwolf&mmcdon20_8bit_par;$passes;$duration;$processors;algorithm=base,faithful=yes,bits=8');
}

If I start a timed docker run of only PrimeDartParallel and kill the main PrimeDartParallel process after about a minute, this is the output I get:

$ time docker run dart1
After WorkerPool.init
After pool.broadcast Start

real    0m58.492s
user    0m0.014s
sys     0m0.028s

As you can see, the last line written to stdout is the one before await Future.delayed.

mmcdon20 commented 2 years ago

That is very surprising.

Does it hang if the program is just this?

Future<void> main() async {
  print('Start');
  await Future.delayed(const Duration(seconds: 5));
  print('Stop');
}

rbergen commented 2 years ago

That does work:

$ time docker run dart1
Start
Stop

real    0m5.462s
user    0m0.020s
sys     0m0.020s

mmcdon20 commented 2 years ago

@rbergen I just pushed a rewrite of the parallel runners; maybe this implementation will work better.

rbergen commented 2 years ago

This yields the following result:

$ time docker run dart1
eagerestwolf&mmcdon20_8bit_par;66345;5.010128;32;algorithm=base,faithful=yes,bits=8

real    0m10.460s
user    0m0.020s
sys     0m0.021s

In other words: the PrimeDartParallel.dart implementation does finish executing now. What I find interesting is that its actual run time seems to be about double the 5 seconds it reports.

mmcdon20 commented 2 years ago

I changed the way the timing is measured. Each isolate runner now times itself and sends back its own time, and I print the maximum of those times in the output. The discrepancy might be due to the additional time it takes to spawn the isolates and combine the results.
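The measurement scheme described above can be sketched roughly as follows. This is an illustration under assumptions, not the PR's implementation: `timedRunner`, `runTimed`, `Summary`, and the `[passes, microseconds]` message shape are hypothetical, and the loop body is a placeholder for a real sieve pass. The sketch also records wall-clock time in the main isolate, which includes spawning and result collection and so can exceed the reported maximum.

```dart
import 'dart:isolate';
import 'dart:math' as math;

// Hypothetical summary type, for illustration only.
class Summary {
  final int totalPasses;
  final double reportedSeconds; // maximum self-measured runner time
  final double wallSeconds; // measured in the main isolate
  const Summary(this.totalPasses, this.reportedSeconds, this.wallSeconds);
}

// Hypothetical runner: times itself and returns [passes, microseconds].
void timedRunner(List<Object> args) {
  final replyTo = args[0] as SendPort;
  final budget = Duration(milliseconds: args[1] as int);
  final timer = Stopwatch()..start();
  var passes = 0;
  while (timer.elapsed < budget) {
    passes++; // placeholder for one unit of real work
  }
  timer.stop();
  Isolate.exit(replyTo, <int>[passes, timer.elapsedMicroseconds]);
}

Future<List<int>> _runOne(int budgetMs) async {
  final port = ReceivePort();
  await Isolate.spawn(timedRunner, [port.sendPort, budgetMs]);
  return await port.first as List<int>;
}

Future<Summary> runTimed(int workers, int budgetMs) async {
  final wall = Stopwatch()..start();
  final results = await Future.wait(
      [for (var i = 0; i < workers; i++) _runOne(budgetMs)]);
  wall.stop();
  final passes = results.fold<int>(0, (sum, r) => sum + r[0]);
  final maxMicros = results.map((r) => r[1]).reduce(math.max);
  return Summary(passes, maxMicros / 1e6, wall.elapsedMicroseconds / 1e6);
}

Future<void> main() async {
  final s = await runTimed(4, 200);
  print('${s.totalPasses};${s.reportedSeconds};wall=${s.wallSeconds}');
}
```

Comparing `reportedSeconds` with `wallSeconds` is one way to see how much of the gap comes from isolate startup and result collection rather than from the timed work itself.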

rbergen commented 2 years ago

I've just performed a benchmark run of the updated solution:

$ make DIRECTORY=PrimeDart
make[1]: Entering directory '/home/rutger/Primes/tools/node_modules/node-uname/build'
  CC(target) Release/obj.target/uname/uname.o
  SOLINK_MODULE(target) Release/obj.target/uname.node
  COPY Release/uname.node
make[1]: Leaving directory '/home/rutger/Primes/tools/node_modules/node-uname/build'
added 231 packages in 2.224s
info: Unconfined mode: false
info: Detected architecture: amd64
info: [PrimeDart][solution_1] Building...
info: [PrimeDart][solution_1] Running...
                                                               Single-threaded
┌───────┬────────────────┬──────────┬────────────────────────────┬────────┬──────────┬─────────┬───────────┬──────────┬──────┬───────────────┐
│ Index │ Implementation │ Solution │ Label                      │ Passes │ Duration │ Threads │ Algorithm │ Faithful │ Bits │ Passes/Second │
├───────┼────────────────┼──────────┼────────────────────────────┼────────┼──────────┼─────────┼───────────┼──────────┼──────┼───────────────┤
│   1   │ dart           │ 1        │ eagerestwolf&mmcdon20_8bit │  6678  │ 5.00050  │    1    │   base    │   yes    │ 8    │  1335.46645   │
│   2   │ dart           │ 1        │ eagerestwolf&mmcdon20_1bit │  5757  │ 5.00033  │    1    │   base    │   yes    │ 1    │  1151.32332   │
└───────┴────────────────┴──────────┴────────────────────────────┴────────┴──────────┴─────────┴───────────┴──────────┴──────┴───────────────┘
                                                                  Multi-threaded
┌───────┬────────────────┬──────────┬────────────────────────────────┬────────┬──────────┬─────────┬───────────┬──────────┬──────┬───────────────┐
│ Index │ Implementation │ Solution │ Label                          │ Passes │ Duration │ Threads │ Algorithm │ Faithful │ Bits │ Passes/Second │
├───────┼────────────────┼──────────┼────────────────────────────────┼────────┼──────────┼─────────┼───────────┼──────────┼──────┼───────────────┤
│   1   │ dart           │ 1        │ eagerestwolf&mmcdon20_1bit_par │ 144800 │ 5.00162  │   32    │   base    │   yes    │ 1    │   904.70615   │
│   2   │ dart           │ 1        │ eagerestwolf&mmcdon20_8bit_par │ 66589  │ 5.00639  │   32    │   base    │   yes    │ 8    │   415.65038   │
└───────┴────────────────┴──────────┴────────────────────────────────┴────────┴──────────┴─────────┴───────────┴──────────┴──────┴───────────────┘
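The Passes/Second column in both tables appears to be normalized per thread, that is, passes divided by duration times thread count. The sketch below checks that reading against two rows; the `passesPerSecond` helper is a hypothetical name, and the formula is inferred from the figures above rather than taken from the benchmark tooling. Small differences in the later decimals are expected because the displayed durations are rounded.

```dart
// Hypothetical helper: per-thread throughput as inferred from the tables.
double passesPerSecond(int passes, double durationSeconds, int threads) =>
    passes / (durationSeconds * threads);

void main() {
  // Single-threaded 8-bit row: table shows 1335.46645.
  print(passesPerSecond(6678, 5.00050, 1).toStringAsFixed(2));
  // Multi-threaded 1-bit row: table shows 904.70615.
  print(passesPerSecond(144800, 5.00162, 32).toStringAsFixed(2));
}
```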

rbergen commented 2 years ago

Considering that the solution does finish, I am ok with re-enabling the parallel implementations in the benchmark. Feel free to update the Dockerfile accordingly.

mmcdon20 commented 2 years ago

@rbergen I have re-enabled the parallel runners in the Dockerfile.

rbergen commented 2 years ago

Looks good to me. I'll try to keep an eye on tomorrow's first full benchmark run that includes these updates.