philburk / pforth

Portable Forth in C
BSD Zero Clause License

benchmarks need updating for today's faster processors #130

Open mschwartz opened 1 year ago

mschwartz commented 1 year ago

Wow. I'm not sure how the benchmarks in bench.fth are meant to be run, so I added a line at the end of the file to call one of them and timed the run with the Unix time command, like this:

time ./build/unix/pforth_standalone fth/bench.fth

So I edited bench.fth and added to the end of the file:

cr ." bench1" cr
BENCH1

Then I edited the file again to call bench2, and so on.

All of this is on my M1 Max MacBook Pro. Every run finishes in under 0.1 seconds. Amazing how slow the machines you used to use to benchmark pForth! :)

bench1:

time ./build/unix/pforth_standalone fth/bench.fth
PForth V28-LE/64, built Dec 11 2022 11:52:21 (static)

Including: fth/bench.fth

bench1
./build/unix/pforth_standalone fth/bench.fth  0.03s user 0.00s system 98% cpu 0.036 total

bench2:

time ./build/unix/pforth_standalone fth/bench.fth
PForth V28-LE/64, built Dec 11 2022 11:52:21 (static)

Including: fth/bench.fth

bench2
./build/unix/pforth_standalone fth/bench.fth  0.05s user 0.00s system 98% cpu 0.050 total

bench3:

time ./build/unix/pforth_standalone fth/bench.fth
PForth V28-LE/64, built Dec 11 2022 11:52:21 (static)

Including: fth/bench.fth

bench3
./build/unix/pforth_standalone fth/bench.fth  0.05s user 0.00s system 93% cpu 0.050 total

bench4:

time ./build/unix/pforth_standalone fth/bench.fth
PForth V28-LE/64, built Dec 11 2022 11:52:21 (static)

Including: fth/bench.fth

bench4
./build/unix/pforth_standalone fth/bench.fth  0.04s user 0.00s system 97% cpu 0.039 total

bench5:

time ./build/unix/pforth_standalone fth/bench.fth
PForth V28-LE/64, built Dec 11 2022 11:52:21 (static)

Including: fth/bench.fth

bench5
./build/unix/pforth_standalone fth/bench.fth  0.05s user 0.00s system 98% cpu 0.054 total

sieve:

time ./build/unix/pforth_standalone fth/bench.fth
PForth V28-LE/64, built Dec 11 2022 11:52:21 (static)

Including: fth/bench.fth

sieve cr10 iterations 
1899 primes 
./build/unix/pforth_standalone fth/bench.fth  0.01s user 0.00s system 94% cpu 0.016 total

mschwartz commented 1 year ago

I wonder whether the time results mostly reflect loading and starting the pforth_standalone executable, before the interpreter even begins running.

mschwartz commented 1 year ago

Or maybe not. I removed the benchmark calls from bench.fth entirely, so the run only loads the file:

time ./build/unix/pforth_standalone fth/bench.fth
PForth V28-LE/64, built Dec 11 2022 11:52:21 (static)

Including: fth/bench.fth
./build/unix/pforth_standalone fth/bench.fth  0.00s user 0.00s system 79% cpu 0.003 total

philburk commented 1 year ago

Amazing how slow the machines you used to use to benchmark pForth! :)

I first started writing Forth compilers on VAX and Amiga that ran at one or two MIPS. Then I wrote pForth on an 80 MHz Pentium. The benchmarks were written for the Amiga JForth. JForth would compile down to native 68000 instructions so it was very fast.

I was worried about losing performance in pForth because it uses a C based inner interpreter. I ran the sieve a few years ago and pForth was 1000X faster than the Amiga. So I don't worry about performance much any more.

But I should update the benchmarks so that they are more meaningful on today's systems. Even a Raspberry Pi is much faster than the Amiga.

I like the idea of using 'time'. Does it handle the fact that modern CPUs do frequency scaling so performance can vary widely depending on the system load? Benchmarks can actually run faster on a heavily loaded system because the CPUs are revved up.
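
For context, time only reports the elapsed and CPU time of a single run; it does nothing to compensate for frequency scaling. One common mitigation is to do an untimed warm-up run and then keep the fastest of several timed repetitions. A minimal sketch in C, assuming a POSIX clock_gettime; count_primes is a hypothetical stand-in workload, not one of the bench.fth benchmarks:

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for clock_gettime */
#endif
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Monotonic wall-clock time in nanoseconds. */
static uint64_t now_ns(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Hypothetical workload: count the primes below `limit` by trial division. */
static int count_primes(int limit) {
    int count = 0;
    for (int n = 2; n < limit; n++) {
        int is_prime = 1;
        for (int d = 2; d * d <= n; d++) {
            if (n % d == 0) { is_prime = 0; break; }
        }
        count += is_prime;
    }
    return count;
}

typedef int (*workload_fn)(int);

/* Run fn(arg) once untimed as a warm-up, then `runs` timed repetitions.
 * Returns the fastest repetition in nanoseconds; the workload's last
 * result is stored in *result_out so the calls cannot be optimized away. */
static uint64_t best_of(workload_fn fn, int arg, int runs, int *result_out) {
    assert(runs > 0);
    int result = fn(arg);              /* warm-up: gives the CPU time to ramp up */
    uint64_t best = UINT64_MAX;
    for (int i = 0; i < runs; i++) {
        uint64_t t0 = now_ns();
        result = fn(arg);
        uint64_t elapsed = now_ns() - t0;
        if (elapsed < best) best = elapsed;
    }
    if (result_out) *result_out = result;
    return best;
}
```

Calling best_of(count_primes, 100000, 10, &result) from a main function gives one number per benchmark. Taking the minimum rather than the mean is a design choice: it reports the least-disturbed run, which tends to be more repeatable across different system loads.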

mschwartz commented 1 year ago

I got my first Amiga in 1985 and contributed a few projects to the Fish collection. I wrote the debugger for Matt Dillon’s Dice C compiler, and probably a million lines of assembly language for various processors (video games).

I was well aware of JForth at the time. I think it was a brilliant project. While the source was Forth, I saw it as an uber powerful macro assembler.

I see a lot of JForth in pForth. Not the inline code generation, but things like save-forth and anew. I still see huge value in pForth.

What JForth had that pForth lacks is tight integration with AmigaOS. You could write virtually any kind of Amiga program with it. The interface to AllocMem and FreeMem alone is a really big deal!

While pForth can augment its C words with custom C functions, it’s a lot of work to bring it up to the level of JForth.

Modern programs need network APIs, JSON, XML, and async capabilities. And that’s just for headless server applications.

For GUI, you need access to all the structures and message passing required for Cocoa or Qt or whatever. Sounds like months or years of work!

Forth counted strings are limited to a length of 255, which is puny by today’s standards. JSON strings and XML text are almost always going to be longer.
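
The 255 limit comes from counted strings, which store the length in a single leading byte. Standard Forth also passes strings as an address and length pair on the stack (what S" returns), where the length is a full cell. A rough sketch of the two layouts in C; the function and type names are made up for illustration and are not pForth's:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Counted string: byte 0 holds the length, the characters follow.
 * One length byte means at most 255 characters.
 * Returns 0 on success, -1 if the text is too long or dst is too small. */
static int make_counted(const char *src, size_t len,
                        unsigned char *dst, size_t dst_size) {
    if (len > 255 || dst_size < len + 1) return -1;
    dst[0] = (unsigned char)len;
    memcpy(dst + 1, src, len);
    return 0;
}

/* Address-and-length string: the length is a full machine word,
 * so the 255-character ceiling does not apply. */
typedef struct {
    const char *addr;
    size_t      len;
} AddrLen;

static AddrLen make_addr_len(const char *src, size_t len) {
    assert(src != NULL || len == 0);
    AddrLen s = { src, len };
    return s;
}
```

A 300-character JSON fragment is rejected by make_counted but fits in an AddrLen without trouble, so code that sticks to address and length pairs sidesteps the limit.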

I’m not sure what has developed in the Forth world to deal with it all.

I have a lot of free time since I retired in April. Just getting around to one of my bucket list items - pForth!

philburk commented 1 year ago

I'm glad you liked JForth. It was designed to be a full featured IDE for the Amiga.

PForth was designed for testing ASICs and for running on minimal systems. PForth even has its own memory allocator for platforms with no malloc.

There are other full-featured Forths, which are great. But they often do too much and are hard to build for minimal systems that do not have a lot of OS support.

mschwartz commented 1 year ago

FWIW, I added some custom C functions a while back (to send/receive data via MQTT) and it worked fine. However, the scheme for adding these functions seems a bit clunky. I'd like to look at a way we can conditionally add things, as specified in the makefile.

As I remember it, adding the MQTT words was very MQTT-specific. I believe I had to add them in two places and more or less hard-code their indexes. I'm thinking there might be a way to automate this so that you can add "modules/packages" at compile time and it just works with no code modification.
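
One way the compile-time idea could look is a single table that pairs each word name with its C function, with optional modules switched on by a macro set in the makefile. The index of a word is then just its position in the table, so nothing is hard-coded twice. This is only a sketch: WITH_MQTT, WordEntry, and the stub functions are hypothetical, and it does not use pForth's real glue API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef long cell_t;                 /* hypothetical stand-in for a Forth cell */
typedef cell_t (*word_fn)(cell_t);   /* simplified: one cell in, one cell out */

typedef struct {
    const char *name;   /* Forth-visible word name */
    word_fn     fn;     /* C function that implements it */
} WordEntry;

/* Always-present demo word. */
static cell_t demo_double(cell_t x) { return 2 * x; }

#ifdef WITH_MQTT        /* e.g. CFLAGS += -DWITH_MQTT in the makefile */
/* Placeholder only; a real module would call into an MQTT library here. */
static cell_t mqtt_publish_stub(cell_t x) { return x; }
#endif

/* The one place where words are listed. */
static const WordEntry g_words[] = {
    { "DEMO-DOUBLE", demo_double },
#ifdef WITH_MQTT
    { "MQTT-PUBLISH", mqtt_publish_stub },
#endif
};

static const size_t g_word_count = sizeof g_words / sizeof g_words[0];

/* Look a word up by name; returns its table index, or -1 if absent. */
static int find_word(const char *name) {
    assert(name != NULL);
    for (size_t i = 0; i < g_word_count; i++) {
        if (strcmp(g_words[i].name, name) == 0) return (int)i;
    }
    return -1;
}
```

At startup the interpreter would walk g_words once and create a dictionary entry for each row, so adding a module means adding rows to the table and a flag to the build.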

Another approach is to support dlopen() and .so/.dylib/.dll (maybe not .dll on Windows or embedded).

Maybe both.

The words can be discovered from the .so by loading it and calling a function within it.
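
A rough sketch of that discovery step, assuming a POSIX system with dlopen. The entry-point name pf_plugin_words and the PluginWord layout are invented for illustration; nothing like them exists in pForth today.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>
#include <stdio.h>

/* One word exported by a plugin (hypothetical layout). */
typedef struct {
    const char *name;        /* Forth-visible word name */
    void      (*fn)(void);   /* C function that implements it */
} PluginWord;

/* Hypothetical entry point that every plugin would export. */
typedef const PluginWord *(*plugin_words_fn)(size_t *count);

/* Load the shared library at `path` and ask it for its word list.
 * Returns the dlopen handle on success (caller must dlclose it later),
 * or NULL on failure, in which case *words and *count are left untouched. */
static void *load_plugin(const char *path,
                         const PluginWord **words, size_t *count) {
    assert(path != NULL && words != NULL && count != NULL);

    void *handle = dlopen(path, RTLD_NOW | RTLD_LOCAL);
    if (handle == NULL) {
        fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return NULL;
    }

    /* POSIX-documented idiom for storing dlsym's result in a function pointer. */
    plugin_words_fn get_words = NULL;
    *(void **)(&get_words) = dlsym(handle, "pf_plugin_words");
    if (get_words == NULL) {
        fprintf(stderr, "no pf_plugin_words in %s\n", path);
        dlclose(handle);
        return NULL;
    }

    *words = get_words(count);
    return handle;
}
```

The interpreter would then add each returned name to the dictionary. On older glibc versions this needs -ldl at link time, and Windows would need LoadLibrary and GetProcAddress instead.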

Thinking out loud here...