ttsiodras / renderer

My software-only 3D renderer and raytracer (aka "TTSIOD renderer")
https://www.thanassis.space/renderer.html
GNU General Public License v3.0

Long Time No See aka ARM64 build #3

Closed by MagaTailor 8 years ago

MagaTailor commented 8 years ago

I was going to try benchmarking an aarch64 Linux system but configure doesn't know about this architecture:

config.guess timestamp = 2006-07-02

uname -m = aarch64
uname -r = 3.14.65-61
uname -s = Linux
uname -v = #1 SMP PREEMPT Wed May 25 03:16:39 BRT 2016

UNAME_MACHINE = aarch64
UNAME_RELEASE = 3.14.65-61
UNAME_SYSTEM  = Linux
UNAME_VERSION = #1 SMP PREEMPT Wed May 25 03:16:39 BRT 2016
configure: error: cannot guess build type; you must specify one

Would it make sense to add that manually here?

ttsiodras commented 8 years ago

You'll need to pass a --host=... to your configure invocation. To find the value to pass as the argument, run gcc -v and check your Target:

$ gcc -v 2>&1 | grep ^Targ
Target: arm-linux-gnueabihf

Now pass that to your configure invocation - e.g. for the output above:

./configure --host=arm-linux-gnueabihf

...and continue with make as usual. Looking forward to hearing your benchmark results from yet another interesting platform! :-)
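For anyone scripting this step, here is a minimal sketch of the lookup described above. The helper name `target_triplet` is hypothetical (it is not part of the renderer); it only parses `gcc -v` output fed to it on stdin. `gcc -dumpmachine` prints the same triplet directly, if your compiler supports it.

```shell
# Hypothetical helper: read `gcc -v` output on stdin and print the
# triplet from the "Target:" line (first match only).
target_triplet() {
    awk '/^Target:/ { print $2; exit }'
}

# Intended use on the build machine (not executed here):
#   triplet=$(gcc -v 2>&1 | target_triplet)
#   ./configure --host="$triplet"
```

Whether configure accepts the resulting triplet still depends on how recent the bundled config.sub is.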

MagaTailor commented 8 years ago

Thanks for the suggestion, but it doesn't work either. Would upstream need to add the platform first?

$ ./configure --host=aarch64-linux-gnu

checking build system type... build-aux/config.guess: unable to guess system type

This script, last modified 2006-07-02, has failed to recognize
the operating system you are using. It is advised that you
download the most up to date version of the config scripts from

  http://savannah.gnu.org/cgi-bin/viewcvs/*checkout*/config/config/config.guess
and
  http://savannah.gnu.org/cgi-bin/viewcvs/*checkout*/config/config/config.sub

ttsiodras commented 8 years ago

Hmm...

Try re-creating the configure scripts from within your new ARM system - ie:

$ autoreconf
...
$ automake
...
$ ./configure --host=...
...
$ make

Needless to say, you'll be needing autoconf and automake installed first.

ttsiodras commented 8 years ago

...and if that still doesn't work, we're in uncharted waters; might as well try using the latest config.guess

MagaTailor commented 8 years ago

OK, so autoreconf fails with:

configure.ac:60: error: possibly undefined macro: AM_PATH_SDL
      If this token and others are legitimate, please use m4_pattern_allow.
      See the Autoconf documentation.

I'm running headless, is SDL necessary for building renderer from source?

ttsiodras commented 8 years ago

Headless is fine - "make bench" will use the dummy SDL video driver, which just benchmarks by rendering to a "memory-only" frame buffer. But yeah, you do need libsdl1.2-dev (or your distro's equivalent) to compile the renderer.

BTW, one more thing to try before replacing config.guess: Use ./configure --build=... instead of --host=... (with your TARGET value from gcc -v).

MagaTailor commented 8 years ago

I'll have to use the latest config.guess; --build didn't work either:

checking build system type... Invalid configuration `aarch64-linux-gnu': machine `aarch64' not recognized
configure: error: /bin/bash build-aux/config.sub aarch64-linux-gnu failed

ttsiodras commented 8 years ago

I see. Note that you can do the "autoreconf" and "automake" steps on any machine you want (including e.g. your main x86-based Linux distro), then package the output (i.e. the renderer's folder with everything inside, which will include your fresh new "./configure" and "config.guess") and copy it over to the ARM box.

MagaTailor commented 8 years ago

I've got configure running now... No wait:

configure: error: cannot guess build type; you must specify one
configure: error: ./configure failed for lib3ds-1.3.0

I'll have to replace config.guess over there too.

edit: Compiling now, stand by.

ttsiodras commented 8 years ago

Yep - lib3ds is a separate project that I just bundle in - and it comes with an ancient configure.ac.

You'll need to update that too.
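Since both the top-level tree and the bundled lib3ds-1.3.0 carry their own stale copies, replacing them one by one gets tedious. A minimal sketch, assuming you already have fresh config.guess and config.sub files in some directory (the helper name `refresh_config_scripts` and both arguments are hypothetical):

```shell
# Hypothetical helper: overwrite every config.guess / config.sub found
# under TREE with the copies stored in FRESH_DIR.
# FRESH_DIR must live outside TREE, otherwise cp would copy a file onto itself.
refresh_config_scripts() {
    fresh_dir=$1
    tree=$2
    for name in config.guess config.sub; do
        if [ ! -f "$fresh_dir/$name" ]; then
            echo "missing $fresh_dir/$name" >&2
            return 1
        fi
        find "$tree" -type f -name "$name" | while IFS= read -r stale; do
            cp "$fresh_dir/$name" "$stale"
            echo "updated $stale"
        done
    done
}

# Intended use (paths are placeholders):
#   refresh_config_scripts /path/to/fresh-scripts /path/to/renderer
```

Using find avoids guessing where each sub-project keeps its copies.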

Drumroll...

MagaTailor commented 8 years ago

OK, here's a result from a 1.5GHz Cortex-A53 based Android TV box (same SoC as in the Odroid C2), built with gcc 6.2:

$ make bench

35.6812
35.7092
35.7066
35.6888
35.7015
  Average value: 35.697460
  Std deviation: 0.012011
         Median: 35.701500
            Min: 35.681200
            Max: 35.709200

Thanks!

ttsiodras commented 8 years ago

Nice :-)

Though not much better than the result you reported for the C1 a year ago :-(

On a different topic - If you don't mind me asking, which Android TV box is it?

MagaTailor commented 8 years ago

It's a Beelink Mini MX III :)

Well, the S905 SoC is about low power/efficiency but a 1.3x gain is a little low indeed!

MagaTailor commented 8 years ago

If I wanted to copy the armv7 binary over and run it here - what's the direct command to run the benchmark?

ttsiodras commented 8 years ago

$ cd renderer/

$ make -n bench
make -C src bench

$ make -n -C src bench
make: Entering directory '/root/renderer/src'
for i in 1 2 3 4 5 ; do SDL_VIDEODRIVER=dummy ./renderer -b -n 500 ../3D-Objects/trainColor.tri | tail -1 | awk '{print substr($(NF-1),2);}' ; done | perl -e '$total=0; $totalSq=0; $n=0; my @allOfThem; while(<>) { print; chomp; $total += $_; $totalSq += $_*$_; $n++; push @allOfThem, $_; } my $variance = ($totalSq - $total*$total/$n)/($n-1); my @srted = sort {$a cmp $b} @allOfThem; my $len = scalar(@allOfThem); if ($len % 2) { $len++; } my @measurements = ( ["Average value",$total/$n], ["Std deviation",sqrt($variance)], ["Median",$srted[-1 + $len/2]], ["Min",$srted[0]], ["Max",$srted[-1]]); foreach (@measurements) { printf("%*s: %f\n", 15, $_->[0], $_->[1]);}'

So basically, if you don't care about gathering average and std deviation of execution speed through the perl code, you can just do a single run with:

$ cd src
$ SDL_VIDEODRIVER=dummy ./renderer -b -n 500 ../3D-Objects/trainColor.tri 
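If you run the binary by hand several times and still want the summary, the perl one-liner above can be approximated without perl. This is only a sketch: `bench_stats` is a hypothetical name, it expects one number per line on stdin, and unlike the perl code it sorts numerically and does not echo the individual values.

```shell
# Hypothetical helper: summarize one-number-per-line input in the same
# layout as the "make bench" output (sample std deviation, lower median).
bench_stats() {
    sort -n | awk '
        { v[NR] = $1; sum += $1; sumsq += $1 * $1 }
        END {
            if (NR < 2) exit 1            # sample variance needs two values
            var = (sumsq - sum * sum / NR) / (NR - 1)
            if (var < 0) var = 0          # guard against rounding noise
            mid = int((NR + 1) / 2)       # lower median, as in the perl code
            printf "%15s: %f\n", "Average value", sum / NR
            printf "%15s: %f\n", "Std deviation", sqrt(var)
            printf "%15s: %f\n", "Median", v[mid]
            printf "%15s: %f\n", "Min", v[1]
            printf "%15s: %f\n", "Max", v[NR]
        }'
}

# Intended use (file name is a placeholder, one measurement per line):
#   bench_stats < results.txt
```

The numeric sort only matters when measurements differ in digit count (e.g. 9.8 vs 10.4); the perl code compares values as strings, which gives the same order for the equal-width values shown in this thread.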

MagaTailor commented 8 years ago

Thanks, however to run the binary I'd need to install armv7 versions of every dynamically linked library. It would be more convenient if I could create a statically linked binary first :)

ttsiodras commented 8 years ago

Good luck :-) I don't believe I've ever linked with SDL statically - though I can see there's a libSDL.a there :-)

MagaTailor commented 8 years ago

No problem, I'll make it work, might come in handy later.

MagaTailor commented 8 years ago

My armv7 binary produced an average value of 36.182620 running in aarch32 mode, which is practically identical to the native aarch64 result.

Normally, when comparing the two Odroids (Cortex-A5 vs. Cortex-A53), a 1.5x speedup is expected even with no 64-bit benefit. Do you think we're looking at aarch64 GCC bug report material?

ttsiodras commented 8 years ago

Well, in all multithreaded implementations memory bus/speed may also be an issue.

Try running with OMP_NUM_THREADS set to 1 (so that only a single core will be used by the OpenMP runtime in both your ARM boxes). In theory, this single-threaded execution should allow you to see speed results that are the least influenced by memory bandwidth saturation.

(In case I wasn't clear - the resulting speed will be much slower than what you get now... but you will be able to compare your A5 to your A53 without memory saturation "clouding" the result).
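A minimal sketch of such a single-threaded run. The wrapper name `with_threads` is hypothetical; it just sets the standard OpenMP variable OMP_NUM_THREADS for one command and leaves your shell environment untouched.

```shell
# Hypothetical wrapper: run a command with a fixed OpenMP thread count.
with_threads() {
    threads=$1
    shift
    env OMP_NUM_THREADS="$threads" "$@"
}

# Intended use from the src directory, reusing the benchmark command
# quoted earlier in this thread (not executed here):
#   with_threads 1 env SDL_VIDEODRIVER=dummy ./renderer -b -n 500 ../3D-Objects/trainColor.tri
```

Running the same line on both boxes keeps everything except the thread count identical to the normal benchmark.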

MagaTailor commented 8 years ago

Thanks! I'll try that, as well as different GCC versions.

MagaTailor commented 8 years ago

Definitely a small aarch64 regression in gcc:

GCC5 Average value: 37.912720
GCC6 Average value: 35.691360
GCC7 Average value: 37.010760

Single-threaded performance looks like this: 8.5 vs 10.4 (A5 @ 1.73GHz and A53 @ 1.5GHz respectively), which looks a little low.

BTW, I've managed to squeeze a 29.06 average on my C1 using GCC6 and better flags. That's called progress!
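To put the single-threaded figures in perspective, here is a rough worked calculation. It assumes the benchmark numbers scale linearly with clock speed and that higher is better; the helper name `speedup` is hypothetical.

```shell
# Hypothetical helper: compare two benchmark scores, raw and per GHz.
# Arguments: old_score old_ghz new_score new_ghz
speedup() {
    awk -v a="$1" -v fa="$2" -v b="$3" -v fb="$4" 'BEGIN {
        printf "raw: %.2fx\n", b / a
        printf "per-GHz: %.2fx\n", (b / fb) / (a / fa)
    }'
}

# A5 @ 1.73GHz scored 8.5, A53 @ 1.5GHz scored 10.4 (figures from above).
speedup 8.5 1.73 10.4 1.5
# raw: 1.22x
# per-GHz: 1.41x
```

The clock-normalized figure is only indicative, since the two boxes also differ in memory and compiler setup.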

ttsiodras commented 8 years ago

Indeed - the difference between A5 and A53 is much smaller than I thought... And good to hear that results on the C1 are improved with GCC6.

Thanks for all the info, much appreciated!

MagaTailor commented 8 years ago

My pleasure ;) I'll probably do some profiling and give the GCC guys a quick heads-up.

Upstream issue: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77546

ttsiodras commented 6 years ago

If you can, please try this small update:

https://github.com/ttsiodras/renderer/archive/v2.3b.tar.gz

I just packaged the new config.guess, so (I hope) it now works as-is on AArch64 targets.