borismarin opened 9 years ago
Do you think we can get this fixed before SfN? Weren't you working on exactly this kind of thing with the OSM fork?
Well, the OSB model validator is mainly concerned with testing models, not the simulators themselves. There are some test-related rules in the Makefile; I wonder if Dave knows what those are supposed to do, and whether they can be reused somehow.
I was just looking through old notes from 2.3 beta testing, and trying to recover what we did. There was a genesis/tests/TestSuite directory that was not included in the final release. I'll come up with some tests to recommend later today.
I looked at the old TestSuite, and don't think it is worth reimplementing now. Here are some simple tests that should do for now:
Suggested tests for accuracy:
(1) Test the Scalable Portable Random Number Generator (SPRNG) with the following commands:
genesis #1 > setrand -sprng
genesis #2 > randseed 0
genesis #3 > echo {rand 0 1} {rand 0 1} {rand 0 1} {rand 0 1} {rand 0 1}
Regardless of platform, you should get the results:
0.01426654216 0.7493918538 0.007316101808 0.1527428776 0.1134621128
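A comparison against these reference values (like the Travis check discussed below) could be sketched as follows. This is only an illustration: the tolerance and the assumption that the GENESIS output line has been captured as a string are mine, not part of the original check.

```python
# Sketch of the SPRNG sanity check: compare the numbers GENESIS echoes
# against the platform-independent reference values, within a small
# absolute tolerance.
EXPECTED = [0.01426654216, 0.7493918538, 0.007316101808,
            0.1527428776, 0.1134621128]

def check_sprng(output_line, tol=1e-9):
    """output_line: the single line echoed by the GENESIS rand test."""
    values = [float(tok) for tok in output_line.split()]
    return len(values) == len(EXPECTED) and all(
        abs(a - b) <= tol for a, b in zip(values, EXPECTED))

print(check_sprng("0.01426654216 0.7493918538 0.007316101808 "
                  "0.1527428776 0.1134621128"))  # True
```

Since SPRNG is supposed to be deterministic across platforms, an exact string comparison would also work; the tolerance just guards against trailing-digit formatting differences.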
(2) Test the rallpacks 'axon.g' simulation:
cd rallpack/reports/genesis-2.0/rallpack3
run axon.g, which will produce the files axon.out0 and axon.outx
The last entries should be:

$ tail -n 5 axon.out0
0.24985 -0.0180403
0.2499 -0.0201349
0.24995 -0.0221736
0.25 -0.02416
0.25005 -0.0260978

$ tail -n 5 axon.outx
0.24985 -0.0749142
0.2499 -0.0748635
0.24995 -0.0748067
0.25 -0.0747422
0.25005 -0.0746679
I get the same results from both genesis executables, but the one produced by configure takes 0.351 CPU seconds, while the one built from the edited Makefile.dist takes 0.267 CPU seconds. (Note that the README says Upi set a record of 3.17 seconds back in 2006.)
We should look at the optimization flags.
I've adapted (f76929a) the Travis test to compare the output of sprng to the expected values mentioned above. The builds will fail if the comparison fails, so we now have a better indicator (well, at least better than just checking whether make returns 0) of the validity of the binaries. I'll eventually implement the rallpack check. BTW, @dbeeman, I'm getting different results for that on my machine (with both the autoconf and Makefile.dist methods):
$ tail -5 axon.out0
0.24985 -0.018699
0.2499 -0.020776
0.24995 -0.022798
0.25 -0.0247688
0.25005 -0.0266924
$ tail -5 axon.outx
0.24985 -0.0748959
0.2499 -0.0748435
0.24995 -0.0747845
0.25 -0.0747171
0.25005 -0.0746393
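Since the disagreement is only in the later decimal places, a tolerance-based comparison is probably more useful for the Travis check than a byte-for-byte diff. A minimal sketch, assuming the 'time value' two-column layout visible in the tails above (the tolerances are arbitrary choices, not part of any existing script):

```python
def compare_traces(lines_a, lines_b, t_tol=1e-9, v_tol=1e-3):
    """Compare two GENESIS ASCII traces, one 'time value' pair per line.
    Times must match almost exactly; values must agree within v_tol."""
    for la, lb in zip(lines_a, lines_b):
        ta, va = (float(tok) for tok in la.split())
        tb, vb = (float(tok) for tok in lb.split())
        if abs(ta - tb) > t_tol or abs(va - vb) > v_tol:
            return False
    return True

# The two machines' axon.out0 tails disagree in the 3rd decimal place:
ref  = ["0.25 -0.02416"]
mine = ["0.25 -0.0247688"]
print(compare_traces(ref, mine))              # True with the loose default
print(compare_traces(ref, mine, v_tol=1e-4))  # False with a tighter bound
```

How tight v_tol should be is exactly the open question in this thread: too loose and the check passes broken builds, too tight and it fails on legitimate cross-platform round-off.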
Do you get the same results with both methods? As you know, round-off errors are tricky when looking at the last APs in a run, so this may not be too surprising, but I'll have to look at runs on different machines. Also, the GENESIS SLI casts internal doubles to floats.
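That double-to-float cast is itself a plausible source of last-digit noise: IEEE-754 single precision keeps only about 7 significant decimal digits. A quick illustration in pure Python, using struct to emulate the cast (the specific value is just one taken from the tails above):

```python
import struct

def as_float32(x):
    """Round-trip a Python double through a 32-bit IEEE-754 float,
    emulating a double -> float cast in C."""
    return struct.unpack('f', struct.pack('f', x))[0]

v = -0.0221736                 # a membrane potential from axon.out0
err = abs(as_float32(v) - v)   # perturbation introduced by the cast
print(err < 1e-8)              # True: tiny, but nonzero in general
```

An absolute error of order 1e-9 on a ~0.02 V value is well below the differences reported above, so the cast alone does not explain them, but it does set a floor on how tight any output comparison can be.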
The configure script is not using the '-O2' option, and it links -lSM and -lICE, which are not needed on any modern Linux. These probably account for the differences in file size and speed. We should definitely use the optimization. The overhead of the two extra libs isn't so bad, but what happens if they are not installed?
The configure script now uses '-O2' as of af6b98801e1fdbedba000015e721ef34de661860; libSM and libICE were also removed in the same commit.
My output is shifted by 0.00005 (seconds?):
$ tail -n 5 axon.out0
0.24985 -0.0201349
0.2499 -0.0221736
0.24995 -0.02416
0.25 -0.0260978
0.25005 -0.0260978
and here it simply differs slightly:
$ tail -n 5 axon.outx
0.24985 -0.0748635
0.2499 -0.0748067
0.24995 -0.0747422
0.25 -0.0746679
0.25005 -0.0746679
What does that mean? (I did not try all different combinations of compilers & optimisations)
I tried this on a colleague's Linux machine (recent Ubuntu x86_64) and it produces the same numbers as in my case. Any ideas why?
We need to have a systematic way of testing the builds for correctness. Rallpack? The travis script (#21) needs to be adapted to run such tests.