Open GoogleCodeExporter opened 8 years ago
Original comment by yurkin
on 26 May 2009 at 7:29
Original comment by yurkin
on 12 Jul 2010 at 8:35
Another possibility is to compare two executables (e.g. the current release
versus the previous one). The specified files from the produced results (e.g.
stdout, log, CrossSec-Y) are checked for being identical. If a difference is
found, the files are sent to a graphical diff program (e.g. tortoisemerge), so
one can quickly evaluate whether the difference is significant. Examples of
insignificant differences are discrepancies in the last digits or in values
that are negligibly small (analytical zeros). I am currently using this
technique in preparing release 1.0.
The advantage is that there is no need either to parse parameters or to store a
database of benchmark results. Thus a vast range of command line parameters can
be easily covered just by storing all variants of the command line.
The disadvantages (which are the advantages of the original idea) are:
1) Only parsing of the results allows complete flexibility over e.g. which
values to compare and to what accuracy (i.e. which difference is considered
natural and which is suspicious). The newly proposed method allows choosing
only the files to compare. Thus it does not seem possible (with the new method)
to build a fully automatic test suite, i.e. one that requires no user
intervention when the program is working fine.
2) It does not allow testing the compilation result on a new platform, because
in this case a (verified) executable of the previous release is not readily
available.
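The file-level comparison described above can be sketched roughly as follows.
This is only an illustration, not the actual script: the function name and the
idea of keeping each executable's results in its own directory are assumptions
made for the example.

```python
import filecmp
from pathlib import Path


def differing_files(dir_a, dir_b, names):
    """Return the names whose copies in dir_a and dir_b are not identical.

    dir_a and dir_b are assumed (hypothetically) to hold the results produced
    by the two executables for the same command line.
    """
    different = []
    for name in names:
        a, b = Path(dir_a) / name, Path(dir_b) / name
        # A file missing from either run counts as a difference.
        if not (a.is_file() and b.is_file()):
            different.append(name)
        # shallow=False forces a byte-by-byte comparison of the contents.
        elif not filecmp.cmp(a, b, shallow=False):
            different.append(name)
    return different
```

Every name returned would then be opened in the graphical diff program for
manual inspection, which is exactly the step that prevents such a test from
running fully unattended.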
Original comment by yurkin
on 5 Sep 2010 at 2:14
The test suite described in the previous comment has been added to the
repository after some improvements. See r1021. The main script is at
tests/2exec/comp2exec. Comments inside it should be sufficient to understand
its usage.
Currently, it produces a lot of "false alarms" due to round-off errors, but it
can still be used to run an extensive test set in a short time (if a GUI diff
program is used). The next step should be to replace the literal comparison of
number-rich files (like mueller or CrossSec) by calculating the differences
between numbers and comparing them to some threshold. An example of such an
implementation is provided by the test scripts of the near_field package
(misc/near_field/RUNTESTS/).
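A minimal sketch of such a threshold-based comparison is given below. It is not
taken from comp2exec or from the near_field scripts; the function names, the
regular expression and the default tolerances are all assumptions chosen for
illustration. The text around the numbers has to match exactly (up to
whitespace), while the numbers themselves only have to agree within a relative
or absolute tolerance.

```python
import math
import re

# Matches integers, decimals and exponent notation, e.g. 12, -0.5, 1.3e-17.
NUMBER = re.compile(r"[-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?")


def lines_match(line_a, line_b, rel_tol=1e-8, abs_tol=1e-12):
    """Compare two lines, treating numbers as equal within the tolerances.

    rel_tol absorbs differences in the last digits; abs_tol absorbs
    "analytical zeros". Both default values are illustrative assumptions.
    """
    nums_a = NUMBER.findall(line_a)
    nums_b = NUMBER.findall(line_b)
    if len(nums_a) != len(nums_b):
        return False
    # Replace numbers by a placeholder and normalize whitespace, so that only
    # the surrounding text is compared literally.
    text_a = " ".join(NUMBER.sub("\0", line_a).split())
    text_b = " ".join(NUMBER.sub("\0", line_b).split())
    if text_a != text_b:
        return False
    return all(
        math.isclose(float(x), float(y), rel_tol=rel_tol, abs_tol=abs_tol)
        for x, y in zip(nums_a, nums_b)
    )


def files_match(path_a, path_b, **tolerances):
    """Return True if both files agree line by line within the tolerances."""
    with open(path_a) as fa, open(path_b) as fb:
        lines_a, lines_b = fa.readlines(), fb.readlines()
    return len(lines_a) == len(lines_b) and all(
        lines_match(a, b, **tolerances) for a, b in zip(lines_a, lines_b)
    )
```

With this kind of check, only files for which files_match returns False would
need to be sent to the diff program, which should remove most of the round-off
"false alarms". Suitable tolerances probably differ between file types and
would have to be tuned.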
Original comment by yurkin
on 16 Feb 2011 at 8:16
Original comment by yurkin
on 22 Apr 2011 at 2:40
r1107 greatly improves the performance of the 2exec tests. It is now possible
to run them almost unattended to perform a thorough list of tests. So this
solves the problem of testing new releases against the previous ones.
Creating a test that does not use a reference executable is still desirable.
However, its priority is not that high.
Original comment by yurkin
on 9 Feb 2012 at 3:45
Original issue reported on code.google.com by
yurkin
on 26 Nov 2008 at 6:54