hercules-390 / hyperion

Hercules 390
Other
248 stars 67 forks source link

Runtest.cmd (Windows) includes ostailor null command and runtest (Linux) does not. #165

Closed srorso closed 7 years ago

srorso commented 7 years ago

The Windows version of runtest includes an ostailor null command between each test case, and the linux shell script does not. When a test case uses ostailor quiet, subsequent test cases that depend on program check detection via *Program report failed compares under *nix. The same test cases pass on Windows.

In the current test case stream, six cases fail for this reason when the BFP test cases are included. The test cases are:

pfpo-esa     
ssk370         (2 of 2 cases)
sske           (1 of 2 cases in the file)
sske370        (1 of 12 cases in the  file)
sske390        (1 of 12 cases in the file)

BFP test cases include ostailor quiet because they intentionally create program checks to verify floating point results when traps are enabled. There is no value to including the program check event in the alltests.out log.

While this issue was discovered on the Hyperion that included Softfloat mods, it was reproduced on an unmodified Hyperion by excluding ostailor null from runtest.cmd and including ostailor quiet in a test case near the beginning.

I propose that the runtest shell script be modified to include ostailor null between each test case. Likewise for numcpu 1 and mainsize 2m.

Alternatively, we could remove the ostailor null etc. between test cases from runtest.cmd and create the expectation that test case developers include ostailor null at the end of the test case should they alter the ostailor setting in the case. But my preference is the addition of ostailor null to the shell script.

And that's the core of the issue. How should these two scripts be made consistent with respect to their processing of test cases?

jphartmann commented 7 years ago

I shall leave that for Fish to sort out.

srorso commented 7 years ago

In the meantime, I will add ostailor null commands at the end of my test cases so that there is no impact on test cases that follow the BFP cases. And I will add numcpu 1 to the bfp-023-threads case.

Fish-Git commented 7 years ago

I propose that the runtest shell script be modified to include ostailor null between each test case. Likewise for numcpu 1 and mainsize 2m.

I agree 100%. Each test should be guaranteed to start from a known well defined state. If they wish or need to change that state, fine, they are free to do so. But regardless of whether they do or don't, the next test should start from the same know default initial state.

Automatically inserting ostailor null and numcpu 1 and mainsize 2m (and possiblysysclearorsysreset`), etc, statements before each test should go a long way towards honoring such a guarantee, but it still won't guarantee a complete 100% known state. A given test can define/attach a device and forget to detach it for example, or load an alternate dynamically loadable module for example (such as altcmpsc.dll or even dyninst.dll) and forget to unload them. Etc.

So guaranteeing a 100% know starting state would be very difficult to do IMHO.

But certainly adding a few ostailor, numcpu and mainsize, etc, statements should serve to go a long way towards reaching that goal and would be much better than what we have today.

I'll see if I can tweak the Linux runtest script to cat each .tst test file individually so we can insert the needed statements just like the Windows version is already doing.

And that's the core of the issue. How should these two scripts be made consistent with respect to their processing of test cases?

See above. IMO certain statements should be automatically inserted at the beginning of each test. The individual tests themselves shouldn't, for the most part, have to worry about resetting things. Only those state changes that are known to not get reset between each test should each test be responsible for resetting once they're complete. Other simple state changes they shouldn't have to worry about.

Once we decide what we want to do we should probably document in runtest's README what the starting state for each test is guaranteed to be.

Fish-Git commented 7 years ago

I'll see if I can tweak the Linux runtest script to cat each .tst test file individually so we can insert the needed statements just like the Windows version is already doing.

Done and tested. Works fine. Will commit and close this issue.

Fish-Git commented 7 years ago

More work remains:   (low priority)

We still need to decide/document what our starting machine state should be for each test and how to ensure such a state. The implemented fix is "good enough" for now, but we need a more comprehensive solution.

jphartmann commented 7 years ago

I shall revert the UNIX runtest to its previous state since the premise that all tests must start in the same state does not hold, nor does the premise that all tests are in separate files. Thus, the change does not address the perceived problem.

If you need to bring Hercules back to a particular state in several test cases, use script <yourprofile> to issue whatever series of commands you require when you require them. You might consider Fish's trick of setting variables to parameterise the embedded commands; see, e.g., awsbsf.tst.

You can even source the script file multiple times within a test file, which then happens automatically whichever way you might aggregate test files.

Make the file name unique (steves.own.profile will no doubt work) and commit it to the tests directory.

The root problem is that Hercules is unable to detect the beginning of a test case since *Testcase is a loud comment as far as Hercules is concerned and the runtest console command, which is the first Hercules will know about a test case, is far too late; doing anything to reset at that time is bound to be counterproductive.

It is regrettable that Fish changed the Windows behaviour against my advice some months ago. Apparently he forgot that he had agreed not to.

It might be a good idea if someone who is windows savvy could remove the the corresponding windows feechur so that runtest works the same on UNIX and Windows, lest test cases written on Windows fail in obscure ways on UNIX.

srorso commented 7 years ago

I will address this for Windows environments.

ghost commented 7 years ago

On 23 Nov 2016, at 13:46, Stephen Orso notifications@github.com wrote:

I will address this for Windows environments.

Thank You Steve e

srorso commented 7 years ago

My proposed correction for this issue includes an enhancement, and I seek comment on same. All elements of this proposal have been tested in my fork of Hyperion under Windows. Specifically:

1) Correction: Remove the generation of mainsize 2, numcpu 1, and ostailor null commands between test case files. This element conforms behavior of the Windows runtest.cmd command to that of the non-Windows runtest shell script. We shall leave the discussion of what should be done automagically between test case files (or *Testcase instances, for that matter) to the developers' forum.

2) Correction: Change the generation of the alltests.tst consolidated test case file so that the initial test case is not repeated ~2.5 times.

3) Enhancement: Add the following loud comments ( not #) to the consolidated test script file so that when reading the file or the test results file alltests.out, one can easily identify which test case file a particular `Testcasecame from. The resulting comment looks like this inalltest.out`:

    HHC01603I * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
    HHC01603I *
    HHC01603I * Start of test file .\sske.tst
    HHC01603I *
    HHC01603I *Testcase sske#1 processed 16 Jan 2016 12:11:11 by bldhtc.rexx
    HHC01603I * sske#1:  Test for sske being privileged.

I also propose that I change the runtest shell script to include element 3 functionality.

Because a test case file can include multiple Testcases, there can be no assumption that the Testcase name matches the file name. This really complicated the original diagnosis of this issue.

If there is no consensus on the enhancement, I will be happy to commit the two corrections without the enhancement and close this issue.

Comments welcomed and appreciated.

Best Regards, Steve Orso

jphartmann commented 7 years ago

1: Thanks.

3. Sounds good to me, but please leave runtest alone for the moment since I'm currently hacking a bit on it.

There is a problem though; right now the actual list of files is constructed by the shell's file name globbing of the arguments to the cat command. #2 seems to indicate that this was not done right on Windows?

I suggest you also add information that can, on UNIX, be obtained by the stat command, such as file date (modification) and file size. If you can somehow find the full path to the file, that may be an advantage too. I forget the commands, but I can supply them later.

srorso commented 7 years ago

Re runtest shell script: not a problem; I shall hold off for now.

Re Your suggestions for path and date/time; Love it...I'll figure it out and have fun besides.

Fish-Git commented 7 years ago

@srorso : Regarding item 2:

  1. Correction: Change the generation of the alltests.tst consolidated test case file so that the initial test case is not repeated ~2.5 times.

Did this problem happen on Windows or only on Linux?

Also, do you run your tests (runtest command) from within the tests subdirectory itself or completely outside of Hercules source code directory?

I ask because I'm trying to understand under what conditions the problem occurs. In all my testing I have personally never seen this phenomenon occur.

But then I always run my runtest command completely outside of the source code directory too, similar to the way you do an out-of-source build, so maybe that has something to do with it?

Thanks.

srorso commented 7 years ago

Hi Fish:

Windows only, and it occurs because I am running from within the tests directory.

The glob for test cases, when done inside the tests directory, picks up the under-construction alltests.tst file as an input test case and appends itself to itself. If the output file had been named zalltests.tst, every test case would have been duplicated, perhaps some of the early ones more than once.

Changing the test case accumulation process to output to alltests.temp and then renaming to alltests.tst addresses the issue.

Best Regards, Steve Orso

srorso commented 7 years ago

3) Enhancement results incorporating John's excellent suggestion:

[...]
HHC01603I * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
HHC01603I *
HHC01603I * Start of test file   bfp-002-loadr.tst   11/06/2016 02:00 PM   C:\Users\srorso\Documents\GitHub\hyperion\tests\
HHC01603I *
HHC01603I *Testcase bfp-002-loadr.tst:      LEDBR, LEDB, LEXBR, LEXB, LDXBR, LDXB
HHC01603I sysclear
[...]

It wasn't difficult to find the needed codes.

Fish-Git commented 7 years ago

Windows only, and it occurs because I am running from within the tests directory.

Ah. That makes prefect sense then.

And yes, your proposed fix is exactly the way I would have fixed it.

Thanks.

jphartmann commented 7 years ago

Steve, updated runtest. See if you like it.

srorso commented 7 years ago

I am flattered. And I will shamelessly steal your enhancement, the addition of "date" and "in" to make the " Start of test file" line, an enhancement that makes the line parse-able. My theft also makes output from UNIX and OS/X runtest match that created by Windows runtest.cmd.

I will commit mine tomorrow.

srorso commented 7 years ago

Closed by commit 40f647e9.