PROBLEM: z/Arch reuse of S/370 opcodes for entirely new purpose

Fish-Git commented 2 years ago

Hi guys! I'm purposely posting this as a publicly viewable GitHub Issue so our user community can chime in with their own thoughts and opinions on the subject too if desired.

I was working on providing support for a new z/Arch facility described in the latest Principles of Operation: facility 193: the BEAR-Enhancement Facility.

It's pretty simple and straightforward and involves only 3 new instructions: LOAD BEAR (LBEAR = B200), STORE BEAR (STBEAR = B201) and a new SIY format LOAD PSW EXTENDED (LPSWEY = EB71) instruction that doesn't cause a breaking event.

But almost as soon as I started, I ran into a snag:

static INSTR_FUNC gen_opcode_b2xx[256][NUM_INSTR_TAB_PTRS] =
{
 /*B200*/ GENx370x390x900 ( "CONCS"     , S    , ASMFMT_S      , connect_channel_set                    ),
 /*B201*/ GENx370x390x900 ( "DISCS"     , S    , ASMFMT_S      , disconnect_channel_set                 ),
 /*B202*/ GENx370x390x900 ( "STIDP"     , S    , ASMFMT_S      , store_cpu_id                           ),
 /*B203*/ GENx370x390x900 ( "STIDC"     , S    , ASMFMT_S      , store_channel_id                       ),

The change appears to have been made by Harold back in 2010.

I don't fully understand why we need to be able to (or would want to be able to) support S/370 channels in ESA/390 and/or z/Arch mode. Harold explains his motivation and justification in his README.IOARCH internal readme document, but call me stupid: I'm just not getting it. I'm not seeing the need/justification. Especially given that 37X ("370 backport") support in Hercules already existed in Hercules for years prior to Harold's change. So why would we need to have 390 and z/Arch support the CONCS and DISCS, etc. instructions?

I'd like to completely rip out support for this very unusual type of build of Hercules, but Harold's document would seem to imply it was done because our community wanted/needed it. But is that still/actually true? Are there people out there actually building such unusual builds of Hercules? If such support were ripped out, would I be pissing off an unknown segment of our user community?

SECOND (and much more important IMO) QUESTION: How do you feel would be the best way to deal with this opcode conflict? We obviously need to still continue supporting both the CONCS and DISCS instructions for S/370, but the new z/Arch LBEAR and STBEAR instructions are using the same opcode: B200 and B201 respectively. How should I deal with this situation? Has this situation occurred before for anyone? How did you deal with it?

I'm thinking of changing both opcodes to instead call a new "b200_instruction" and "b201_instruction" DEF_INST instruction function, and then within that new instruction, call the existing "connect_channel_set" and "disconnect_channel_set" instruction functions if the archmode is 370 or 390, or the new "load_bear" and "store_bear" instructions if the archmode is z/Arch.

Of course doing that would obviously break Harold's existing "S/370 channels in z/Arch" design too (hence the reason for this GitHub Issue!), but I personally see no other choice. <shrug>

Help?

(Harold? @s390guy)

s390guy commented 2 years ago

This implemented a strategy towards moving the community forward while keeping one foot in the future and one in the past. There were a lot of discussions at the time. Two fundamental advances were sought by different people. Some wanted more memory than 24 bits allows (the 380 group). Others wanted to use the new instructions. The instruction need was satisfied by the S/370 instruction extending module. The more memory crowd, ended up going the 380 path. I believe the real difference in instruction enhancement for S/370 was that a module was developed rather than the need to compile the extended instructions. My memory could be faulty on that. But those were the two desired capabilities and two solutions for providing them.

The forward-porting of channel-based I/O was an alternative to both solutions. By moving to 31- or 64-bit addressing the need for more memory was satisfied. And the new instructions came along for the ride. It was largely an alternative to the 380 path which had large opposition by the then Hercules developers. What you are seeing is the remnants of this alternative path.

The 380 path seems to have now completely separated itself from legacy Hercules. The 380 path was never embraced by legacy Hercules so is not really part of this discussion.

Today, the "strange" I/O design you see here has been demonstrated as one that is not feasible for the community. Laddi Hanus did that work. My personal view is, change it at will. I believe removing the x390 and x900 designations should free up the opcodes for use in the x900 architecture. As I recall these designations are used to determine into which of the three opcode tables the instruction is to be placed. So, just add new entries enabled for x900 for the new instructions should do the trick. This results in two entries for the opcodes but they go in different tables. The S/370 instructions (the ones you copied in the posting) become x370____. The new instructions would be ____x900. That is how I would initially change the entries and see if that works.

If this design is no longer the case (or it does not work as I remember), then it will be more work.

Regardless, IMO, feel free to change those settings at will, or anything else left over from the related compilation changes for the "forward porting" of channel-based I/O. I believe there was a feature related to that as well.

I agree that we should NOT kill channel-based I/O in the S/370 architecture. Quite a few people would be upset to say the least.

Harold Grovesteen

Fish-Git commented 2 years ago

The forward-porting of channel-based I/O was an alternative to both solutions.

Today, the "strange" I/O design you see here has been demonstrated as one that is not feasible for the community.

My personal view is, change it at will.

Thanks, Harold. I was hoping you would say that. I'll wait a few days first to give our community a chance to object should they want to do so. But if no one officially objects within a few days, I will go ahead and make the changes.

S/370 instructions (the ones you copied in the posting) become x370____. The new instructions would be ____x900. That is how I would initially change the entries and see if that works.

I don't think that's going to work. I don't believe we can have two different opcode table entries for the same opcode. If I am understanding you correctly, what you are proposing is doing:

static INSTR_FUNC gen_opcode_b2xx[256][NUM_INSTR_TAB_PTRS] =
{
 /*B200*/ GENx370x___x___ ( "CONCS"     , S    , ASMFMT_S      , connect_channel_set                    ),
 /*B200*/ GENx___x___x900 ( "LBEAR"     , S    , ASMFMT_S      , load_bear                              ),
 /*B201*/ GENx370x___x___ ( "DISCS"     , S    , ASMFMT_S      , disconnect_channel_set                 ),
 /*B201*/ GENx___x___x900 ( "STBEAR"    , S    , ASMFMT_S      , store_bear                             ),
 /*B202*/ GENx370x390x900 ( "STIDP"     , S    , ASMFMT_S      , store_cpu_id                           ),
 /*B203*/ GENx370x___x___ ( "STIDC"     , S    , ASMFMT_S      , store_channel_id                       ),

Yes?

But as you can see, that's not going to work! This is a fixed length 256 entry table, and by duplicating an entry, all of the remaining entries are going to be wrong. That's why I made my earlier suggestion of changing it to call a brand new function instead, and then let that function decide which instruction function should actually be called.

I'm not sure how much effort my design is going to take to fix the instruction tracing side of things either, but I'll cross that bridge when I come to it.*`()`**

Thanks for your feedback, Harold. Much obliged!

Anyone else care to chime in?

*`()`** _Right now, with my proposed design, an instruction trace is always going to display e.g. "B200" for the instruction's mnemonic and "b200instruction" as the name of the instruction. I'm perfectly willing to live with that if fixing it to display properly for each architecture turns out to be too kludgy/complicated.

s390guy commented 2 years ago

static INSTR_FUNC gen_opcode_b2xx[256][NUM_INSTR_TAB_PTRS] =
{
 ...

It is very obtuse coding, but essentially a lot of Hercules is compiled three times. I thought that was true for all instruction tables. The primary difference is that different options (and tables) exist for each architecture. Now, maybe it is different. But while putting multiple entries together looks wrong, they would actually compile correctly for the separate architectures assuming my remembrances are close to correct. Also where does NUM_INSTR_TAB_PTRS come into play? While each opcode in the b2xx group has a single entry, the table looks like a two dimensional table, so each architecture could have its own entry.

At one time there seemed to be a separate ESA/390 architecture. It no longer seems to exist. If you ask for ESA/390, you get actually z in ESA/390 mode. In the past that was not true. One would need to request z to get ESA/390 on z. I have no desire to debate whether that is right or wrong. But something that caused that change might be contributing to my confusion. Ignore me if I am wrong in this paragraph.

I fear over time we have lost some generality and flexibility in the code.

While your approach to the instructions, namely combine them into a single function, I understand, that approach is quite a divergence from what has been done in the past. I would hope there is some place that you can cause correct tracing.

This is a somewhat limited problem, as far as the B2xx opcodes go. Only three opcodes were introduced by S/370 that have not been carried forward, namely these three: B200, B201, and B203.

In the future, similar issues could arise with 9Cxx, 9Dxx, 9Exx and 9Fxx. S/370 uses 00 and 01 for each instruction group.

PoO.txt

The link above is to a report that I create for all of the MSL files and instruction definitions whenever I change them. Admittedly I am a bit behind on ASMA instructions. They are like the instruction lists in the PoO, hence the file name. Anyone curious or interested should keep this file handy. It proves to be very useful. At least I do.

Harold Grovesteen

Fish-Git commented 2 years ago

It is very obtuse coding, but essentially a lot of Hercules is compiled three times.

Correct.

I thought that was true for all instruction tables.

That has never been true. Not since day one.

Yes, the opcode.c source file is indeed compiled three times, but the opcode tables portion of the file is not. The opcode tables in opcode.c are compiled only once, during compilation of the very last defined build architecture (which, even though today it always is, does not necessarily have to be z/Arch), and thus cannot contain any architecturally-dependent code or constructs.*`()`**

The primary difference is that different options (and tables) exist for each architecture.

Not true.

Now, maybe it is different. But while putting multiple entries together looks wrong, they would actually compile correctly for the separate architectures assuming my remembrances are close to correct.

Wrong. Perhaps what you are remembering is the Hercules start-up/initialization code that dynamically constructs each architecture's separate opcode jump table? (i.e. the actual table that our runtime instruction dispatch logic uses). Those "opcode" tables are indeed different for each architecture (they have to be!), but each are dynamically constructed at runtime based on the "master" opcode tables you see in opcode.c.

Also where does NUM_INSTR_TAB_PTRS come into play?

The master opcode table in opcode.c is two dimensional. The first dimension is the opcode itself. The second dimension contains the actual three separate pointers to the individual instruction functions (one for each architecture) as well as some additional helpful pointers used mostly for instruction tracing:

https://github.com/SDL-Hercules-390/hyperion/blob/d0ccfbc9a74dd0a77a75ac224025fc68f6a6617f/opcode.h#L133-L140

Again, this has been the master opcode table's design almost since day one. `()`**

While each opcode in the b2xx group has a single entry, the table looks like a two dimensional table, so each architecture could have its own entry.

Correct! As just explained.

At one time there seemed to be a separate ESA/390 architecture. It no longer seems to exist. If you ask for ESA/390, you get actually z in ESA/390 mode. In the past that was not true. One would need to request z to get ESA/390 on z. I have no desire to debate whether that is right or wrong. But something that caused that change might be contributing to my confusion. Ignore me if I am wrong in this paragraph.

I suspect you are wrong about that.

Or at least, I hope you are wrong, because I certainly wasn't aware of any such thing!! 8-O

I fear over time we have lost some generality and flexibility in the code.

I certainly hope not. Because that would mean we're going backwards. That would mean Hercules is in worse shape than it was originally, which would be a Very Bad Thing. Rather, I would like to think Hercules is actually in much better shape than it was originally. I would like to think we have, over time, improved Hercules. If the changes that I and/or others having been making are actually hurting Hercules (making the code worse off), then I fear it's time for me to retire as a Hercules developer! I would never want to hurt Hercules! :-(

While your approach to the instructions, namely combine them into a single function, I understand, that approach is quite a divergence from what has been done in the past. I would hope there is some place that you can cause correct tracing.

Aye. Me too.

This is a somewhat limited problem, as far as the B2xx opcodes go. Only three opcodes were introduced by S/370 that have not been carried forward, namely these three: B200, B201, and B203.

In the future, similar issues could arise with 9Cxx, 9Dxx, 9Exx and 9Fxx. S/370 uses 00 and 01 for each instruction group.

Yes, I noticed that, and that is my concern. It is entirely feasible that over time, IBM will begin reusing more and more 370-only opcodes for z/Architecture, so we should probably start planning/discussing how best to deal with it now, rather than wait until it becomes a real problem later. That's one of the reasons I brought it up.

Thanks again for your feedback, Harold. Much appreciated.

Anyone else?

*`()** _Each multiply-compiled source file has an architecture-dependent part, and a **non-** architecturally-dependent part. The architecturally dependent part comes first, before the point where_GENARCHis changed to the next build architecture and the source file is then re-#includedagain for the newly#defined` build architecture. Once this beginning part of the file (the first part) has thus been compiled three times (once for each build architecture), then and only then is the remaining part of the file then compiled. This remaining part of the file (the second part) is the non- architecturally-dependent part, and cannot contain any architecturally-dependent code. To do so would be a gross violation of Hercules's overall design established on day one.

_Some time ago, I created a small, helpful(?) "dummy" source file that is never compiled, called _archdep_templ.c in an effort to help illustrate (to newly minted Hercules developers who might not be familiar with it) how this all works (i.e. "how it all hangs together"), because I agree it is somewhat "obtuse" as you say (unusual and difficult to understand)._

`()`** On day one, each entry only contained the three pointers to each architecture's instruction function, but shortly after Jan added the other pointers too.

I myself would like to add yet another pointer, pointing to the function to decode the instruction based on its format, before jumping to the actual instruction function itself, thereby relieving the need for each individual instruction function from having to decode the instruction itself. As soon as it got called, the instruction would already be decoded.

After all, there are, for example, over 200 instructions in the architecture using the RRE format, so why should each of them be forced to decode ("crack") the instruction itself? That's a lot of redundant code and a HUGE waste of host processor instruction cache! That needs to change IMHO!

s390guy commented 2 years ago

As you have outlined things today (forgive my faulty memory of V3 days when this change was introduced), the current design really only gives you one option, namely conditionally compile the instruction for both S/370 and z with x370____x900. The conditional compilation of the function gives you two separate sets of functionality, no? And if not, then you need to do it during execution of the instruction.

Somewhere the instruction name for tracing is a string. Replacing the string for the architecture's name should be possible. This is all a bit kludgey but the design leaves few options.

With the random changes, if you will, in instruction definition over the decades, it is very difficult to avoid some "fix-up" code for the introduced anomalies.

I very much like your idea around instruction format decode. This has been an area I have wished could be changed to how you suggest it should be for a long time. As you say, this is an area of huge waste of host processor cache. Processor cache failures is a major contributor to Hercules performance. A way to migrate such a change would be of huge value. Reserving an area in each CPU for instruction operands would further reduce cache misses. All instructions use the same area for operands. Just my thoughts on that part. It might even make sense to fetch the instruction into an area also in the CPU from which it is decoded.

s390guy commented 2 years ago

We did not really talk about it explicitly. But each of the three functions generated for the table are unique. The _GEN370, _GEN390, and _GEN900 add a unique prefix to _ifunc_name. Normally, the function generates variations on the same instruction for the different architectures. However, there is nothing stopping the function for s/370 and z from having entirely different functionality. I suspect that is what you had in mind all along, but it was never explicitly stated. Suspect the S/370 and z instructions have different instruction formats. That will need accommodating. Usually in cases where the instruction is simply brought forward for backward compatibility, the instruction format is the same. In this case, and the other potential ones, the z instruction format may be different than the s/370 instruction.

As you said, that leaves instruction tracing. This might be a case where the string for the instruction mnemonic gets set for the entry depending upon which architecture has been selected by the configuration file. A bit ugly, but necessary. There may be some more ugly bits for tracing. Have not really dug into the details.

Further, a wholesale redesign to accommodate a situation involving at its max less than 1% of the instructions does not seem justified. These few instructions are exception cases that should be dealt with as such. Now if you can provide a consistent place for these exceptions to be handled that would be ideal. The next time, just a few places need tweaking if the situation arises again. And of course, documented so you (if you forget, I know I do) or someone else can easily deal with this special case.

Just a few more thoughts.

Harold

Fish-Git commented 2 years ago

Normally, the function generates variations on the same instruction for the different architectures. However, there is nothing stopping the function for s/370 and z from having entirely different functionality.

Which is exactly the situation we're currently having to deal with.

Suspect the S/370 and z instructions have different instruction formats.

Nope. Both "S" format.

Usually in cases where the instruction is simply brought forward for backward compatibility, the instruction format is the same. In this case, and the other potential ones, the z instruction format may be different than the s/370 instruction.

Nope. Both "S" format. But yes, you do bring up a good point: there's nothing preventing a reused (repurposed?) opcode from using a different instruction format from what it was previously using. I rather suspect however, that such differences would likely be only very minor, such as the difference between, say, the various "SS" formats (SS-a, SS-b, ... SS-f). I suspect most instruction formats are determined by the actual opcode after all, so if the opcode is the same, the instruction format likely is too. (But I'll admit that doesn't necessarily have to always be true.)

Further, a wholesale redesign to accommodate a situation involving at its max less than 1% of the instructions does not seem justified.

At this point I would tend to agree. But my fear is that percentage might increase significantly over time, thereby forcing us to. And I'd hate another "Y2K"-type scramble from occurring in the future as a result of our "burying our head in the sand" now. I'd rather us lay the groundwork for the future today, rather than keep putting it off until tomorrow.

Now if you can provide a consistent place for these exceptions to be handled that would be ideal.

That's what I'm currently trying to figure out!

Unfortunately however, it's not going that well. :(

Fish-Git commented 2 years ago

FYI:

I've changed the title of this GitHub Issue to better reflect the actual core underlying problem. The original "forward-porting of channel-based I/O" issue as described in Harold's README.IOARCH document was simply a minor distraction causing a bit of confusion. The CORE of the problem that this GitHub Issue describes, and what we're currently discussing how best to deal with, is the z/Architecture reuse of older previously defined instruction opcodes for completely different purposes.

s390guy commented 2 years ago

On that point, I believe that this is probably (one NEVER knows the future for sure) an issue for privileged instructions of which these two new instructions are a part. The backward compatibility support will preclude reuse for general instructions.

Part of the problem here is the use of a single table for all three architectures. Is it worth creating a new table for each architecture? That is the general solution to the problem. But, it is really only z that would benefit from/needs a separate table.

Harold

Fish-Git commented 2 years ago

Is it worth creating a new table for each architecture?

That's the direction I'm leaning in right now, yes. In fact, I'm thinking of maybe making all of the master opcode tables wholly ARCH_DEP, which would of course mean abandoning our current GENx370x390x900 macro in favor of a new e.g. GENOP macro that only defines the one instruction function pointer for that build architecture. That way we could maybe do something like:

static INSTR_FUNC ARCH_DEP( gen_opcode_b2xx )[256][NUM_INSTR_TAB_PTRS] =
{
#if __GEN_ARCH == 900 // z/Arch

 /*B200*/ GENOP ( "LBEAR"    , S   , ASMFMT_S    , load_bear                  ),
 /*B201*/ GENOP ( "STBEAR"   , S   , ASMFMT_S    , store_bear                 ),

#else // 370 or 390

 /*B200*/ GENOP ( "CONCS"    , S   , ASMFMT_S    , connect_channel_set        ),
 /*B201*/ GENOP ( "DISCS"    , S   , ASMFMT_S    , disconnect_channel_set     ),

#endif

 /*B202*/ GENOP ( "STIDP"    , S   , ASMFMT_S    , store_cpu_id               ),
 /*B203*/ GENOP ( "STIDC"    , S   , ASMFMT_S    , store_channel_id           ),
 /*B204*/ GENOP ( "SCK"      , S   , ASMFMT_S    , set_clock                  ),
   ...

How does that look?

In any case, I'll continue playing around with various ideas until I eventually find what I hope is a somewhat elegant (easy to implement) solution.

Fish-Git commented 2 years ago

Nope. Both "S" format. But yes, you do bring up a good point: there's nothing preventing a reused (repurposed?) opcode from using a different instruction format from what it was previously using. I rather suspect however, that such differences would likely be only very minor, such as the difference between, say, the various "SS" formats (SS-a, SS-b, ... SS-f). I suspect most instruction formats are determined by the actual opcode after all, so if the opcode is the same, the instruction format likely is too. (But I'll admit that doesn't necessarily have to always be true.)

On that point, I wrote a program quite a while back that generates various PDF reports based on the processing of a simple copy & paste from one of the Principle of Operation's appendix B instructions tables that some might find handy/helpful:

wrljet commented 2 years ago

On that point, I wrote a program quite a while back that generates various PDF reports based on the processing of a simple copy & paste from one of the Principle of Operation's appendix B instructions tables that some might find handy/helpful:

Instruction Opcodes.pdf

Facility Instructions.pdf

Instruction Formats.pdf

Cool!

s390guy commented 2 years ago

scarfed those up right away! This will be helpful in getting ASMA current. Thank you! Thank you!

s390guy commented 2 years ago

Just a few comments: No need to include CONCS and DISCS in 390. Is there still a need for three entries per extended operation code (what I refer to as the addition 4 or 8 bits)? Would this change flow to the other extended operation code tables in the other groups (for example, B3xx) etc?

Obviously, changes to these tables changes how Hercules identifies which function to call to execute the instruction.

This would solve the issue of tracing.

Fish-Git commented 2 years ago

On that point, I wrote a program quite a while back that generates various PDF reports based on the processing of a simple copy & paste from one of the Principle of Operation's appendix B instructions tables that some might find handy/helpful:

Instruction Opcodes.pdf

Facility Instructions.pdf

Instruction Formats.pdf

And just to complete the series:

Fish-Git commented 2 years ago

No need to include CONCS and DISCS in 390.

Agreed. We already settled that issue earlier. But I figured I could deal with that issue later. Right now I'm just trying to design some type (any type!) of relatively simple and straightforward (i.e. "elegant") solution.

Right now in fact, I'll settle for any type of workable solution! Once I get something that works, I can try to re-work it to try and make it a bit more simple and elegant. But first things first as they say: get something -- anything -- that works!. Simplicity can come later.

Fish-Git commented 2 years ago

Okay, I think I have a workable solution. It builds cleanly and runs correctly.

Whether or not it's a simple and straightforward/elegant solution is an entirely different matter of course. ;-)

IMO it's actually not that bad considering what needed to be done in order to pull it off. It's certainly not as simple as I'd like it to be, but in all honesty I couldn't think of any easier way to do it. It's definitely not elegant that's for sure, but then it's certainly not an overly complicated "kludge" either. All in all I think it came out rather well! But then that's just my own personal opinion too. Others are of course free to feel differently. *`()`**

Here's the patch; thoughtful feedback appreciated:

BEAR.zip

The diff_hyperion-0_hyperion-1_Aug-02-2022 15-41.htm file is an EDP diff report so you can more easily see the changes I had to make. (I personally find visual diff tools easier to use than manually reviewing a patch/diff file)

The BEAR.txt file is a simple script I threw together to verify my changes accomplish their intended goal.

The patch should be applied to the current develop git branch. Note too that I have made a few commits in the past few days, so be sure to do a refresh (pull) before applying. Also note that *nix users may also want to do a dos2unix on the patch file before applying it too.

I'd prefer that all developers give it a try, but I'll live with just one (or even zero!) if necessary. <fingers crossed>

I'll wait a (couple of? few?) days to give you all a chance to try it and provide thoughtful feedback if so inclined before I go ahead and commit it.

That's it.

Thanks for listening.

*`()` I certainly hope you all like it too, but then I'll admit it's quite common for programmers to not care too much for other programmer's coding/style too, so feel free to nit-pick it if you feel you must. I hope you don't, but please don't let me stop you if you feel so inclined to do so! I value all** feedback, whether thoughful or not!

mcisho commented 2 years ago

Hi Fish, I tried your BEAR patch and VM/370 Community Edition, VM/ESA 2.4 and z/VM 6.1 all started and shutdown. I also tried ESA/390, z/OS and zLinux. Other than zLinux none of the system did anything productive, they simple came up and went down.

btw, I'm a *nix user and I didn't do a '''dos2unix''' before the apply, I guess git is line ending agnostic.

wrljet commented 2 years ago

Fish,

Only tried it out on one system so far, Debian 10 x86_64 with gcc 8.3.

make check fails with:

>>>>> line 77021: Storage compare mismatch.
                  Want: R:0000000000000220 00004000 00000000 00000000 00000000  "Double words two and three"
                  Got:  R:0000000000000220 00004000 00000000 40000000 00000000
>>>>> line 77037: Gpr 3 compare mismatch.
                  Want: 2
                  Got:  0000000000000003
>>>>> line 77039: Gpr 5 compare mismatch.
                  Want: 2
                  Got:  0000000000000003
Test "STFL and STFLE":  9 OK compares.  3 failures.

From allTests.out:

13:23:51 HHC01603I * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
13:23:51 HHC01603I *
13:23:51 HHC01603I * Start of test file ../tests/stfl.tst date 2022-08-03 13:17:23.143201523 -0400 in /home/bill/herctest2/hyperion/tests
13:23:51 HHC01603I *
13:23:51 HHC01603I *Testcase STFL and STFLE
13:23:51 HHC01603I *
13:23:51 HHC01603I * Because this Testcase uses STFLE it will fail on ESA hardware.
13:23:51 HHC01603I *
13:23:51 HHC01603I sysclear
13:23:51 HHC00811I Processor CP00: architecture mode ESA/390
13:23:51 HHC02311I sysclear completed
13:23:51 HHC01603I archmode    z/Arch
13:23:51 HHC00811I Processor CP00: architecture mode z/Arch
13:23:51 HHC02204I ARCHLVL        set to z/Arch
13:23:51 HHC01603I loadcore    ../tests/stfl.core
13:23:51 HHC02250I Loading file ../tests/stfl.core to location 0
13:23:51 HHC02249I Operation complete
13:23:51 HHC01603I *Program    6
13:23:51 HHC01603I defsym    FW1   F3F6FFFB    # Facilities 000-031
13:23:51 HHC01603I defsym    FW2   FCFDFC24    # Facilities 032-063
13:23:51 HHC01603I defsym    FW3   207C4000    # Facilities 064-095
13:23:51 HHC01603I defsym    FW4   00000000    # Facilities 096-127
13:23:51 HHC01603I defsym    FW5   00004000    # Facilities 128-159
13:23:51 HHC02336I Script 1: test: test starting
13:23:51 HHC02339I Script 1: test: duration limit: 0.200000 seconds
13:23:51 HHC02228I restart key pressed
13:23:51 HHC02333I Script 1: test: running...
13:23:51 HHC00801I Processor CP00: Specification exception code 0006 ilc 4
13:23:51 HHC02324I PSW=0000000000000000 00000000000002B4 INST=B2B00001     STFLE 1(0)                   store_facility_list_extended
13:23:51 HHC02326I R:0000000000000001:K:06=000000 00000000 00000000 00000000 00 ................
13:23:51 HHC02269I R0=0000000000000003 R1=0000000000000000 R2=0000000030000000 R3=0000000000000003
13:23:51 HHC02269I R4=0000000000000000 R5=0000000000000003 R6=0000000000000000 R7=0000000000000000
13:23:51 HHC02269I R8=0000000000000000 R9=0000000000000000 RA=0000000000000000 RB=0000000000000000
13:23:51 HHC02269I RC=0000000000000000 RD=0000000000000000 RE=0000000000000000 RF=0000000000000000
13:23:51 HHC02271I C0=0000000000000060 C1=0000000000000000 C2=0000000000000000 C3=0000000000000000
13:23:51 HHC02271I C4=0000000000000000 C5=0000000000000000 C6=0000000000000000 C7=0000000000000000
13:23:51 HHC02271I C8=0000000000000000 C9=0000000000000000 CA=0000000000000000 CB=0000000000000000
13:23:51 HHC02271I CC=0000000000000000 CD=0000000000000000 CE=00000000C2000000 CF=0000000000000000
13:23:51 HHC00809I Processor CP00: disabled wait state 0002000000000000 0000000000000000
13:23:51 HHC02334I Script 1: test: test ended
13:23:51 HHC02338I Script 1: test: actual duration: 0.000156 seconds
13:23:51 HHC01603I *Compare
13:23:51 HHC01603I r c8.4
13:23:51 HHC02290I A:0000000000000000  K:06
13:23:51 HHC02290I R:00000000000000C0                    F3F6FFFB                   36..
13:23:51 HHC01603I *Want "Basic STFL." F3F6FFFB
13:23:51 HHC01603I r 200.10
13:23:51 HHC02290I A:0000000000000000  K:06
13:23:51 HHC02290I R:0000000000000200  F3F6FFFB FCFDFC24 00000000 00000000  36..............
13:23:51 HHC01603I *Want "First two doublewords of truncated facilities list" F3F6FFFB FCFDFC24 00000000 00000000
13:23:51 HHC01603I r 210.10
13:23:51 HHC02290I A:0000000000000000  K:06
13:23:51 HHC02290I R:0000000000000210  F3F6FFFB FCFDFC24 207C4000 00000000  36.......@ .....
13:23:51 HHC01603I *Want "First two doublewords of extended facilities list"  F3F6FFFB FCFDFC24 207C4000 00000000
13:23:51 HHC01603I r 220.10
13:23:51 HHC02290I A:0000000000000000  K:06
13:23:51 HHC02290I R:0000000000000220  00004000 00000000 40000000 00000000  .. ..... .......
13:23:51 HHC01603I *Want "Double words two and three" 00004000 00000000 00000000 00000000
13:23:51 HHC01603I r 230.10
13:23:51 HHC02290I A:0000000000000000  K:06
13:23:51 HHC02290I R:0000000000000230  00000000 00000000 00000000 00000000  ................
13:23:51 HHC01603I *Want "Double words four and five" 00000000 00000000 00000000 00000000
13:23:51 HHC01603I r 240.10
13:23:51 HHC02290I A:0000000000000000  K:06
13:23:51 HHC02290I R:0000000000000240  00000000 00000000 00000000 00000000  ................
13:23:51 HHC01603I *Want "Double words six and seven" 00000000 00000000 00000000 00000000
13:23:51 HHC01603I gpr
13:23:51 HHC02269I General purpose registers
13:23:51 HHC02269I R0=0000000000000003 R1=0000000000000000 R2=0000000030000000 R3=0000000000000003
13:23:51 HHC02269I R4=0000000000000000 R5=0000000000000003 R6=0000000000000000 R7=0000000000000000
13:23:51 HHC02269I R8=0000000000000000 R9=0000000000000000 RA=0000000000000000 RB=0000000000000000
13:23:51 HHC02269I RC=0000000000000000 RD=0000000000000000 RE=0000000000000000 RF=0000000000000000
13:23:51 HHC01603I *Gpr 2 30000000
13:23:51 HHC01603I *Gpr 3 2
13:23:51 HHC01603I *Gpr 4 0
13:23:51 HHC01603I *Gpr 5 2
13:23:51 HHC01603I *Done

Bill

Fish-Git commented 2 years ago

make check fails with:

Test "STFL and STFLE": 9 OK compares. 3 failures.

Yes, I know. I forgot to update the stfl.tst test to account for the new facility bit that is now enabled by default (193 = BEAR Enhancement). Sorry!

Rather than upload a whole new BEAR patch, here's the patch for just the tests/stfl.tst file instead:

stfl.tst.zip

--- hyperion-1/tests/stfl.tst   2022-01-06 16:29:25.841818300 -0800
+++ hyperion-0/tests/stfl.tst   2022-08-03 11:48:13.402707000 -0700
@@ -11,6 +11,8 @@
 # by John P. Hartmann.  You can use it for anything you like,
 # as long as this notice remains.
 #----------------------------------------------------------------------
+# Minor cosmetic updates (e.g. use 'defsym's, etc) by Fish
+#----------------------------------------------------------------------

 sysclear
 archmode    z/Arch
@@ -24,9 +26,15 @@
 #----------------------------------------------------------------------
 defsym    FW1   F3F6FFFB    # Facilities 000-031
 defsym    FW2   FCFDFC24    # Facilities 032-063
+
 defsym    FW3   207C4000    # Facilities 064-095
 defsym    FW4   00000000    # Facilities 096-127
+
 defsym    FW5   00004000    # Facilities 128-159
+defsym    FW6   00000000    # Facilities 160-191
+
+defsym    FW7   40000000    # Facilities 192-223
+defsym    FW8   00000000    # Facilities 224-255
 #----------------------------------------------------------------------

 runtest 0.1
@@ -43,7 +51,7 @@
 *Want "First two doublewords of extended facilities list"  $(FW1) $(FW2) $(FW3) $(FW4)

 r 220.10
-*Want "Double words two and three" $(FW5) 00000000 00000000 00000000
+*Want "Double words two and three" $(FW5) $(FW6) $(FW7) $(FW8)

 r 230.10
 *Want "Double words four and five" 00000000 00000000 00000000 00000000
@@ -53,14 +61,19 @@

 gpr
 *Gpr 2 30000000
-*Gpr 3 2
+*Gpr 3 3        # (adjust if needed when new facilities added)
 *Gpr 4 0
-*Gpr 5 2
+*Gpr 5 3        # (adjust if needed when new facilities added)

 *Done

 #----------------------------------------------------------------------
+#
 #                         (FOR REFERENCE)
+#
+#             NOTE: the below is no longer current
+#            as it hasn't been updated in a long while
+#
 #----------------------------------------------------------------------
 #
 # BC12 2828: z/VM 5.4 -- returns 2 doublewords

Fish-Git commented 2 years ago

I forgot to update the stfl.tst test to account for the new facility bit that is now enabled by default

Since we're on the subject...

I question our actual need for this particular test. It's a stupid test IMO. All it does is verify that both the STFL and STFLE instructions both work correctly (which we've already proven countless times already!), and each are so dirt simple and unlikely to ever be changed or enhanced by IBM, that IMHO it really shouldn't be part of our runtest suite at all. It's the kind of test you only ever need to run once when an instruction is first written to prove it works right, but then never need to run again. It's like testing a Load/Store Register instruction after every build for crying out loud!

And given that it's so easy for people (myself included/especially!) whenever support for a new facility is added (thereby causing, just like it did this time, for make check to fail), it seems silly to me for it to remain part of our runtest suite of tests.

IMHO we should get rid of it entirely. It's a silly/stupid test to begin with and only ends up getting in the way.

What does everyone else think?

mcisho commented 2 years ago

While I would agree that testing the contents of the returned facility bits is fairly pointless, I do not agree with jettisoning the tests, particularly for STFLE. The outputs from STFLE, i.e. the condition code and the number of doublewords returned, depend on the input to STFLE, which imho should be still be tested.

Fish-Git commented 2 years ago

Hmmm... Okay, that makes sense I guess. Remove just the testing of the bit values but leave everything else as-is (i.e. keep the Specification Exception test in the program and the registers tests in the script). Thanks Ian.

wrljet commented 2 years ago

While I would agree that testing the contents of the returned facility bits is fairly pointless, I do not agree with jettisoning the tests, particularly for STFLE. The outputs from STFLE, i.e. the condition code and the number of doublewords returned, depend on the input to STFLE, which imho should be still be tested.

I agree with that.

wrljet commented 2 years ago

I've tested a small portion of an MVS 3.8 SYSGEN with S/370 arch, and it worked.

Fish-Git commented 2 years ago

I agree with that.

Aye, as do I too.

So it's settled then. I'll go ahead and do that.

I'll probably commit ALL of my changes (stfl.tst included) some time tomorrow then.

Thanks guys.

Fish-Git commented 2 years ago

mcisho said:

btw, I'm a *nix user and I didn't do a '''dos2unix''' before the apply, I guess git is line ending agnostic.

I have no idea.

What I do know is the git client I myself am using on Windows is not: you have to configure it properly or else you end up messing things up:

I use TortoiseGit, which uses Git for Windows under the covers, and a choice needs to be made when you install or upgrade Git for Windows:

autocrlf

And the choice you make interacts with your repository's .gitattributes settings and impacts how files thus both appear in your working tree as well as how they're committed to the repository.

OR... it might just be the version/flavor of the command line diff and patch tools I'm using, which might be Windows ports of the *nix versions.

OR... it might be the Windows batch file script I wrote that creates the diff/patch files, which as I recall always does a u2d to ensure my patch files are always created with Windows CRLF line endings.

I'm not actually sure which it is. All I do know is if the line endings on my .diff (.patch) files aren't right, then patch.exe messes up and the file(s) end up with mixed line endings, which screws up the repository royally if I then commit it.

So I was just trying being cautious (nice) by mentioning it to you guys, that's all. No harm no foul.

mcisho commented 2 years ago

I suppose the zArch related facility bits might also need to be tested, depending on the arch mode existing when the instruction is issued?

ivan-w commented 2 years ago

I suppose the zArch related facility bits might also need to be tested, depending on the arch mode existing when the instruction is issued?

Remember...

Because a bit is not set doesn't mean it is not supported (it just means you should not rely on it).. If a certain facility bit is not set does NOT mean that an instruction implementing the facility will always throw an operation exception.

It only works the other way around - If the bit is set it means it is guaranteed to work as advertised !

mcisho commented 2 years ago

True enough, but I wasn't really thinking in terms of facility related instructions. I was thinking that if Hercules is running in zArch mode it should have facility bits 1 and 2 turned on, but if its not running in zArch mode it should have facility bit 2 turned off.

ivan-w commented 2 years ago

What? That's not the case????

Of course that should be it. If it isn't, it's a bug! (and I mean a BIG one!)

Now, just because z/Arch is not in effect doesn't mean that z/Arch-only facilities should not be set to 1 when the system is not running in z/Arch mode. That's different! Actually they should be set to 1!

A bit being set as a facility just indicates it it's available on the system. However, if it is only available in z/Arch mode (by specification) and the system is not in z/Arch mode, then the behavior is undefined when attempting to use that facility. (basically: anything goes!)

Like:

Q: Can you do a somersault when sober? A: Sure Q: Are you drunk A1: Yes / A2 : No Q: Do a somersault A1: Sometimes it does, sometimes it doesn't! / A2 : Does somersault

mcisho commented 2 years ago

No, I'm not suggesting Hercules isn't setting the bits correctly, I'm just saying that if we have a test that tests the STFL(E) instructions perhaps they should test the mode related bits.

ivan-w commented 2 years ago

No: STFLE doesn't set bits regarding the mode it is running currently (except for bit 2). Only for what the system is capable of.

mcisho commented 2 years ago

Err... Isn't that what I said earlier, when I mentioned bits 1 and 2?

ivan-w commented 2 years ago

Yes. Bit 2 is the ONLY bit that needs to switch depending on the architecture mode in which the system is running. The other bits should reflect exactly (or as instructed) the facilities available regardless of the current operating mode. (Yeah and bit 1 too).

mcisho commented 2 years ago

Good, I think we agree. I obviously should have been more explicit in my suggestion a couple of hours ago.

ivan-w commented 2 years ago

We figured it out, eventually!

s390guy commented 2 years ago

Sorry to be late to this party...

I respectfully disagree with jettisoning the test of reported facility bits. Yes, it might be a "pain". But all regression tests fall into that category. All regression tests should succeed if everything is correct. The test did what it was designed to do. The whole purpose of the regression test is to validate proper functionality. Certainly, the test might need adjustment on occasion.

And in the STFL(E) cases, we might argue about which bits should be tested. The facility bit tests should remain, IMO.

Harold Grovesteen

Fish-Git commented 2 years ago

Harold wrote:

... And in the STFL(E) cases, we might argue about which bits should be tested. The facility bit tests should remain, IMO.

Hmmm... So you are suggesting our test should be modified (enhanced) to not do a straight comparison of ALL facility bits, but rather just verify that a specific subset of the facility bits are set correctly instead? Yes? (only bits 1 and 2 perhaps?)

The test did what it was designed to do. The whole purpose of the regression test is to validate proper functionality.

I beg to differ. The test failed not because Hercules wasn't functioning correctly, but rather because the test itself was bad. Thus the test was not actually validating proper functionality. Hercules was functioning properly. It was the test that was buggy, not Hercules. That's a horse of a different color.

Certainly, the test might need adjustment on occasion.

Which is exactly what I am trying to avoid as it is antithetical to how our tests are supposed to work. Tests, once written, should not need to be "adjusted" (unless they are found to contain a bug that fails to catch the flaw/flaws they were meant to catch of course). Tests IMO should be fixed in stone and, if they are written correctly, should never need adjusting.

So IMO our existing test is flawed in its very design, in that it is constantly "breaking" and thus constantly needing to be "fixed".

And I'm at a complete loss as to how to fix it permanently so that it doesn't constantly break each time support for a new facility is coded. (Except maybe to do what you suggested and only test a few very specific bits in, say, the first word?).

s390guy commented 2 years ago

As I read the test, it was looking for a specific set of facility bits as supported by a specific product implementation. Differing beliefs as to what should be tested are possible. And I think that is where I am headed.

The availability by Hercules of a capability it supports must reflect that availability within the bits when an architecture defines such bit(s). Validation of that setting is required. Hence the test.

And yes, as new facilities are supported, the test will change. There is no way around that. Unlike instruction related tests that should just work after being implemented, the facilities test will require change until it becomes clairvoyant.

However, a different test more suited to Hercules specific requirements is another topic entirely. Facility bits is an ongoing mess. How to make it less of a mess is a different topic than the one to which I was responding which was testing just 2 bits. At least how I understood the discussion.

Fish-Git commented 2 years ago

As I read the test, it was looking for a specific set of facility bits as supported by a specific product implementation.

Which leads to the question, which "specific set of bits" should the test be testing? ALL of the bits that STFLE is capable of returning? (which necessarily varies over time as support for new facilities is added to Hercules?) Or only some bits? (i.e. just a specific subset of bits, such as just bits 0, 1 and 2? Or maybe just the first 32 bits? etc.)

The availability by Hercules of a capability it supports must reflect that availability within the bits when an architecture defines such bit(s). Validation of that setting is required. Hence the test.

Yes of course. I agree. But my point is we can prove Hercules does indeed return the proper bit setting for a given facility or facilities WITHOUT having to test that capability for all of the bits!

For example, facility bit 9: the Sense-Running-Status Facility. We support that, so the bit should normally be on by default, and we can easily verify proper Hercules functionality by disabling that specific facility in the script (facility disable 009_SENSE_RUN_STATUS command) and verifying the bit is now off. That IMHO should be good enough to prove that Hercules is functioning properly as far as returning the proper facility bit settings for facilities (since the logic that handles the setting/clearing of facility bits is identical regardless of which bit you specify).

So why bother checking that functionality for all bits? That's like verifying an instruction sets a proper greater-than or less-than condition code for all possible values! That's nuts! Isn't comparing just one specific pair of values good enough to confirm the instruction is setting the proper condition code? Isn't checking one STFLE bit good enough to verify STFL/STFLE is indeed functioning properly?

And yes, as new facilities are supported, the test will change. There is no way around that.

I disagree! If we're going to keep the test (and based on feedback, we are going to keep it!), then why can't we fix it so that it doesn't keep breaking? Why can't we change it so that it does indeed verify that STFL/STFLE are indeed working properly without having to test the full set of facility bits? Why can't we do that?

That would relieve us from having to be constantly updating ("fixing") the test each time a new facility was added to Hercules. There's no need IMHO to verify Hercules will indeed return the proper bit setting for that new facility given that we've already confirmed a million times over that it will do so for any given facility! Don't you see?

Facility bits is an ongoing mess.

Oh?!! How so?

How to make it less of a mess is a different topic than the one to which I was responding which was testing just 2 bits.

So you are arguing the test should only be verifying bits 1 and 2? Yes? Is that your position now? Because that's at least a step in the right direction IMO. Verifying just those two bits (but not any others would certainly resolve the problem too, so we're definitely making progress in our discussion I think.

How do the rest of you feel? Keep the current design, thereby requiring us to be constantly "fixing" the test each time support for a new facility is added? Or fix the damn test once and for all by changing it to only check one (two? three?) specific bit(s)? (since doing so thus proves that Hercules will indeed set any bit properly)

s390guy commented 2 years ago

Facility bits is an ongoing mess.

Oh?!! How so?"

It is mess due to the impact the bits continually have on users and those trying to use various operating systems on Hercules. Not a Hercules problem per se, but one users seem to continuously run into. And I am not suggesting that this test is the way to make that better. I do not think so.

Why can't we change it so that it does indeed verify that STFL/STFLE are indeed working properly without having to test the full set of facility bits? Why can't we do that?

And there is the crux of the discussion. Is this a test of the STFL/STFLE instructions, or a test of the instructions including the results of those instructions. Let us take as an example a test of A. Not only is the instruction tested but also the results of that instruction. In my view a test of STFL/STFLE includes the RESULTS of stored bits. To not test the bits is only a partial test of the instruction functionality. So, no I have not changed my view about bit testing.

since the logic that handles the setting/clearing of facility bits is identical regardless of which bit you specify

That is a developer's view of QA testing. I did some of that for a while, so I know. The purpose of QA testing is to find changes in functionality that might be introduced by any change. You are unlikely to cause such an issue. Others, who knows? And that is why you have QA testing of all results.

I am not at all suggesting that the right thing to do is keep the test we have. But these are my views on a perhaps changed or new test.

If everyone thinks a dumb'd down test is the way to go, well so be it. You asked for opinions and you have mine.

Harold

wrljet commented 2 years ago

Tested and working (S/370 MVS SYSGEN) on macOS 13.0 Ventura (beta) on x86-64 VMware.

Fish-Git commented 2 years ago

Closing.

SDL-Hercules-390 / hyperion

PROBLEM: z/Arch reuse of S/370 opcodes for entirely new purpose #493

FYI: