SDL-Hercules-390 / hyperion

The SDL Hercules 4.x Hyperion version of the System/370, ESA/390, and z/Architecture Emulator

Vector Facility for z/Architecture #650

Open Fish-Git opened 5 months ago

Fish-Git commented 5 months ago

This issue was created for discussing development of the z/Architecture Vector Facility.

All discussion regarding this effort should take place HERE, in THIS GitHub Issue, and not in Issue #77, which is a generic GitHub Issue regarding all yet-to-be-developed z/Architecture facilities.

Please refrain from discussing z/Architecture Vector Facility development anywhere else, and discuss it here instead.

Thank you.

salva-rczero commented 4 months ago

There were lots of warnings, such as:

../zvector.c: In function ‘z900_vector_load’:
../zvector.c:184:17: warning: variable ‘m3’ set but not used [-Wunused-but-set-variable]
  184 |     int     v1, m3, x2, b2;
      |                 ^~

If you attach the complete list, I'm sure we could get them fixed.

Most of the unused warnings are due to pending implementation:

//
// TODO: insert code here
//
if (1) ARCH_DEP( program_interrupt )( regs, PGM_OPERATION_EXCEPTION );

So, "pragma ignore" may be a good temporary solution.

But in a few of them (vector load, vector store, ...) m3/m4 are alignment hints for real mainframe hardware, which I believe should not affect Hercules.

I tried:

#if defined(__GNUC__)
    int m3 __attribute__((unused)); // Alignment hint
#else
   int m3;
#endif

and it works. Not sure if too ugly.

Another option may be to write extra flavors of the DECODER macros. Uglier?

I'll appreciate your comments.

Fish-Git commented 4 months ago

Most of the unused warnings are due to pending implementation

Quite right. So unless they really bother you, I suggest just ignoring them for now. They should all go away once implementation is complete.

So, "pragma ignore" may be a good temporary solution.

Agreed.  With emphasis on temporary.
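As a rough illustration of the temporary "pragma ignore" idea, the sketch below scopes a diagnostic pragma around one unfinished stub. The function name `vector_stub_sketch` and its body are hypothetical and not taken from zvector.c; only the `#pragma GCC diagnostic` directives and the `-Wunused-but-set-variable` option are real compiler features (gcc, and clang in GCC-compatible mode).

```c
/* Silence -Wunused-but-set-variable for unfinished stubs only.
 * Guarded so that compilers without these pragmas (e.g. MSVC) skip them. */
#if defined(__GNUC__)
#  pragma GCC diagnostic push
#  pragma GCC diagnostic ignored "-Wunused-but-set-variable"
#endif

/* Hypothetical stub: decodes two fields but only uses one of them so far. */
int vector_stub_sketch(unsigned char byte4)
{
    int m3, rxb;

    m3  = byte4 >> 4;       /* set but never read: would normally warn */
    rxb = byte4 & 0x0F;

    /* TODO: the real implementation would go here */
    return rxb;
}

#if defined(__GNUC__)
#  pragma GCC diagnostic pop    /* restore warnings for finished code */
#endif
```

Keeping the push/pop pair tight around the stubs means the warning comes back automatically for code outside the region, which fits the "temporary" emphasis.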

I tried:

#if defined(__GNUC__)
    int m3 __attribute__((unused)); // Alignment hint
#else
   int m3;
#endif

and it works. Not sure if too ugly.

Too ugly.

A better solution would be to simply use our existing UNREFERENCED(x) macro.
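A minimal sketch of that approach follows. `UNREFERENCED_SKETCH` is a hypothetical stand-in written for this example; the real Hercules `UNREFERENCED(x)` macro may be defined differently. The field extraction assumes the VRX layout as I read it in the Principles of Operation (V1 in the high nibble of byte 1, M3 in the high nibble of byte 4, RXB in the low nibble of byte 4) and should be checked against the manual.

```c
#include <stdint.h>

/* Hypothetical stand-in for Hercules's UNREFERENCED(x) macro. */
#define UNREFERENCED_SKETCH(x)   ((void)(x))

/* Returns the V1 register number of a 6-byte VRX-style instruction.
 * m3 is decoded like every other field, then explicitly marked unused. */
int vector_load_v1_sketch(const uint8_t inst[6])
{
    int v1, m3;

    /* 4-bit V1 field, extended to 5 bits by the first RXB bit */
    v1 = (inst[1] >> 4) | ((inst[4] & 0x08) << 1);

    m3 = inst[4] >> 4;           /* alignment hint, ignored by an emulator */
    UNREFERENCED_SKETCH(m3);     /* counts as a use, so no warning */

    return v1;
}
```

Unlike the `__attribute__((unused))` variant, this needs no compiler-specific `#if` at each declaration.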

Another option may be to write extra flavors of the DECODER macros.

Oh HELL no!

Uglier?

Definitely!

I'll appreciate your comments.

You have them.  :)

mcisho commented 4 months ago

@salva-rczero Well done, you're making good progress. Linux used to panic about VL and VA instructions, now it's moved on to VSLDB (which interestingly it recovers from) and VERLL instructions.

Now that the FP changes appear to be done, I'm volunteering to help with some of the 80-odd instructions still to be completed.

salva-rczero commented 4 months ago

@mcisho What's your testing environment? I tried with Linux x86 and it works until it reaches a PGM_OPERATION_EXCEPTION for a non-implemented instruction.

VSLDB should be working; VERLL should throw a PGM_OPERATION_EXCEPTION.

JamesWekel commented 4 months ago

wrljet Which of "yous guy" have Raspberry Pi(s) or Macs?

I have a Raspberry PI 4, a Raspberry PI 5 and Intel Nuc I5 all running "Armbian 24.2.5 jammy" Ubuntu.

Jim

mcisho commented 4 months ago

@salva-rczero My host is Fedora 40 x86_64 and the guest was Fedora 36 s390x. As you say, it works until it reaches a non-implemented instruction, but that non-implemented instruction takes longer to get to. However, having looked a little more closely at the log, VSLDB appears not to be working, whereas VERLL does.

[    2.229015] Linux version 6.2.15-100.fc36.s390x (mockbuild@buildvm-s390x-18.s390.fedoraproject.org) (gcc (GCC) 12.2.1 20221121 (Red Hat 12.2.1-4), GNU ld version 2.37-37.fc36) #1 SMP Thu May 11 15:47:55 UTC 2023
[    2.229046] setup: Linux is running natively in 64-bit mode
  ::
[    3.830476] Key type asymmetric registered
[    3.830510] Asymmetric key parser 'x509' registered
HHC00801I Processor CP00: Operation exception interruption code 0001 ilc 6
HHC02324I PSW=0704E00180000000 0000000024636FF0 INST=E72220080077 VSLDB 2,2,2,8,0              vector_shift_left_double_by_byte
HHC02326I V:0000000000FA8008:R:0000000000FA8008:K:06=00000000 00000000 00000000 00000000  ................
HHC02326I V:0000000000000077:R:0000000000000077:K:06=00 00000000 00000000 00602000 000010 ..........-.....
[    5.997707] Freeing initrd memory: 17292K
[    6.061102] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 247)
[    6.061509] io scheduler mq-deadline registered
[    6.061546] io scheduler kyber registered
[    6.061890] io scheduler bfq registered
[    6.086454] illegal operation: 0001 ilc:3 [#1] SMP
[    6.086523] Modules linked in:
[    6.086561] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.2.15-100.fc36.s390x #1
[    6.086614] Hardware name: HRC 2817 EMULATOR EMULATOR (LPAR)
[    6.086649] Krnl PSW : 0704e00180000000 0000000024636ff6 (chacha20_vx+0x296/0x820)
[    6.086737]            R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 RI:0 EA:3
[    6.086811] Krnl GPRS: 0000037f0000000a ffffffffffffff60 0000000000fa8000 000000002600b0a6
[    6.086861]            0000000000000109 0000037fffb1bc68 0000037fffb1bc88 0000000025c71780
[    6.086909]            0000037fffb1bc68 000000002600b0a6 0000000000fa8000 0000000000000109
[    6.086955]            00000000006bc200 0000000000000109 00000000246364c4 0000037fffb1b728
[    6.087058] Krnl Code: 0000000024636fe4: e71100072c33        verll   %v17,%v17,7,2
[    6.087058]            0000000024636fea: e75500072c33        verll   %v21,%v21,7,2
[    6.087058]           #0000000024636ff0: e72220080077        vsldb   %v2,%v2,%v2,8
[    6.087058]           >0000000024636ff6: e76660080077        vsldb   %v6,%v6,%v6,8
[    6.087058]            0000000024636ffc: e7aaa0080077        vsldb   %v10,%v10,%v10,8
[    6.087058]            0000000024637002: e7eee0080077        vsldb   %v14,%v14,%v14,8
[    6.087058]            0000000024637008: e72220080e77        vsldb   %v18,%v18,%v18,8
[    6.087058]            000000002463700e: e76660080e77        vsldb   %v22,%v22,%v22,8
[    6.087508] Call Trace:
[    6.087575]  f6>] chacha20_vx+0x296/0x820
[    6.087634] Last Breaking-Event-Address:
[    6.087663]  e>] chacha20_crypt_s390.constprop.0+0x6e/0xe0
[    6.087724] ---[ end trace 0000000000000000 ]---
[    6.087764] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b

salva-rczero commented 4 months ago

@mcisho Ok, VSLDB throws an OPERATION_EXCEPTION. Same for me:

17:27:36 HHC00801I Processor CP00: Operation exception interruption code 0001 ilc 6
17:27:36 HHC02324I PSW=0000000180000000 0000000000000322 INST=E72220080077 VSLDB 2,2,2,8,0              vector_shift_left_double_by_byte
17:27:36 HHC02326I R:0000000000000008:K:06=00000000 00000000 00000000 00000000  ................
17:27:36 HHC02326I R:0000000000000077:K:06=30 000A0000 00000038 00000000 000000 ................
17:27:36 HHC02269I R0=0000000000000000 R1=0000000000000000 R2=0000000000000000 R3=0000000000000000
17:27:36 HHC02269I R4=0000000000000000 R5=0000000000000000 R6=0000000000000000 R7=0000000000000000
17:27:36 HHC02269I R8=0000000000000000 R9=0000000000000000 RA=0000000000000000 RB=0000000000000000
17:27:36 HHC02269I RC=0000000000000200 RD=0000000000001200 RE=0000000000000000 RF=0000000000000000
17:27:36 HHC02266I VR00=0123456789abcdef.fedcba9876543210 VR01=5555555555555555.5555555555555555
17:27:36 HHC02266I VR02=5555555555555555.5555555555555555 VR03=0000000000000000.0000000000000000
...

but not the kernel panic. What is your test case?
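For reference, the failing instruction bytes in both logs (E72220080077) can be decoded by hand, and the operation itself is small. The sketch below is only an illustration: the names `vsldb_fields_sketch`, `decode_vsldb_sketch` and `vsldb_sketch` are hypothetical, the field positions assume the VRI-d layout as I read it in the Principles of Operation, and the reserved high bits of I4 are simply masked off here rather than checked.

```c
#include <stdint.h>
#include <string.h>

typedef struct { int v1, v2, v3, i4; } vsldb_fields_sketch;

/* Pull the operand fields out of a 6-byte VSLDB instruction.
 * The RXB nibble supplies the fifth (high) bit of each register number. */
vsldb_fields_sketch decode_vsldb_sketch(const uint8_t inst[6])
{
    vsldb_fields_sketch f;
    uint8_t rxb = inst[4] & 0x0F;

    f.v1 = (inst[1] >> 4)   | ((rxb & 0x8) ? 16 : 0);
    f.v2 = (inst[1] & 0x0F) | ((rxb & 0x4) ? 16 : 0);
    f.v3 = (inst[2] >> 4)   | ((rxb & 0x2) ? 16 : 0);
    f.i4 = inst[3];
    return f;
}

/* Shift left double by byte: concatenate v2 and v3 (32 bytes), shift
 * left by i4 bytes and keep the leftmost 16 bytes of the result. */
void vsldb_sketch(uint8_t out[16], const uint8_t v2[16],
                  const uint8_t v3[16], int i4)
{
    uint8_t cat[32];

    memcpy(cat,      v2, 16);
    memcpy(cat + 16, v3, 16);
    memcpy(out, cat + (i4 & 15), 16);   /* temp buffer: out may alias v2/v3 */
}
```

With these assumptions E72220080077 decodes to registers 2,2,2 and a shift of 8, and E72220080e77 to registers 18,18,18, which agrees with the disassembly in the kernel log. When V2 and V3 are the same register, as in chacha20_vx, the operation amounts to a byte rotate.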

mcisho commented 4 months ago

I have various Linuxes to drive different Hercules network interfaces, and none of the Linuxes use vector instructions. I updated one of them the other day to a kernel that does use vector instructions, just to see what happens. So, not a test case exactly, simply an interest.

salva-rczero commented 4 months ago

@mcisho ok, please update to the latest version.

On the other hand, the zVector facility includes 21 instructions with floating point functionality:

Vector FP Test Data Class Immediate
Vector FP Multiply and Subtract
Vector FP Multiply and Add
Vector FP Convert to Logical 64-bit
Vector FP Convert from Logical 64-bit
Vector FP Convert to Fixed 64-bit
Vector FP Convert from Fixed 64-bit
Vector FP Load Lengthened
Vector FP Load Rounded
Vector Load FP Integer
Vector FP Compare and Signal Scalar
Vector FP Compare Scalar
Vector FP Perform Sign Operation
Vector FP Square Root
Vector FP Subtract
Vector FP Add
Vector FP Divide
Vector FP Multiply
Vector FP Compare Equal
Vector FP Compare High or Equal
Vector FP Compare High

I started with the first one, "Vector FP Test Data Class Immediate", but I can't use the float128_t, get_sbfp, float32_class, get_float32... types and functions from ieee.c in zvector.c. I'm not sure if this is the right approach. I am very new to how Hercules works with floats.

Can you help me?

salva-rczero commented 4 months ago

Current status

mcisho commented 3 months ago

@salva-rczero I don't think I know much more than you regarding how Hercules works with floats. Changing how the register values were accessed didn't give me much insight into using SoftFloat. However, I'll start looking. Don't expect speedy results, I won't be able to do much this month.

Fish-Git commented 3 months ago

Ian and Salva,

Does it really matter how Hercules's current floating point logic works? All of our current Quality Assurance tests (runtest) for our existing non-vector floating point instructions pass, yes?

I was under the impression that part of the new Vector Facility design was to update our existing floating point registers whenever a corresponding vector register was updated, and vice versa. Yes? That is to say, the only important thing is how the registers in hstructs.h are accessed, yes? And that was fixed (changed) several commits ago, yes? (i.e. the shared registers design: hstructs.h fpr was replaced with vfp instead, such that both normal floating point instructions AND vector instructions now both access the same internal registers storage).
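The shared-register idea can be pictured with a small sketch. The types and accessors below (`vreg_sketch`, `regs_sketch`, `fpr_get_sketch`, `fpr_set_sketch`) are hypothetical and are not the actual hstructs.h definitions; the only architectural fact relied on is that floating point registers 0-15 overlay the leftmost 64 bits of vector registers 0-15.

```c
#include <assert.h>
#include <stdint.h>

/* One 128-bit vector register; d[0] holds the leftmost 64 bits. */
typedef struct { uint64_t d[2]; } vreg_sketch;

/* Hypothetical register file: a single array serves both views. */
typedef struct { vreg_sketch vfp[32]; } regs_sketch;

/* Floating point register n is the leftmost doubleword of vector register n. */
uint64_t fpr_get_sketch(const regs_sketch *regs, int n)
{
    assert(n >= 0 && n < 16);
    return regs->vfp[n].d[0];
}

void fpr_set_sketch(regs_sketch *regs, int n, uint64_t value)
{
    assert(n >= 0 && n < 16);
    regs->vfp[n].d[0] = value;
}
```

Because there is only one copy of the data, a store through either view is immediately visible through the other, and no synchronization step between "FP registers" and "vector registers" is needed.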

And as I said, all existing floating point instructions are working perfectly! Right?

So what's the problem? What's the concern here? What am I missing?

salva-rczero commented 3 months ago

@Fish-Git: FP & vector are working well in the sharedvfp branch.

But there are 21 new instructions that are both vector and floating point (mostly BFP). My concern is how to reuse the FP functionality (loads, arithmetic operations, format checks...) currently in ieee.c(?) from zvector.c.

Any ideas are welcome!

Thanks for your support.

Fish-Git commented 3 months ago

But there are 21 new instructions that are both vector and floating point (mostly BFP). My concern is how to reuse the FP functionality (loads, arithmetic operations, format checks...) currently in ieee.c(?) from zvector.c.

Ah! Yes. I understand now.

Hmmmm...

Any ideas are welcome!

If you could identify what code (i.e. what functionality) you need, then we might be able to tell you which existing floating point functions you need to call.

I would suggest coding something like:

/*-------------------------------------------------------------------*/
/* E7xx VXXX   - Vector whatever...                            [Vxx] */
/*-------------------------------------------------------------------*/
DEF_INST( vector_whatever )
{
    ... VFP stuff...

    /* Call BFP helper function to do whatever... */
    new_or_existing_ieee_bfp_function_to_do_something( ... variables to be passed and returned ... );

    ... continue with VFP stuff...
}

and then documenting the requirements for each of the needed functions (i.e. what they should do, the variables that should be passed to it, the values that should be returned, etc). That is to say, just make up some descriptive name and then define a dummy version of it somewhere. (Maybe with a few simple comments explaining what the function is supposed to accomplish.)

Then hopefully we (one of us, i.e. either Ian or myself or someone else) will be able to identify which existing FP/BFP helper function you need to call, and/or which new FP/BFP helper function we will need to create for you.

Does that make sense?

salva-rczero commented 3 months ago

@Fish-Git: Absolutely!

E74A = VFTCI (vector_fp_test_data_class_immediate) checks whether the BFP data in a vector register satisfies one or more floating point conditions (positive infinity, subnormal number, NaN...).

These are exactly the same checks performed by the float32_class, float64_class and float128_class functions in ieee.c.

I need to call these (or equivalent) functions from zvector.c.

I also need to be able to use/convert float32_t, float64_t & float128_t types too.
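To make the requirement concrete, here is a self-contained sketch of the kind of check involved, for 32-bit (short) BFP elements only. It does not use or reproduce the ieee.c functions; `bfp32_class_bit_sketch` and `vftci32_sketch` are hypothetical names. The 12-bit class mask layout (0x800 = +zero down to 0x001 = -SNaN) and the condition code values are my reading of the Test Data Class description in the Principles of Operation and should be verified against the manual.

```c
#include <stdint.h>

/* Return the single class bit for one short BFP value, using the assumed
 * mask layout: zero, normal, subnormal, infinity, QNaN, SNaN, each as a
 * (+, -) pair from 0x800 down to 0x001. */
uint16_t bfp32_class_bit_sketch(uint32_t bits)
{
    uint32_t sign = bits >> 31;
    uint32_t exp  = (bits >> 23) & 0xFF;
    uint32_t frac = bits & 0x007FFFFF;
    uint16_t pos;                       /* bit for the positive variant */

    if (exp == 0)
        pos = (frac == 0) ? 0x800 : 0x080;          /* zero / subnormal */
    else if (exp == 0xFF)
        pos = (frac == 0)         ? 0x020           /* infinity         */
            : (frac & 0x00400000) ? 0x008           /* quiet NaN        */
            :                       0x002;          /* signaling NaN    */
    else
        pos = 0x200;                                /* normal           */

    return (uint16_t)(sign ? pos >> 1 : pos);
}

/* Test four 32-bit elements against mask i3. Matching elements become all
 * ones, others all zeros. Returns an assumed condition code: 0 = all
 * elements matched, 1 = some matched, 3 = none matched. */
int vftci32_sketch(const uint32_t in[4], uint16_t i3, uint32_t out[4])
{
    int i, hits = 0;

    for (i = 0; i < 4; i++) {
        int match = (bfp32_class_bit_sketch(in[i]) & i3) != 0;
        out[i] = match ? 0xFFFFFFFFu : 0;
        hits += match;
    }
    return (hits == 4) ? 0 : (hits > 0) ? 1 : 3;
}
```

In Hercules itself the classification step would presumably be delegated to the existing float32_class/float64_class/float128_class functions rather than re-implemented like this.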

Thanks again.

Fish-Git commented 3 months ago

Well, the following patch illustrates a VERY Quick and VERY Dirty way to accomplish it, by simply making zvector.c an integral part of ieee.c itself:

--- hyperion-vect-1/ieee.c  2024-04-29 23:15:42.230119500 -0700
+++ hyperion-vect-0/ieee.c  2024-05-11 17:41:00.407165100 -0700
@@ -5587,6 +5587,17 @@

 #endif /* defined( FEATURE_BINARY_FLOATING_POINT ) */

+// PROGRAMMING NOTE: the following essentially makes source file
+// "zvector.c" an integral part of "ieee.c" (i.e. of ourselves),
+// which allows "zvector.c" to more conveniently directly access
+// any function or type or constant, etc, defined within ourself
+// since it ("zvector.c") is essentially just a part of ourself.
+
+#undef  INCLUDING_FROM_IEEE_C
+#define INCLUDING_FROM_IEEE_C
+#include "zvector.c"
+#undef  INCLUDING_FROM_IEEE_C
+
 /*-------------------------------------------------------------------*/
 /*          (delineates ARCH_DEP from non-arch_dep)                  */
 /*-------------------------------------------------------------------*/
--- hyperion-vect-1/zvector.c   2024-05-09 18:57:31.000483500 -0700
+++ hyperion-vect-0/zvector.c   2024-05-11 17:42:04.367277400 -0700
@@ -1,5 +1,6 @@
 /* ZVECTOR.C    (C) Copyright Jan Jaeger, 1999-2012                  */
 /*              (C) Copyright Roger Bowler, 1999-2012                */
+/*              (C) Copyright Salva rczero(?), 2024                  */
 /*              z/Arch Vector Operations                             */
 /*                                                                   */
 /*   Released under "The Q Public License Version 1"                 */
@@ -9,13 +10,14 @@
 /* Interpretive Execution - (C) Copyright Jan Jaeger, 1999-2012      */
 /* z/Architecture support - (C) Copyright Jan Jaeger, 1999-2012      */

-#include "hstdinc.h"
-#define _ZVECTOR_C_
-#define _HENGINE_DLL_
+// PROGRAMMING NOTE: the following essentially makes ourselves
+// ("zvector.c") an integral part of "ieee.c", allowing ourselves
+// ("zvector.c") to more conveniently directly access any function
+// or type or constant, etc, defined in "ieee.c" (since we are
+// essentially a part of it).

-#include "hercules.h"
-#include "opcode.h"
-#include "inline.h"
+#include "hstdinc.h"
+#if defined( INCLUDING_FROM_IEEE_C )

 #if defined( FEATURE_129_ZVECTOR_FACILITY )
 /*-------------------------------------------------------------------*/
@@ -977,6 +979,21 @@
     //
     // TODO: insert code here
     //
+
+
+// example call to float128_class function defined in ieee.c
+    {
+        float128_t  op1;
+        U32         float_class;
+
+        GET_FLOAT128_OP( op1, v1, regs );
+
+        float_class = float128_class( op1 );
+    }
+
+
+
+
     if (1) ARCH_DEP( program_interrupt )( regs, PGM_OPERATION_EXCEPTION );
     //
     ZVECTOR_END( regs );
@@ -3493,17 +3510,4 @@

 #endif /* defined( FEATURE_129_ZVECTOR_FACILITY ) */

-#if !defined( _GEN_ARCH )
-
-  #if defined(              _ARCH_NUM_1 )
-    #define   _GEN_ARCH     _ARCH_NUM_1
-    #include "zvector.c"
-  #endif
-
-  #if defined(              _ARCH_NUM_2 )
-    #undef    _GEN_ARCH
-    #define   _GEN_ARCH     _ARCH_NUM_2
-    #include "zvector.c"
-  #endif
-
-#endif /*!defined(_GEN_ARCH)*/
+#endif // defined( INCLUDING_FROM_IEEE_C )

Granted, it's ugly as hell, but hey, it works!  :)

salva-rczero commented 3 months ago

I was afraid of that. Given my poor understanding of the include structure in Hercules, I would prefer another approach.

How about splitting ieee.c into ieee.h+ieee.c, exposing the necessary types, macros and function declarations, and then including ieee.h from zvector.c?

mcisho commented 3 months ago

Alternatively, the vector fp instructions could be moved from zvector.c to ieee.c? Might be simpler than splitting ieee.c?

As an aside, the following instructions should probably have the _64_bit suffix removed from their function names.

DEF_INST( vector_fp_convert_to_logical_64_bit )
DEF_INST( vector_fp_convert_from_logical_64_bit )
DEF_INST( vector_fp_convert_to_fixed_64_bit )
DEF_INST( vector_fp_convert_from_fixed_64_bit )   

Vector-enhancements facility 1 and 2 introduce new versions of the instructions that use 32-bit vectors.
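The reason a size-neutral name fits better can be sketched as follows: one handler would typically pick the element size from the instruction's format field at run time. The helper names are hypothetical, and the format codes (2 = short/32-bit, 3 = long/64-bit, 4 = extended/128-bit) are my understanding of the Principles of Operation; which codes a given instruction accepts depends on the installed facilities and should be checked there.

```c
/* Bytes per BFP element for an assumed format code; 0 means the code is
 * not one this sketch recognizes (real code would raise an exception). */
int bfp_element_bytes_sketch(int m)
{
    switch (m) {
    case 2:  return 4;     /* short format    */
    case 3:  return 8;     /* long format     */
    case 4:  return 16;    /* extended format */
    default: return 0;
    }
}

/* Number of elements of that format in one 16-byte vector register. */
int bfp_element_count_sketch(int m)
{
    int bytes = bfp_element_bytes_sketch(m);
    return bytes ? 16 / bytes : 0;
}
```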

Addendum

I've tried moving the vector fp instructions to ieee.c, and it seems to work, the various types and functions are usable.

Fish-Git commented 3 months ago

How about splitting ieee.c into ieee.h+ieee.c, exposing the necessary types, macros and function declarations, and then including ieee.h from zvector.c?

Yes, that IS the correct way.

Alternatively, the vector fp instructions could be moved from zvector.c to ieee.c? Might be simpler than splitting ieee.c?

That would work too.

Fish-Git commented 3 months ago

As an aside, the following instructions should probably have the _64_bit suffix removed from their function names.

Agreed.

mcisho commented 3 months ago

I have attached my changes to ieee.c and zvector.c so that you can see what I have done so far, and discuss/decide whether we should continue on this path? The changes to ieee.c add the vector fp instructions and implement some of them, the changes to zvector.c remove the vector fp instructions and add some comments re where they can be found.

Fish-Git commented 3 months ago

QUICK QUESTION:

Is the sharedvfp branch obsolete now? That is to say, is all current VFP development now being done in the normal develop branch? Is the sharedvfp branch "finished"? Has the reason (purpose) for its creation been completed now? I just need some clarity on this. Thanks!

Fish-Git commented 3 months ago

I have attached my changes to ieee.c and zvector.c so that you can see what I have done so far, and discuss/decide whether we should continue on this path?

Looks okay to me, Ian! And IMO yes, it seems to be a valid working path that we should probably continue on. I'm thinking the bulk of the Vector instructions should of course continue to be in zvector.c, with the few exceptions to the rule that deal with floating point moved into ieee.c just like you have them in your example .zip.

mcisho commented 3 months ago

Is the sharedvfp branch obsolete now?

No. The develop branch doesn't have zVector support. If you want to try zVector you need to use the sharedvfp branch, and the latest commit of progress by @salva-rczero was to the sharedvfp branch.

salva-rczero commented 3 months ago

QUICK QUESTION:

Is the sharedvfp branch obsolete now? That is to say, is all current VFP development now being done in the normal develop branch now? Is the sharedvfp branch "finished"? Has the reason (purpose) for its creation been completed now? I just need some clarity on this. Thanks!

For my part, I believe that my contribution to this project has come to an end. I have already warned that I do not have the necessary skills and I find everything related to the discussion/design very difficult. It is better to leave that task to those of you who know it.

Farewell and thank you very much for your time and advice (especially to @Fish-Git).

Good luck and long live to Hercules!

JamesWekel commented 3 months ago

I'm working on the E6 z/vector instructions, which require a lot of changes to the infrastructure, just as the E7 z/vector instructions did. My work is based on the sharedvfp branch. I'm hoping to be at a stable place late next week for a pull request for your review. It will need more review, as ecpsvm.c implements E6 instructions for S/370 which overlap with the new E6 z/vector instructions.

The E6 instructions will be in zvector2.c. Rather than move vector decimal instructions to decimal.c, I was planning on changing some of the functions in decimal.c from static void to void with new function prototypes in opcode.h.
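The linkage change described here can be illustrated in a single file. Everything below is a made-up example: `packed_leading_zero_digits_sketch` and `vector_leading_zero_digits_sketch` are hypothetical names, not functions from decimal.c or zvector2.c, and the comments only mark where each piece would live in a multi-file build.

```c
#include <stdint.h>

/* ---- would live in a shared header (the comment mentions opcode.h) ---- */
int packed_leading_zero_digits_sketch(const uint8_t *dec, int len);

/* ---- would live in decimal.c ------------------------------------------
 * Previously 'static' (internal linkage, invisible to other files);
 * without 'static' the linker can resolve calls from other source files. */
int packed_leading_zero_digits_sketch(const uint8_t *dec, int len)
{
    int ndigits = 2 * len - 1;      /* last nibble is the sign, not a digit */
    int i;

    for (i = 0; i < ndigits; i++) {
        uint8_t byte  = dec[i / 2];
        uint8_t digit = (i % 2 == 0) ? (uint8_t)(byte >> 4)
                                     : (uint8_t)(byte & 0x0F);
        if (digit != 0)
            break;
    }
    return i;                       /* number of leading zero digits */
}

/* ---- would live in zvector2.c ------------------------------------------
 * A vector instruction treating a 16-byte register as packed decimal
 * can now reuse the decimal helper through the shared prototype. */
int vector_leading_zero_digits_sketch(const uint8_t vr[16])
{
    return packed_leading_zero_digits_sketch(vr, 16);
}
```

The trade-off is the usual one: the helper's name becomes part of the program-wide namespace, so it needs to be distinctive enough not to collide with anything else.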

Do we have a consistent type definition for U128? For some instructions, I need to do 128-bit arithmetic.

Jim

Fish-Git commented 3 months ago

I'm working on the E6 z/vector instructions ...

Thank you, James! I still say you should consider becoming an official Hercules developer. Your contributions over the past many months (past year?) have been invaluable.

Do we have a consistent type definition for U128?

AFAIK, type U128 does not exist in Hercules. gcc and clang both support the __int128 type, but unfortunately Microsoft's compiler still does not (even though people have been complaining about it for years now).  :(
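One compiler-neutral way around the missing `__int128` is a two-doubleword struct with explicit carry and borrow handling. The sketch below is only an illustration of that technique; `u128_sketch` and its helpers are hypothetical and are not an existing Hercules type.

```c
#include <stdint.h>

/* Portable 128-bit unsigned value built from two 64-bit halves. */
typedef struct { uint64_t hi, lo; } u128_sketch;

/* a + b, wrapping modulo 2^128. */
u128_sketch u128_add_sketch(u128_sketch a, u128_sketch b)
{
    u128_sketch r;

    r.lo = a.lo + b.lo;
    r.hi = a.hi + b.hi + (r.lo < a.lo);   /* carry out of the low half */
    return r;
}

/* a - b, wrapping modulo 2^128. */
u128_sketch u128_sub_sketch(u128_sketch a, u128_sketch b)
{
    u128_sketch r;

    r.lo = a.lo - b.lo;
    r.hi = a.hi - b.hi - (a.lo < b.lo);   /* borrow from the high half */
    return r;
}

/* Three-way compare: returns -1, 0 or 1. */
int u128_cmp_sketch(u128_sketch a, u128_sketch b)
{
    if (a.hi != b.hi) return (a.hi < b.hi) ? -1 : 1;
    if (a.lo != b.lo) return (a.lo < b.lo) ? -1 : 1;
    return 0;
}
```

On gcc and clang the same interface could be backed by `unsigned __int128` behind an `#if`, keeping the struct version as the fallback for Microsoft's compiler.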

Fish-Git commented 3 months ago

For my part, I believe that my contribution to this project has come to an end. I have already warned that I do not have the necessary skills and I find everything related to the discussion/design very difficult. It is better to leave that task to those of you who know it.

We will miss you, Salva!  :(

Farewell and thank you very much for your time and advice (especially to @Fish-Git).

You are VERY welcome, Salva! We all thank you from the bottom of our hearts for all of the tremendous contributions you have made to Hercules! You are a true Herculean in my book! If you send me your full real name, I will be very happy to add you to our Herculeans list.

Good luck and long live to Hercules!

Abso-fricking-lutely!  :)))

mcisho commented 3 months ago

For my part, I believe that my contribution to this project has come to an end.

That's a pity, I thought you were doing a great job.

... I find everything related to the discussion/design very difficult.

Don't worry, you're not alone there.

JamesWekel commented 3 months ago

mcisho

As part of pull request [https://github.com/SDL-Hercules-390/hyperion/pull/661], I have enabled the following features in feat900.h:

#define FEATURE_134_ZVECTOR_PACK_DEC_FACILITY
#define FEATURE_135_ZVECTOR_ENH_FACILITY_1
#define FEATURE_148_VECTOR_ENH_FACILITY_2
#define FEATURE_152_VECT_PACKDEC_ENH_FACILITY
#define FEATURE_165_NNET_ASSIST_FACILITY
#define FEATURE_192_VECT_PACKDEC_ENH_2_FACILITY

as all/most of the E6 instructions are defined as part of or enhanced with these facilities. I suspect that is causing some of the Windows build problems, as you are referencing FEATURE_135_ZVECTOR_ENH_FACILITY_1.

Hope I haven't caused too many problems, but I wanted to get the basics in for the E6 instructions to minimize merge conflicts.

Jim

Fish-Git commented 3 months ago

FYI: James's changes to the sharedvfp branch have been merged.

JamesWekel commented 3 months ago

The z/vector E6 instructions, for example VECTOR FP CONVERT TO NNP, reference the NNP-Data-Type-1 format. The z/Architecture Principles of Operation, SA22-7832-13, page 26-1, states:

Neural Network Processing Data

The NEURAL NETWORK PROCESSOR ASSIST
instruction, as well as the related convert instructions
described in this chapter, perform operations on
model-dependent data types.

NNP-Data-Type-1 Format

NNP-data-type-1 format represents a 16-bit signed
floating-point number in a proprietary format with a
range and precision tailored toward neural-network
processing. Other models may use other data formats.

But the NNP-data-type-1 format is not described. Does anyone have additional reference information on the format? The closest that I've found is a DLFLOAT presentation: https://pdfs.semanticscholar.org/5359/1b203af986668ca6586f80d30257d3ee52d7.pdf

Thanks, Jim

Fish-Git commented 3 months ago

Does anyone have additional reference information on the format?

I'm not aware of any, no. But then I haven't tried looking for it either.

The closest that I've found is a DLFLOAT presentation: https://pdfs.semanticscholar.org/5359/1b203af986668ca6586f80d30257d3ee52d7.pdf

THAT looks to me like that's probably it! Great find, James! I say go with it!
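If NNP-data-type-1 really does match the DLFloat layout, unpacking it would look roughly like the sketch below. This rests entirely on assumptions: a 1-bit sign, 6-bit exponent with bias 31 and 9-bit fraction as I understand the DLFloat presentation, with no confirmation from the Principles of Operation. Special encodings such as NaN/infinity are deliberately not handled, and all names are hypothetical.

```c
#include <stdint.h>

typedef struct { unsigned sign, exp, frac; } dlfloat_fields_sketch;

/* Split a 16-bit value into the ASSUMED 1/6/9 sign/exponent/fraction fields. */
dlfloat_fields_sketch dlfloat_split_sketch(uint16_t x)
{
    dlfloat_fields_sketch f;

    f.sign = (x >> 15) & 0x1;
    f.exp  = (x >> 9)  & 0x3F;
    f.frac =  x        & 0x1FF;
    return f;
}

/* Numeric value for ordinary numbers under the assumed format:
 * (-1)^sign * 2^(exp - 31) * (1 + frac/512). Special encodings
 * other than zero are NOT recognized by this sketch. */
double dlfloat_value_sketch(uint16_t x)
{
    dlfloat_fields_sketch f = dlfloat_split_sketch(x);
    double v;
    int e;

    if ((x & 0x7FFF) == 0)
        return f.sign ? -0.0 : 0.0;

    v = 1.0 + (double)f.frac / 512.0;
    e = (int)f.exp - 31;                 /* assumed exponent bias of 31 */
    for (; e > 0; e--) v *= 2.0;
    for (; e < 0; e++) v *= 0.5;
    return f.sign ? -v : v;
}
```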

JamesWekel commented 1 month ago

Fish,

I've coded initial versions of the five E6 vector "neural network processing assist" instructions (VCNF, ...). As part of this implementation, I use SoftFloat f32_to_f16 and f16_to_f32 routines. But when I do a 'make', I received:

 CCLD     hercules
/usr/bin/ld: ./.libs/libherc.so: undefined reference to `f32_to_f16'
/usr/bin/ld: ./.libs/libherc.so: undefined reference to `f16_to_f32'

Whoa... Got to be my problem! Yep, the routines are in the source, the softfloat.h has prototypes... It took me quite a while to determine that the Hercules softfloat libraries only contain routines used by Hercules!

Why these routines? These vector instructions convert to/from Tiny (F16) Binary Floats.
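For readers unfamiliar with the 16-bit format, the widening direction is simple enough to sketch at the bit level. This is not the SoftFloat implementation and the function name is hypothetical; it assumes the standard IEEE binary16 layout (1 sign, 5 exponent, 10 fraction bits) and, unlike SoftFloat, it does not quiet signaling NaNs or raise any exception flags.

```c
#include <stdint.h>

/* Widen an IEEE binary16 bit pattern to the equivalent binary32 pattern. */
uint32_t f16_bits_to_f32_bits_sketch(uint16_t h)
{
    uint32_t sign = (uint32_t)(h & 0x8000u) << 16;
    int      exp  = (h >> 10) & 0x1F;
    uint32_t frac = h & 0x03FFu;

    if (exp == 0x1F)                        /* infinity or NaN */
        return sign | 0x7F800000u | (frac << 13);

    if (exp == 0) {
        if (frac == 0)
            return sign;                    /* signed zero */

        /* Subnormal: shift the fraction up until the implicit one
         * appears, lowering the exponent once per shift. */
        exp = 1;
        while ((frac & 0x0400u) == 0) {
            frac <<= 1;
            exp--;
        }
        frac &= 0x03FFu;                    /* drop the implicit one */
    }

    /* Re-bias the exponent (127 - 15 = 112) and widen the fraction. */
    return sign | ((uint32_t)(exp + 112) << 23) | (frac << 13);
}
```

The narrowing direction (f32_to_f16) is the harder half, since it needs rounding and overflow/underflow handling, which is a good reason to rely on the SoftFloat routines rather than hand-written code.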

I would appreciate if the softfloat libraries could be refreshed to include f32_to_f16 and f16_to_f32 routines.

Thanks, Jim

Fish-Git commented 1 month ago

I would appreciate if the softfloat libraries could be refreshed to include f32_to_f16 and f16_to_f32 routines.

10-4. I'll get right on it.

Can you provide for me your Hercules changes in the form of a patch, so I can test my softfloat changes before actually committing them?

That is to say, I'd like to try building Hercules with your changes for myself, so I can see (recreate) your reported link error, and then temporarily make my softfloat changes and then rebuild Hercules (with your changes again), to verify that the problem is now fixed.

Then I can commit my changes with absolute confidence to the softfloat repository.

Thanks.

JamesWekel commented 1 month ago

Fish,

I'll post a patch tomorrow. I'm in the middle of moving the zvector instructions to a new file nnpa.c which will include NNPA: Function Code 0: NNPA-QAF (Query Available Functions). All the NNPA stuff will then be in one source file.

Jim

JamesWekel commented 1 month ago

Fish,

As requested, here is a patch with my current nnpa.c with associated updates to 15 files.

As always, comments / suggestions are appreciated.

Jim

Fish-Git commented 1 month ago

As requested, here is a patch...

Thanks. I'm on it!

It looks like this "simple" change is going to take me longer than originally expected though. My first attempt to just move the f32_to_f16 and f16_to_f32 functions into the "hercsource" directory (and update sources.txt appropriately, of course) failed with yet more unresolved link errors:

softfloat_normSubnormalF16Sig  referenced in function f16_to_f32
softfloat_f16UIToCommonNaN     referenced in function f16_to_f32
softfloat_roundPackToF16       referenced in function f32_to_f16
softfloat_commonNaNToF16UI     referenced in function f32_to_f16

So now I'm going to have to do the same thing for the source files containing those functions too. I'm hoping this "simple" change doesn't end up snowballing into some huge complicated mess!

In any case, I'll let you know when I eventually have something for you to test with.

Fish-Git commented 1 month ago

SoftFloat fix committed!

"Fix for GitHub z/Arch Issue #650" Commit: c114c53e672d92671e0971cfbf8fe2bed3d5ae9e

Tested on both Windows and Linux (with your nnpa.patch applied): Both now build cleanly! (whereas before they got "unresolved" errors).

You should now be good to go!

Fish-Git commented 1 month ago

NOTE:

You will of course need to git update your SoftFloat external package repo and rebuild it in order for your libs directory to get updated with the new softfloat libs, so that Hercules links correctly. You know how to do that, yes? You just use the extpkgs script to either "update" or re-"clone" package "s" (i.e. softfloat). Enter "extpkgs /?" (or "extpkgs.sh --help") for more information.

Or you can simply use Bill's Hercules Helper, of course.

JamesWekel commented 1 month ago

Fish,

Thank you for the SoftFloat update.

I'm currently just using the SoftFloat X64 libraries that are part of the 'develop' branch. I have built the external packages, but it has been a while.

Just to be clear, the 'develop' branch does not have updated SoftFloat libraries. Once I commit the nnpa code, everyone doing X86-64 development on the 'develop' branch will have to update their SoftFloat libraries with a new version from https://github.com/SDL-Hercules-390/SoftFloat.git. Is this going to cause some confusion/downstream support issues?

Jim

Fish-Git commented 1 month ago

I'm currently just using the SoftFloat X64 libraries that are part of the 'develop' branch.

Then you should be okay. The last commit I made was to update those lib files. So for Windows, you should be okay, as well as any x86 Linux user that is able to use the Herc libs.

It's just for some Linux users that might have to update and rebuild their softfloat repo/libs if they're unable to use the ones that come with Herc, such as those who have a non-x86 system (such as ARM for example).

Make sense?

Fish-Git commented 1 month ago

Just to be clear, the 'develop' branch does not have updated SoftFloat libraries.

That was true, but is now no longer true as of a couple hours ago, since, as I said, Herc's libraries have since been updated:

Fish-Git commented 1 month ago

Once I commit the nnpa code, everyone doing X86-64 development on the 'develop' branch will have to update their SoftFloat libraries with a new version from https://github.com/SDL-Hercules-390/SoftFloat.git.

ONLY if they're running on non-x86 hardware (or otherwise are unable to use the libs that come with Hercules).

Is this going to cause some confusion/downstream support issues?

Possibly.

If they build Herc themselves the hard way, then yes, they will have to update and rebuild their softfloat external package libraries.

If they build Herc using Bill's Hercules Helper however, then probably not. I believe Bill's Hercules Helper builds Hercules just fine for most all non-x86 systems. @wrljet Bill? Is that true? Does your script always refresh (git pull) for all of the external package repos each time? (and rebuild them if they've changed?)

But if they, like you, simply link with the libs that come delivered with Hercules, then no, they should be unaffected.

wrljet commented 1 month ago

Fish,

Hercules-Helper rebuilds the extpkgs from source, with a fresh git clone, on all systems except Windows. (unless --no-clone option is used)

Bill

JamesWekel commented 1 month ago

Fish,

Thank you. Thank you for your last commit to update the SoftFloat libraries! My nnpa.c code compiles and links. Now to work on some tests.

Whenever I've used hercules-helper to install hercules on my Raspberry PI 5, all the external packages are built.

Jim

mfsysprog commented 2 weeks ago

The z/vector E6 instructions, for example VECTOR FP CONVERT TO NNP, reference the NNP-Data-Type-1 format. The z/Architecture Principles of Operation, SA22-7832-13, page 26-1, states:

Neural Network Processing Data

The NEURAL NETWORK PROCESSOR ASSIST
instruction, as well as the related convert instructions
described in this chapter, perform operations on
model-dependent data types.

NNP-Data-Type-1 Format

NNP-data-type-1 format represents a 16-bit signed
floating-point number in a proprietary format with a
range and precision tailored toward neural-network
processing. Other models may use other data formats.

But the NNP-data-type-1 format is not described. Does anyone have additional reference information on the format? The closest that I've found is a DLFLOAT presentation: https://pdfs.semanticscholar.org/5359/1b203af986668ca6586f80d30257d3ee52d7.pdf

Thanks, Jim

I came across this patent from IBM that describes the whole workings of the neural networks assist processing. It seems it also explains the NNP-data-type-1.

https://patents.justia.com/patent/11669331

Edit: This links to a pdf version that also has the images: https://image-ppubs.uspto.gov/dirsearch-public/print/downloadPdf/11669331