Add support for C99 %a and %A printf format support.

joncampbell123 commented 10 months ago

This adds support for printing floating point values in hexadecimal format using %a and %A. The code is written to mimic how GNU glibc does it.

Feel free to clean it up as needed.

joncampbell123 commented 10 months ago

This is to help resolve issue https://github.com/open-watcom/open-watcom-v2/issues/1066

joncampbell123 commented 10 months ago

Please see this to confirm what I am doing with the 64-bit double floating point type: https://en.wikipedia.org/wiki/Double-precision_floating-point_format
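For readers following along, here is a minimal sketch of how the three fields of a 64-bit double can be pulled apart with integer operations only. It assumes `double` is IEEE 754 binary64 and that a 64-bit integer type is available; `struct double_fields` and `split_double` are hypothetical names for illustration, not code from this pull request.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical container for the three binary64 fields. */
struct double_fields {
    unsigned sign;      /* 1 bit */
    unsigned exponent;  /* 11 bits, biased by 1023 */
    uint64_t mantissa;  /* 52 stored bits; the leading 1 is implied */
};

/* Assumption: double is IEEE 754 binary64 (8 bytes). */
static struct double_fields split_double(double value)
{
    struct double_fields f;
    uint64_t bits;

    assert(sizeof value == sizeof bits);
    /* memcpy reinterprets the bytes without aliasing problems */
    memcpy(&bits, &value, sizeof bits);

    f.sign     = (unsigned)(bits >> 63);
    f.exponent = (unsigned)((bits >> 52) & 0x7FF);
    f.mantissa = bits & UINT64_C(0x000FFFFFFFFFFFFF);
    return f;
}
```

For 1.5 (bit pattern 0x3ff8000000000000) this gives sign 0, exponent 0x3ff, and mantissa 0x8000000000000.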

jmalak commented 10 months ago

This needs to be extended to scanf and to 80-bit long double. Right now it only covers printf with double and float.

There is an attribute, SF_LONG_DOUBLE, which handles the 80-bit long double format.

NOTE: this must be part of the math library so that FP code is not loaded until floating-point data is used. It should be moved to EFG_printf in the math library and extended for the %a, %la, %A and %LA formats. scanf is another story: it looks like the long double support is completely missing there. That will need a review, and long double support will have to be added first.

I looked into the math library, and it looks like reading the hexadecimal FP format is already implemented in strtod.c to support C99. In any case, it will all need a review.

joncampbell123 commented 10 months ago

> This needs to be extended to scanf and to 80-bit long double. Right now it only covers printf with double and float.
>
> There is an attribute, SF_LONG_DOUBLE, which handles the 80-bit long double format.
>
> NOTE: this must be part of the math library so that FP code is not loaded until floating-point data is used. It should be moved to EFG_printf in the math library and extended for the %a, %la, %A and %LA formats. scanf is another story: it looks like the long double support is completely missing there. That will need a review, and long double support will have to be added first.
>
> I looked into the math library, and it looks like reading the hexadecimal FP format is already implemented in strtod.c to support C99. In any case, it will all need a review.

The issue does not mention %la, only %a, which is what this pull request is meant to resolve. Also, my version does not need the math library :) I figured other variations could be added later; this is just the first step.

Handle this code however you think is best.

joncampbell123 commented 10 months ago

@jmalak Should I revise my pull request here with additional code changes or are you going to handle it?

I think the best way to fulfill the request is to add %a support in whatever manner you think best and then worry about %la later. Since %a prints out the various fields of the float without needing to convert to decimal, it really is possible to implement it without the full floating point or math library, and %la can do the same given the Intel 80-bit long double format. Or perhaps it would be appropriate to put it into the math library anyway?
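To illustrate the point that %a only needs the raw fields, here is a rough sketch of a formatter that uses nothing but integer operations. It is not the code from this pull request: `format_hex_double` is a hypothetical helper, it assumes IEEE 754 binary64, and it only handles zero and normal numbers (no subnormals, infinities, NaN, precision, or field width).

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: glibc-style %a / %A text for a double.
   Returns the snprintf result, or -1 for values this sketch skips. */
static int format_hex_double(char *buf, size_t size, double value, int upper)
{
    const char *digits = upper ? "0123456789ABCDEF" : "0123456789abcdef";
    char frac[14];              /* 13 hex digits + NUL */
    uint64_t bits, mant;
    int exp, lead, nfrac, i;

    assert(buf != NULL && size > 0);
    memcpy(&bits, &value, sizeof bits);
    mant = bits & UINT64_C(0x000FFFFFFFFFFFFF);
    exp  = (int)((bits >> 52) & 0x7FF);

    if (exp == 0x7FF || (exp == 0 && mant != 0))
        return -1;              /* inf/NaN/subnormal: not handled here */
    if (exp == 0) {
        lead = 0;               /* zero prints as 0x0p+0 */
    } else {
        lead = 1;               /* the implied leading bit */
        exp -= 1023;            /* remove the exponent bias */
    }

    /* 52 mantissa bits are exactly 13 hex digits */
    for (i = 0; i < 13; i++)
        frac[i] = digits[(mant >> (48 - 4 * i)) & 0xF];
    nfrac = 13;
    while (nfrac > 0 && frac[nfrac - 1] == '0')
        nfrac--;                /* drop trailing zero digits */
    frac[nfrac] = '\0';

    return snprintf(buf, size, "%s%s%d%s%s%c%+d",
                    (bits >> 63) ? "-" : "", upper ? "0X" : "0x", lead,
                    nfrac ? "." : "", frac, upper ? 'P' : 'p', exp);
}
```

With this sketch, 1.5 formats as `0x1.8p+0` and -4.0 as `-0x1p+2`.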

jmalak commented 10 months ago

The request is general and was made without knowledge of the implementation, so what exactly was requested is not important. There are still a few C99 features missing that are more important than the %a/%A format specifiers. I am working on C99 compound literals and designated initializers, which are the most important, but it is going slowly; it is mainly a matter of studying the compiler code. The C compiler is a hand-crafted parser, so there are many special-case constructs, etc.

Ensuring a correct C99 %a feature in the OW C run-time library means implementing %a/%A and %La/%LA. Sorry, %la/%lA was wrong; those are not valid format specifiers. Support in scanf is not urgent; it can be done later, together with implementing long double in scanf. I think strtod supports C99 hexadecimal FP constants, which can be used as a workaround if needed.

If you have some time, it would help if you prepared processing of 80-bit long double alongside the existing 64-bit version you created. I will integrate it into the math library later and make the changes necessary for integration with the C run-time library; there is an internal CRTL flag for 80-bit or 64-bit FP support.
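On the strtod workaround mentioned above: C99 specifies that strtod accepts hexadecimal floating-point strings, so reading a %a-style value back could look roughly like the sketch below. `parse_hex_float` is a hypothetical wrapper, and whether the Open Watcom strtod handles every case is exactly what still needs review.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical wrapper around C99 strtod.
   Returns 1 and stores the value only if the whole string was consumed. */
static int parse_hex_float(const char *text, double *out)
{
    char *end;
    double value;

    assert(text != NULL && out != NULL);
    value = strtod(text, &end);
    if (end == text || *end != '\0')
        return 0;   /* nothing parsed, or trailing garbage */
    *out = value;
    return 1;
}
```

On a C99-conforming library, `parse_hex_float("0x1.8p+1", &d)` should succeed and leave 3.0 in `d`.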

joncampbell123 commented 10 months ago

@jmalak I can't even get printf() to accept any long double parameters. They always seem to come in as type double!

See DOSLIB test program: https://github.com/joncampbell123/doslib/blob/master/hw/dos/testprna.c

GCC 9.3 output, x86_64, GNU GLIBC printf():

0x3ff0000000000000     f=1.000000 a=0x1p+0 A=0X1P+0
0x3fff8000000000000000 Lf=1.000000 La=0x8p-3 LA=0X8P-3
0x3ff8000000000000     f=1.500000 a=0x1.8p+0 A=0X1.8P+0
0x3fffc000000000000000 Lf=1.500000 La=0xcp-3 LA=0XCP-3
0x3ffc000000000000     f=1.750000 a=0x1.cp+0 A=0X1.CP+0
0x3fffe000000000000000 Lf=1.750000 La=0xep-3 LA=0XEP-3
0x4000000000000000     f=2.000000 a=0x1p+1 A=0X1P+1
0x40008000000000000000 Lf=2.000000 La=0x8p-2 LA=0X8P-2
0x4008000000000000     f=3.000000 a=0x1.8p+1 A=0X1.8P+1
0x4000c000000000000000 Lf=3.000000 La=0xcp-2 LA=0XCP-2
0x4010000000000000     f=4.000000 a=0x1p+2 A=0X1P+2
0x40018000000000000000 Lf=4.000000 La=0x8p-1 LA=0X8P-1
0x0000000000000000     f=0.000000 a=0x0p+0 A=0X0P+0
0x00000000000000000000 Lf=0.000000 La=0x0p+0 LA=0X0P+0
0xbff0000000000000     f=-1.000000 a=-0x1p+0 A=-0X1P+0
0xbfff8000000000000000 Lf=-1.000000 La=-0x8p-3 LA=-0X8P-3
0xbff8000000000000     f=-1.500000 a=-0x1.8p+0 A=-0X1.8P+0
0xbfffc000000000000000 Lf=-1.500000 La=-0xcp-3 LA=-0XCP-3
0xbffc000000000000     f=-1.750000 a=-0x1.cp+0 A=-0X1.CP+0
0xbfffe000000000000000 Lf=-1.750000 La=-0xep-3 LA=-0XEP-3
0xc000000000000000     f=-2.000000 a=-0x1p+1 A=-0X1P+1
0xc0008000000000000000 Lf=-2.000000 La=-0x8p-2 LA=-0X8P-2
0xc008000000000000     f=-3.000000 a=-0x1.8p+1 A=-0X1.8P+1
0xc000c000000000000000 Lf=-3.000000 La=-0xcp-2 LA=-0XCP-2
0xc010000000000000     f=-4.000000 a=-0x1p+2 A=-0X1P+2
0xc0018000000000000000 Lf=-4.000000 La=-0x8p-1 LA=-0X8P-1

Open Watcom compiled DOS program with this %a implementation (result is the same regardless of 16-bit, 32-bit):

0x070c3ff0000000000000 f=1.000000 a=0x1p+0 A=0X1P+0
0x3ff0000000000000     Lf=1.000000 La=0x1p+0 LA=0X1P+0
0x070c3ff8000000000000 f=1.500000 a=0x1.8p+0 A=0X1.8P+0
0x3ff8000000000000     Lf=1.500000 La=0x1.8p+0 LA=0X1.8P+0
0x070c3ffc000000000000 f=1.750000 a=0x1.cp+0 A=0X1.CP+0
0x3ffc000000000000     Lf=1.750000 La=0x1.cp+0 LA=0X1.CP+0
0x070c4000000000000000 f=2.000000 a=0x1p+1 A=0X1P+1
0x4000000000000000     Lf=2.000000 La=0x1p+1 LA=0X1P+1
0x070c4008000000000000 f=3.000000 a=0x1.8p+1 A=0X1.8P+1
0x4008000000000000     Lf=3.000000 La=0x1.8p+1 LA=0X1.8P+1
0x070c4010000000000000 f=4.000000 a=0x1p+2 A=0X1P+2
0x4010000000000000     Lf=4.000000 La=0x1p+2 LA=0X1P+2
0x070c0000000000000000 f=0.000000 a=0x0p+0 A=0X0P+0
0x0000000000000000     Lf=0.000000 La=0x0p+0 LA=0X0P+0
0x070cbff0000000000000 f=-1.000000 a=-0x1p+0 A=-0X1P+0
0xbff0000000000000     Lf=-1.000000 La=-0x1p+0 LA=-0X1P+0
0x070cbff8000000000000 f=-1.500000 a=-0x1.8p+0 A=-0X1.8P+0
0xbff8000000000000     Lf=-1.500000 La=-0x1.8p+0 LA=-0X1.8P+0
0x070cbffc000000000000 f=-1.750000 a=-0x1.cp+0 A=-0X1.CP+0
0xbffc000000000000     Lf=-1.750000 La=-0x1.cp+0 LA=-0X1.CP+0
0x070cc000000000000000 f=-2.000000 a=-0x1p+1 A=-0X1P+1
0xc000000000000000     Lf=-2.000000 La=-0x1p+1 LA=-0X1P+1
0x070cc008000000000000 f=-3.000000 a=-0x1.8p+1 A=-0X1.8P+1
0xc008000000000000     Lf=-3.000000 La=-0x1.8p+1 LA=-0X1.8P+1
0x070cc010000000000000 f=-4.000000 a=-0x1p+2 A=-0X1P+2
0xc010000000000000     Lf=-4.000000 La=-0x1p+2 LA=-0X1P+2

If the call really were passing a long double to printf() in OW, the second result of each pair would, given the current %a implementation, be bogus output from interpreting an 80-bit long double as a 64-bit double.

So as it stands, it seems there's no point at this time in worrying about %La vs %a because at least as the test program shows, they're the same datatype.
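A quick way to check this from inside a program, independent of printf, is to compare the significand widths from `<float.h>`. This is only a sketch; `long_double_extra_bits` is a hypothetical name.

```c
#include <assert.h>
#include <float.h>

/* Hypothetical helper: how many more significand bits long double has
   than double. 0 means the two types have the same precision. */
static int long_double_extra_bits(void)
{
    /* C guarantees long double is at least as precise as double */
    assert(LDBL_MANT_DIG >= DBL_MANT_DIG);
    return LDBL_MANT_DIG - DBL_MANT_DIG;
}
```

A result of 0 would match what the test program shows for Open Watcom's default setup, while an x87 80-bit long double (64-bit significand against 53) would give 11.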

Note the GCC GLIBC output makes sense in that the 80-bit format has the explicit leading bit that the 32/64-bit floats omit as the "implied bit". So it's showing the explicit bit as 0x8, with the exponent lowered by 3 to compensate.
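As a sketch of how that 80-bit output can be reproduced from the raw fields: the top four bits of the 64-bit significand (which include the explicit integer bit) become the leading hex digit, and the exponent is reduced by 3 because that bit then has weight 8. `format_hex_ld80` is a hypothetical helper that takes the fields as integers so it does not depend on the compiler's long double; it skips subnormals, unnormals, infinities, and NaN.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper. sign_exp: sign bit + 15-bit exponent (bias 16383).
   mant: 64-bit significand with the explicit integer bit on top. */
static int format_hex_ld80(char *buf, size_t size,
                           uint16_t sign_exp, uint64_t mant)
{
    char frac[16];                          /* 15 hex digits + NUL */
    size_t nfrac;
    int exp = sign_exp & 0x7FFF;
    unsigned lead = (unsigned)(mant >> 60); /* includes the explicit bit */

    assert(buf != NULL && size > 0);
    if (exp == 0x7FFF || (exp == 0 && mant != 0))
        return -1;                          /* inf/NaN/subnormal: skipped */
    if (exp != 0)
        exp = exp - 16383 - 3;              /* unbias, then shift for 0x8 */

    /* the remaining 60 bits are 15 hex digits */
    snprintf(frac, sizeof frac, "%015llx",
             (unsigned long long)(mant & UINT64_C(0x0FFFFFFFFFFFFFFF)));
    nfrac = strlen(frac);
    while (nfrac > 0 && frac[nfrac - 1] == '0')
        frac[--nfrac] = '\0';               /* drop trailing zero digits */

    return snprintf(buf, size, "%s0x%x%s%sp%+d",
                    (sign_exp & 0x8000) ? "-" : "", lead,
                    nfrac ? "." : "", frac, exp);
}
```

Feeding it 0x3fff and 0x8000000000000000 (the 1.0 row above) gives `0x8p-3`.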

jmalak commented 10 months ago

By default, long double is the same as 64-bit double. Try the -fld compiler option.

joncampbell123 commented 10 months ago

> By default, long double is the same as 64-bit double. Try the -fld compiler option.

First, why doesn't that appear when I run the compiler and it prints out all the options?

Second... "Error! E1119: Internal compiler error 136"? :raised_eyebrow:

%write tmp.cmd -fr=nul -fo=dos86c/.obj -i.. -i"../.." -e=2 -zu -zq -mc -d0 -s -bt=dos -oilrb -os -wx -0 -dTARGET_MSDOS=16 -dMSDOS=1 -dTARGET86=86 -DMMODE=c -q -x -i"/usr/src/open-watcom-v2-upstream/rel/h" -fld testprna.c
testprna.c(35): Error! E1119: Internal compiler error 136
Segmentation fault
Error(E42): Last command making (dos86c/testprna.obj) returned a bad status
Error(E02): Make execution terminated

joncampbell123 commented 10 months ago

That "Zoiks" (heh, someone likes Scooby Doo) is in ./cg/intel/i86/c/i86splt2.c line 357. It's the default case for N_CONSTANT because there isn't one for long double. It needs a case statement for "FL" (long double) below "FD" (double).

joncampbell123 commented 10 months ago

The segfault according to "WD" is in insutil.c line 134 in function FindNameConf(). Parameter "name" is NULL.

joncampbell123 commented 10 months ago

The 32-bit compiler wcc386 doesn't throw any obvious errors but then any function call that passes a long double as a parameter causes the program to crash after that point, whether or not that function does anything with it. At least when compiled as a 32-bit DOS program.

joncampbell123 commented 10 months ago

So to answer my own question, -fld isn't documented because 80-bit "long double" support within the compiler is totally broken. Right? :)

joncampbell123 commented 10 months ago

Even if I just pass the long double constant to printf() directly it still triggers that error (16-bit wcc) or causes the program to crash (32-bit wcc386). In the compiler's current state it is impossible to pass a long double to printf() and therefore it is pointless to try to implement %La or %LA any differently from %a and %A at this time.

jmalak commented 10 months ago

No, the CRTL must be ready to use 80-bit LD. The CRTL supports 80-bit LD even if the compiler does not yet.

The same goes for moving the code to the math library. If it is in the C library, it makes the standard part bigger even when double or long double is not used in the application. On 16-bit it is critical to keep the library code as small as possible when float support is not needed. That is the reason to move it into the math library. As soon as you use an FP number, you will be using the math library to process that variable somehow, so this code will be loaded anyway.

Anyway, 80-bit support is not broken; it is just not finished yet. The -fld option is for use with 80-bit LD once the CG is ready. I will review your code and put it into the math library to save space in the CRTL (mainly for 16-bit). I will add code for 80-bit LD %La and %LA; even if applications will not use it yet, it can be used for testing CRTL 80-bit LD, and it will give complete C99-compliant support of %La and %LA for the long double type, whether long double is 64-bit or 80-bit.
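The size argument can be pictured with a generic hook pattern: the core formatter only holds a function pointer, and the floating-point code is attached when FP support is present. This is an illustrative sketch only. Every name in it is hypothetical, and it is not the actual Open Watcom EFG_printf mechanism.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical hook type: formats the double that arg points to. */
typedef int (*hex_float_hook)(char *buf, size_t size, const void *arg);

/* "Core library" side: stays NULL until FP support is attached. */
static hex_float_hook g_hex_float_hook = NULL;

/* Core printf path: only an indirect call, no FP code referenced. */
static int core_print_a(char *buf, size_t size, const void *arg)
{
    assert(buf != NULL && size > 0);
    if (g_hex_float_hook == NULL)
        return -1;              /* FP formatting not available */
    return g_hex_float_hook(buf, size, arg);
}

/* "Math library" side: the real formatter would live here.
   The host's %a is used as a stand-in for this sketch. */
static int math_format_a(char *buf, size_t size, const void *arg)
{
    return snprintf(buf, size, "%a", *(const double *)arg);
}

/* Called when FP support is pulled into the program. */
static void math_library_init(void)
{
    g_hex_float_hook = math_format_a;
}
```

Until `math_library_init()` runs, `core_print_a` reports failure and none of the formatting code needs to be present in the core path.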

jmalak commented 9 months ago

I reimplemented it in the math library, including long double support.

open-watcom / open-watcom-v2

Add support for C99 %a and %A printf format support. #1198