Closed scottshotgg closed 4 years ago
Hi Scott,
Thanks for the detailed report!
The representation of floating-point numbers is not yet up to par with LLVM; so in that sense it is expected behaviour. It definitely should not be expected, and leading up to a 1.0 release of llir/llvm this has to be resolved. So, good that we now have an issue to refer to, for the implementation of floating-point values.
Any help implementing support for this would be greatly appreciated. We've given it a few attempts, but so far only cover a subset of the valid representations.
The most relevant code is currently located in https://github.com/llir/llvm/tree/master/internal/floats and could be expanded, more thoroughly tested, and eventually made complete for float80, float128, float16, etc.
The current implementation of floating-point constants is riddled with TODO notes, see for instance https://github.com/llir/llvm/blob/master/ir/constant/float.go#L50
case strings.HasPrefix(s, "0xK"):
// TODO: Implement support for the 0xK floating-point representation format.
For your specific use case of float values, see https://github.com/llir/llvm/blob/master/ir/constant/float.go#L121
I wish we were in a better state handling floating-point values, but the work has just not yet been done. So, for this we warmly invite you to help take a stab at it :)
I'd be glad to help you get acquainted with the code base if you'd like.
The end goal is to support encoding and decoding of these floating-point types (including their hexadecimal representations):
half: 16-bit floating point type
float: 32-bit floating point type
double: 64-bit floating point type
fp128: 128-bit floating point type (112-bit mantissa)
x86_fp80: 80-bit floating point type (x87)
ppc_fp128: 128-bit floating point type (two 64-bits, PowerPC)
Cheerful regards from a sunny Sweden, Robin
P.S. as a small side note, and as outlined in issue #29 for the upcoming release, we've been working recently to have a grammar that covers the entire LLVM IR language. As outlined, the repo https://github.com/mewmew/l has been used during the experimental phase and its grammar will be merged back into the llir/llvm repo once the code base matures. Some more work on float80 has been done at https://github.com/mewmew/l/tree/master/internal/float80 but I don't think it's complete. So you can simply take a look, and then use the resources that you find to come up with a more complete implementation.
I'd be happy to contribute to the community/project. Do you have an irc, discord, slack, etc that you are using to coordinate efforts?
I'd be happy to contribute to the community/project. Do you have an irc, discord, slack, etc that you are using to coordinate efforts?
That's lovely to hear!
Feel free to join us on Gitter at https://gitter.im/llir/Lobby (you can sign in on Gitter using your GitHub account, so no need to create a new one)
P.S. I'm in the midst of exams at Uni, so may take a few days to reply. Very happy to help out and get you up to speed, just so you know! :)
Cheers, /u
Hi @scottshotgg!
Just checking up. Did you get a chance to play around with floating-point representations for LLVM IR?
Also, I sent you an invite to the Gitter chat at https://gitter.im/llir/Lobby. Feel free to join.
Cheers, /u
I may play a bit with this: https://github.com/mewmew/floats
Let me know if you already started working on it @scottshotgg and we can sync our progress.
Hey @mewmew,
Sorry for the late reply. Unfortunately, for now I have decided to forgo trying to generate LLVM IR in Go. I will instead transpile my language (https://github.com/scottshotgg/Express/tree/rearch) to C++ and invoke Clang from there to do the translation to a binary. With that being said, I am always up to help and this project seems very interesting. I'd also like to get your take on some language design choices and usage of LLVM, garbage collection, etc. Hit me up on Discord under the same name and we can talk a bit more. I'd like to hear what other things you're interested in.
Peace
@scottshotgg Just a heads up, as of rev 468cda76b3a8274743311eb7736703262b88950b, proper support for hexadecimal floating-point constants has been implemented.
As an example, the LLVM IR module:
define void @f() {
%1 = alloca float
store float 0x3FF19999A0000000, float* %1
ret void
}
was generated by the following program:
package main
import (
"fmt"
"github.com/llir/llvm/ir"
"github.com/llir/llvm/ir/constant"
"github.com/llir/llvm/ir/types"
)
func main() {
m := ir.NewModule()
f := m.NewFunction("f", types.Void)
entry := f.NewBlock("")
dst := entry.NewAlloca(types.Float)
src := constant.NewFloat(types.Float, 1.1)
entry.NewStore(src, dst)
entry.NewRet(nil)
// Output module in LLVM IR syntax.
fmt.Println(m)
}
Cheers, Robin
Edit: note, support is currently added for float, double and x86_fp80. Support is yet to be added for half, fp128 and ppc_fp128. The list in https://github.com/llir/llvm/issues/31#issuecomment-393438232 has been updated to reflect this.
Edit2: half is now supported.
Anyone wishing to contribute to add support for the remaining two floating-point representations, consider yourself warmly invited to :) We would like some help to complete these implementations.
The decoding and encoding of these floating-point representations are implemented in a dedicated repository, as this functionality may be useful outside of the LLVM library.
The two floating-point representations remaining are:
fp128: 128-bit floating point type (112-bit mantissa) (preliminary implementation in binary128 package)
ppc_fp128: 128-bit floating point type (two 64-bits, PowerPC) (preliminary implementation in float128ppc package)
Feel free to ask us anything to get up to speed.
Cheers, Robin
Edit: note, the binary16 and the float80x86 packages may be used for reference.
@scottshotgg I know you are playing with other things these days. Let me know if you'd like to chat some day and bounce a few ideas. I have Discord these days, so may be a bit easier to reach.
Hey @mewmew,
(my cat closed the issue :p)
Nice to hear this is being worked again. Although I do have some other development going on, I definitely am interested in helping and talking/designing. I had switched to just generating C/C++ code for my effective LLVM generation as I didn't really need to get too specific with that yet. Hit me up on discord though, same name, will definitely be easier to talk that way.
(my cat closed the issue :p)
Haha, that's lovely <3
Hit me up on discord though, same name, will definitely be easier to talk that way.
Will do!
I tried to use @scottshotgg as the user name, but could not add you, as a DiscordTag with a #0000 suffix was needed for the username. Would you mind sending me your tag?
Ah, my last four are #2053, forgot about that
Ah, my last four are #2053, forgot about that
I added you on Discord. Would be happy to chat another day. Now it's starting to get quite late. Sleep well :)
Support for the 0xL and 0xM hexadecimal floating-point representations has not yet been implemented.
@dannypsnl, would you like to take a look at this? :)
To try, use e.g.
constant.NewFloatFromString("0xL00000000000000000000000000000000")
constant.NewFloatFromString("0xL00000000000000007FFF000000000000")
constant.NewFloatFromString("0xM00000000000000000000000000000000")
constant.NewFloatFromString("0xM400F000000000000BCB0000000000000")
I have summarized the 0xL and 0xM prefixes from test cases below:
Sure, I would take a look later. :)
On Sat, Dec 7, 2019 at 10:38 AM Robin Eklind notifications@github.com wrote:
Support for the 0xL and 0xM hexadecimal floating-point representations have not yet been implemented.
@dannypsnl https://github.com/dannypsnl, would you like to take a look at this? :)
To try, use e.g.
constant.NewFloatFromString("0xL00000000000000000000000000000000") constant.NewFloatFromString("0xL00000000000000007FFF000000000000")
constant.NewFloatFromString("0xM00000000000000000000000000000000") constant.NewFloatFromString("0xM400F000000000000BCB0000000000000")
I have summarized the 0xL and 0xM prefixes from test cases below:
0xL prefix
constant.NewFloatFromString("0xL0")
constant.NewFloatFromString("0xL00000000000000000000000000000000")
constant.NewFloatFromString("0xL00000000000000000001000000000000")
constant.NewFloatFromString("0xL00000000000000003FFF000000000000")
constant.NewFloatFromString("0xL00000000000000003fff000001000000")
constant.NewFloatFromString("0xL00000000000000003fff000002000000")
constant.NewFloatFromString("0xL00000000000000004000000000000000")
constant.NewFloatFromString("0xL00000000000000004001400000000000")
constant.NewFloatFromString("0xL00000000000000004004C00000000000")
constant.NewFloatFromString("0xL00000000000000004201000000000000")
constant.NewFloatFromString("0xL00000000000000005001000000000000")
constant.NewFloatFromString("0xL00000000000000007FFF000000000000")
constant.NewFloatFromString("0xL00000000000000007FFF800000000000")
constant.NewFloatFromString("0xL00000000000000008000000000000000")
constant.NewFloatFromString("0xL00000000000000013fff000000000000")
constant.NewFloatFromString("0xL00000000000000018000000000000000")
constant.NewFloatFromString("0xL000fffff00000000000fffff00000000")
constant.NewFloatFromString("0xL00ff00ff00ff00ff00ff00ff00ff00ff")
constant.NewFloatFromString("0xL01")
constant.NewFloatFromString("0xL08000000000000003fff000000000000")
constant.NewFloatFromString("0xL300000000000000040089CA8F5C28F5C")
constant.NewFloatFromString("0xL5000000000000000400E0C26324C8366")
constant.NewFloatFromString("0xL8000000000000000400A24E2E147AE14")
constant.NewFloatFromString("0xL999999999999999A3FFB999999999999")
constant.NewFloatFromString("0xLEB851EB851EB851F400091EB851EB851")
constant.NewFloatFromString("0xLF000000000000000400808AB851EB851")
constant.NewFloatFromString("0xLf8f8f8f8f8f8f8f8f8f8f8f8f8f8f8f8")
0xM prefix
constant.NewFloatFromString("0xM00000000000000000000000000000000")
constant.NewFloatFromString("0xM3DF00000000000000000000000000000")
constant.NewFloatFromString("0xM3FF00000000000000000000000000000")
constant.NewFloatFromString("0xM40000000000000000000000000000000")
constant.NewFloatFromString("0xM400C0000000000300000000010000000")
constant.NewFloatFromString("0xM400F000000000000BCB0000000000000")
constant.NewFloatFromString("0xM403B0000000000000000000000000000")
constant.NewFloatFromString("0xM405EDA5E353F7CEE0000000000000000")
constant.NewFloatFromString("0xM4093B400000000000000000000000000")
constant.NewFloatFromString("0xM41F00000000000000000000000000000")
constant.NewFloatFromString("0xM4D436562A0416DE00000000000000000")
constant.NewFloatFromString("0xM80000000000000000000000000000000")
constant.NewFloatFromString("0xM818F2887B9295809800000000032D000")
constant.NewFloatFromString("0xMC00547AE147AE1483CA47AE147AE147A")
@mewmew When I am adding parse code for 0xL:
// From https://llvm.org/docs/LangRef.html#simple-constants
// > The IEEE 128-bit format is represented by 0xL followed by 32 hexadecimal digits.
hex := s[len("0xL"):]
if len(hex) < 32 {
hex = strings.Repeat("0", 32-len(hex)) + hex
}
signAndExponent := hex[:16]
fraction := hex[16:]
a, err := strconv.ParseUint(signAndExponent, 16, 64)
if err != nil {
return nil, errors.WithStack(err)
}
b, err := strconv.ParseUint(fraction, 16, 64)
if err != nil {
return nil, errors.WithStack(err)
}
f := binary128.NewFromBits(a, b)
I found a few things that need your help:
github.com/mewmew/float/binary128 is for IEEE 754 128-bit format, but I can't find the Big() method under binary128.Float
Does 0xL00000000000000000000000000000001 equal 0xL01? That would mean I can simply pad strings shorter than 32 digits with leading zeros.
There are also two floating-point constants in hexadecimal notation which belong to this issue. One aspect that makes these test cases especially interesting is that their underlying hexadecimal literal is too large to be represented in 64 bits, which is why we currently see failures in strconv.ParseUint. ref: https://github.com/llir/llvm/issues/111#issuecomment-562471872 and https://github.com/llir/llvm/issues/111#issuecomment-562928305
From llvm/test/CodeGen/PowerPC/fast-isel-call.ll:
call void @double_foo(double 0x1397723CCABD0000401C666660000000)
From llvm/test/Transforms/InstCombine/fneg.ll:
%m = fmul <4 x double> %x, <double 42.0, double 0x7FF80000000000000, double 0x7FF0000000000000, double undef>
@dannypsnl, the parsing code looks good!
I found a few things that need your help:
github.com/mewmew/float/binary128 is for IEEE 754 128-bit format, but I can't find the Big() method under binary128.Float
Yeah. This is a known issue :) binary128 is not yet fully implemented: https://github.com/mewmew/float/issues/9. @alexpantyukhin and I have been working on the various floating-point representations, but they still need some love.
@dannypsnl if you want to try and help, then have a look at binary16 as a reference. It's complete and well tested. It also has the methods (e.g. Big()) that we want all floating-point implementations to have.
does 0xL00000000000000000000000000000001 equal to 0xL01? Which means I can simply add ignored 0 into the string that shorter than 32.
I think they should be the same, would have to verify against the official LLVM to know for sure.
Cheers, Robin
github.com/mewmew/float/binary128 is for IEEE 754 128-bit format, but I can't find the Big() method under binary128.Float
@dannypsnl, any updates? :) I saw you forked the float repo the other day. No rush, I'm just excited to make progress.
https://github.com/llir/llvm/issues/31#issuecomment-565886102
I just started working on it; my family came with me last weekend so I didn't do anything :).
// Signbit reports whether f is negative or negative 0.
func (f Float) Signbit() bool {
	// Assumes f.a holds the upper 64 bits; the sign is its most significant bit.
	return f.a&0x8000000000000000 != 0
}
// Exp returns the exponent of f.
func (f Float) Exp() int {
	// Mask the 15 exponent bits, then shift out the 48 fraction bits below them.
	return int((f.a & 0x7FFF000000000000) >> 48)
}
// Frac returns the fraction of f as its upper 48 bits (from f.a) and its
// lower 64 bits (from f.b).
func (f Float) Frac() (uint64, uint64) {
	return f.a & 0x0000FFFFFFFFFFFF, f.b
}
I just started working on it; my family came with me last weekend so I didn't do anything :).
No worries :) Family and friends come first! Btw, @dannypsnl do you have Discord for chat?
https://github.com/llir/llvm/issues/31#issuecomment-565952413
Just register one, dannypsnl#0805 as GitHub ID.
Just register one, dannypsnl as GitHub ID.
Great. Do you know the four digits at the end of the user name? (e.g. dannypsnl#1234)
Just register one, dannypsnl as GitHub ID.
Great. Do you know the four digits at the end of the user name? (e.g. dannypsnl#1234)
Updated, didn't know it required that number www
Great, I added you now on Discord.
@mewmew I took another look at the documentation. It says
from: https://llvm.org/docs/LangRef.html#id1130 The 128-bit format used by PowerPC (two adjacent doubles) is represented by 0xM followed by 32 hexadecimal digits.
and
from: https://llvm.org/docs/LangRef.html#id1118 ppc_fp128 | 128-bit floating-point value (two 64-bits)
They point out 64-bits and doubles, which seems to mean that two float64 values are used to build ppc_fp128.
So I compared it with https://en.wikipedia.org/wiki/IBM_hexadecimal_floating_point#Extended-precision_128-bit, and I have reason to believe they are different things according to https://en.wikipedia.org/wiki/Long_double. In the Implementations section, it actually says
On some PowerPC and SPARCv9 machines,[citation needed] long double is implemented as a double-double arithmetic, where a long double value is regarded as the exact sum of two double-precision values, giving at least a 106-bit precision; with such a format, the long double type does not conform to the IEEE floating-point standard.
The point was double-double arithmetic, it seems very like what we are looking for, two adjacent doubles!
I think the section Double-double arithmetic in https://en.wikipedia.org/wiki/Quadruple-precision_floating-point_format#Double-double_arithmetic describes ppc_fp128 in LLVM. But I cannot be sure; probably you already know this?
The point was double-double arithmetic, it seems very like what we are looking for, two adjacent doubles!
Yes, this definitely seems to be what ppc_fp128 corresponds to. Thanks for doing the research on this issue :)
From lib/Support/APFloat.cpp:
/* The IBM double-double semantics. Such a number consists of a pair of IEEE
64-bit doubles (Hi, Lo), where |Hi| > |Lo|, and if normal,
(double)(Hi + Lo) == Hi. The numeric value it's modeling is Hi + Lo.
Therefore it has two 53-bit mantissa parts that aren't necessarily adjacent
to each other, and two 11-bit exponents.
...
*/
A lot of progress has happened since last time. @dannypsnl finalized the support for fp128 and ppc_fp128.
Now, we only have a few test cases to sort out, see below:
$ go test -run=TestNewFloatFromStringFor
2019/12/25 17:41:14 unable to represent floating-point constant 3.8749999999999997779553950749687 of type ppc_fp128 exactly; please submit a bug report to llir/llvm with this error message
2019/12/25 17:41:14 unable to represent floating-point constant -3.63486592221937144421782551193426e-301 of type ppc_fp128 exactly; please submit a bug report to llir/llvm with this error message
2019/12/25 17:41:14 unable to represent floating-point constant -2.66 of type ppc_fp128 exactly; please submit a bug report to llir/llvm with this error message
--- FAIL: TestNewFloatFromStringForPPCFP128 (0.00s)
const_float_test.go:36: "0xM400C0000000000300000000010000000": floating-point value string mismatch; expected "0xM400C0000000000300000000010000000", got "0xM400C0000000000300000000000000000"
const_float_test.go:36: "0xM400F000000000000BCB0000000000000": floating-point value string mismatch; expected "0xM400F000000000000BCB0000000000000", got "0xM400F0000000000000000000000000000"
const_float_test.go:36: "0xM80000000000000000000000000000000": floating-point value string mismatch; expected "0xM80000000000000000000000000000000", got "0xM00000000000000000000000000000000"
const_float_test.go:36: "0xM818F2887B9295809800000000032D000": floating-point value string mismatch; expected "0xM818F2887B9295809800000000032D000", got "0xM818F2887B92958090000000000000000"
const_float_test.go:36: "0xMC00547AE147AE1483CA47AE147AE147A": floating-point value string mismatch; expected "0xMC00547AE147AE1483CA47AE147AE147A", got "0xMC00547AE147AE1480000000000000000"
--- FAIL: TestNewFloatFromStringForFP128 (0.00s)
const_float_test.go:82: "0xL00000000000000003fff000001000000": floating-point value string mismatch; expected "0xL00000000000000003fff000001000000", got "0xL00000000000000003FFF000001000000"
const_float_test.go:82: "0xL00000000000000003fff000002000000": floating-point value string mismatch; expected "0xL00000000000000003fff000002000000", got "0xL00000000000000003FFF000002000000"
const_float_test.go:82: "0xL00000000000000013fff000000000000": floating-point value string mismatch; expected "0xL00000000000000013fff000000000000", got "0xL00000000000000013FFF000000000000"
const_float_test.go:82: "0xL000fffff00000000000fffff00000000": floating-point value string mismatch; expected "0xL000fffff00000000000fffff00000000", got "0xL000FFFFF00000000000FFFFF00000000"
const_float_test.go:82: "0xL00ff00ff00ff00ff00ff00ff00ff00ff": floating-point value string mismatch; expected "0xL00ff00ff00ff00ff00ff00ff00ff00ff", got "0xL00FF00FF00FF00FF00FF00FF00FF00FF"
const_float_test.go:82: "0xL08000000000000003fff000000000000": floating-point value string mismatch; expected "0xL08000000000000003fff000000000000", got "0xL08000000000000003FFF000000000000"
const_float_test.go:82: "0xLf8f8f8f8f8f8f8f8f8f8f8f8f8f8f8f8": floating-point value string mismatch; expected "0xLf8f8f8f8f8f8f8f8f8f8f8f8f8f8f8f8", got "0xLF8F8F8F8F8F8F8F8F8F8F8F8F8F8F8F8"
FAIL
exit status 1
FAIL github.com/llir/llvm/ir/constant 0.003s
From 99fba125021e350598916eaef3a2a89b23c86371:
Also, extend TestNewFloatFromStringFor{PPCFP128,FP128} test cases to include a round-trip check. A few of these test cases are currently failing.
For ppc_fp128, the reason is that we still need to find a good way to implement float128ppc.NewFromBig which also stores half of the mantissa bits in low. Currently only high is used. See mewmew/float@5029c96 for more details.
For fp128, I'm not quite sure yet why the test cases are failing as I haven't looked into it.
@dannypsnl, do you want to take a look at the test cases of TestNewFloatFromStringForFP128?
Sure, I can take a look tomorrow. After researching again, I also feel the implementation is not correct: the fraction can be shared, but what about the sign and exponent? If we use the sum of two floats, then zero has multiple representations. Using 106 bits as the fraction seems closer to the target, but we need to find some implementations to confirm it.
On Thu, Dec 26, 2019 at 12:47 AM Robin Eklind notifications@github.com wrote:
@dannypsnl, do you want to take a look at the test cases of TestNewFloatFromStringForFP128?
--- FAIL: TestNewFloatFromStringForFP128 (0.00s)
const_float_test.go:82: "0xL00000000000000003fff000001000000": floating-point value string mismatch; expected "0xL00000000000000003fff000001000000", got "0xL00000000000000003FFF000001000000"
const_float_test.go:82: "0xL00000000000000003fff000002000000": floating-point value string mismatch; expected "0xL00000000000000003fff000002000000", got "0xL00000000000000003FFF000002000000"
const_float_test.go:82: "0xL00000000000000013fff000000000000": floating-point value string mismatch; expected "0xL00000000000000013fff000000000000", got "0xL00000000000000013FFF000000000000"
const_float_test.go:82: "0xL000fffff00000000000fffff00000000": floating-point value string mismatch; expected "0xL000fffff00000000000fffff00000000", got "0xL000FFFFF00000000000FFFFF00000000"
const_float_test.go:82: "0xL00ff00ff00ff00ff00ff00ff00ff00ff": floating-point value string mismatch; expected "0xL00ff00ff00ff00ff00ff00ff00ff00ff", got "0xL00FF00FF00FF00FF00FF00FF00FF00FF"
const_float_test.go:82: "0xL08000000000000003fff000000000000": floating-point value string mismatch; expected "0xL08000000000000003fff000000000000", got "0xL08000000000000003FFF000000000000"
const_float_test.go:82: "0xLf8f8f8f8f8f8f8f8f8f8f8f8f8f8f8f8": floating-point value string mismatch; expected "0xLf8f8f8f8f8f8f8f8f8f8f8f8f8f8f8f8", got "0xLF8F8F8F8F8F8F8F8F8F8F8F8F8F8F8F8"
I think these cases seem like just formatting problems.
I think these cases seem like just formatting problems.
Oh, you are right. Upper vs. lower-case. Good catch :)
Edit: fixed in rev 91340f2d089192eea08f26b20fca8f0bcb6c1a82.
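As an illustration of the upper-case versus lower-case mismatch, a round-trip check could compare the hexadecimal digits case-insensitively. This is only a sketch with a hypothetical helper, sameHexFloat; how the referenced revision actually resolved the failures is not shown in this thread.

```go
package main

import (
	"fmt"
	"strings"
)

// sameHexFloat is a hypothetical helper. It reports whether two hexadecimal
// floating-point literals share the same 3-character prefix (such as "0xL")
// and the same digits, ignoring the case of the digits.
func sameHexFloat(want, got string) bool {
	if len(want) < 3 || len(got) < 3 || want[:3] != got[:3] {
		return false
	}
	return strings.EqualFold(want[3:], got[3:])
}

func main() {
	want := "0xL00000000000000003fff000001000000"
	got := "0xL00000000000000003FFF000001000000"
	fmt.Println("exact match:     ", want == got)
	fmt.Println("case-insensitive:", sameHexFloat(want, got))
}
```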
Fixed a few of the failing ppc_fp128 test cases. Now we only have a single test case left :)
--- FAIL: TestNewFloatFromStringForPPCFP128 (0.00s)
const_float_test.go:36: "0xM400C0000000000300000000010000000": floating-point value string mismatch; expected "0xM400C0000000000300000000010000000", got "0xM400C0000000000300000000000000000"
FAIL
FAIL github.com/llir/llvm/ir/constant 3.210s
Did a quick debugging session, and it seems to be due to precision loss.
diff --git a/float128ppc/float128ppc.go b/float128ppc/float128ppc.go
index 9fdc9b3..27f0915 100644
--- a/float128ppc/float128ppc.go
+++ b/float128ppc/float128ppc.go
@@ -12,7 +12,7 @@ import (
const (
// precision specifies the number of bits in the mantissa (including the
// implicit lead bit).
- precision = 106
+ precision = 1048
)
Changing precision to 1048 makes the test cases pass, but they still fail for precision = 1047.
Any ideas on how to fix this without extending the precision beyond 106 (or 107)?
In this case, the problem is that the sum of high and low is not exact. I created an example for it:
package main
import (
"fmt"
"math/big"
"math"
)
func main() {
precision := uint(106)
// Operate on numbers of different precision.
var z big.Float
x := big.NewFloat(math.Float64frombits(0x0000000010000000)).SetPrec(precision)
y := big.NewFloat(math.Float64frombits(0x400C000000000030)).SetPrec(precision)
z.SetPrec(precision)
z.Add(x, y)
fmt.Printf("x = %.10g (%s, prec = %d, acc = %s)\n", x, x.Text('p', 0), x.Prec(), x.Acc())
fmt.Printf("y = %.10g (%s, prec = %d, acc = %s)\n", y, y.Text('p', 0), y.Prec(), y.Acc())
fmt.Printf("z = %.10g (%s, prec = %d, acc = %s)\n", &z, z.Text('p', 0), z.Prec(), z.Acc())
}
// Result:
// x = 1.326247369e-315 (0x.8p-1045, prec = 106, acc = Exact)
// y = 3.5 (0x.e000000000018p+2, prec = 106, acc = Exact)
// z = 3.5 (0x.e000000000018p+2, prec = 106, acc = Below)
We can see that z is Below, not Exact. The question then is whether that is correct. Since we give it the correct precision, the drop is as expected?
can see z is Below not Exact, then the question would be, it's correct? Since we give it correct precision, so the drop is as expected?
Good question. Do we have any floating-point experts among us? :) From the previous test, we need to extend the precision a lot to be able to get exact results for the addition. Namely, we need precision = 1048 instead of precision = 106.
Thanks for troubleshooting @dannypsnl!
so the drop is as expected?
Normally, yes - these are 64-bit floats and the precision does not cover that resolution. However, with double-double arithmetic, these values do not correspond to the same bits as just adding two floats together. According to Wikipedia (4), it should be possible to represent something like e-1074.
Just my thoughts after looking into it. I came across a few different things:
Taking a look at double-double precision on Wikipedia (4) and a Go double-double lib (2), it does not appear that the floats are simply just added together, they need to be dissected as each one does not contain symmetric significand and mantissa values. This is consistent with examples shown in (1) and (3) as well.
Also, looking at Go's math/big, it implements arbitrary precision arithmetic, not double-double, so I don't think we can be sure if that example is supposed to be correct.
I have also referenced the Julia DoubleDouble lib as I think that can provide an example for use cases and implementations.
1) https://stackoverflow.com/questions/9857418/double-double-precision-floating-point-as-sum-of-two-doubles 2) https://gist.github.com/grd/4050062#file-float128-go-L372 3) https://github.com/JuliaMath/DoubleDouble.jl 4) https://en.wikipedia.org/wiki/Quadruple-precision_floating-point_format#Double-double_arithmetic
I am not sure how much this helps, but just some thoughts and links for direction.
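For readers unfamiliar with double-double arithmetic, the building block used by such libraries is an error-free addition, often called TwoSum (attributed to Knuth). The sketch below is a general illustration of that technique and is not taken from the linked libraries or from float128ppc. It returns the rounded sum together with the rounding error, so that the pair represents the exact result.

```go
package main

import "fmt"

// twoSum returns s, the float64 sum of a and b, and e, the rounding error of
// that addition, so that a + b == s + e holds exactly (barring overflow).
func twoSum(a, b float64) (s, e float64) {
	s = a + b
	bb := s - a // the part of b that made it into s
	e = (a - (s - bb)) + (b - bb)
	return s, e
}

func main() {
	// 1e-20 is far below the resolution of a float64 near 1.0, so a plain
	// addition drops it; twoSum keeps it as the low part.
	hi, lo := twoSum(1.0, 1e-20)
	fmt.Println("hi:", hi)
	fmt.Println("lo:", lo)
}
```

The (hi, lo) pair has the same shape as the two doubles of a ppc_fp128 constant: the low part carries what the high part could not hold.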
Thanks for researching this @scottshotgg!
Also, looking at Go's math/big, it implements arbitrary precision arithmetic, not double-double, so I don't think we can be sure if that example is supposed to be correct.
Right, I was hoping we could just rely on extending the precision of the high and low floats to 106 bits using math/big and that the rest would follow.
I am not sure how much this helps, but just some thoughts and links for direction.
Definitely useful! Thanks for the links and your thoughts on this.
As for this issue, I think we can close it and start a new more specific one related to the precision loss of ppc_fp128 constants. I'd very much like to have perfect round-trips of ppc_fp128 constants without precision loss, and it's definitely a bug we need to resolve at some point.
That being said, I'd personally be ok with postponing fixing the ppc_fp128 precision loss issue until a future release™ and focus on getting the v0.3 release finalized.
@scottshotgg, @dannypsnl, @pwaller, what do you think? Is the ppc_fp128 precision loss a blocker for the v0.3 release? If not, I'll try to have v0.3 tagged before the end of the year :)
Cheerful regards from a snow-filled Sweden.
edit: issue #124 created.
Is the ppc_fp128 precision loss a blocker for the v0.3 release? If not, I'll try to have v0.3 tagged before the end of the year :)
No, I don't think it would be a blocker, postponing fixing is ok.
I agree with @dannypsnl, I don't think this is important for the release. PPC is already very obscure and I would be surprised if anyone actually uses 128-bit floats in combination.
Great, I'll close this issue then, and let #124 (which is targeted for a future release) track the precision loss of ppc_fp128 constants.
Hello all,
I would first like to say thanks for an amazing library and I appreciate the hard work and dedication to support LLVM through Go.
However, using the library in a project of my own to generate LLVM instructions for different variables, I ran into a problem where using NewFloat did not produce the expected results. I will demonstrate using a C program as a comparison.
Using clang -S -emit-llvm main.c on the following program:
Input C Program:
Produces the following store instruction for the float variable:
This is a 64 bit float with the last 28 bits dropped and converted to hex. (According to: http://lists.llvm.org/pipermail/llvm-dev/2011-April/039811.html)
However, attempting to generate the same instruction using this library:
where value is the float literal 1.1.
I obtained the following instruction:
Putting this into LLVM to generate assembly using:
Generates an error of:
I can provide more information if needed, but a few questions:
1) Is this expected behavior?
2) Should this be expected behavior?
3) If yes, why does this produce code that doesn't work?
I can get it to work using types.Double, and if that is the solution then so be it for now, but I'd like to investigate if this is actually the expected output.
Again, thanks for the work and dedication.
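The conversion described in the report, a float value written as the hexadecimal bits of a 64-bit double, can be sketched in a few lines of Go. The helper name floatHex is hypothetical. It widens a float32 to float64, which leaves the extra low mantissa bits zero, and prints the resulting bit pattern.

```go
package main

import (
	"fmt"
	"math"
)

// floatHex is a hypothetical helper. It widens f to float64 and formats the
// IEEE 754 bit pattern as "0x" followed by 16 upper-case hex digits.
func floatHex(f float32) string {
	return fmt.Sprintf("0x%016X", math.Float64bits(float64(f)))
}

func main() {
	fmt.Println(floatHex(1.1)) // prints 0x3FF19999A0000000
}
```

The output matches the constant in the store instruction that Clang emits for the float literal 1.1.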