exercism / euphoria

Exercism exercises in Euphoria.
https://exercism.org/tracks/euphoria
MIT License
3 stars 8 forks source link

Add extra docs for grains #63

Closed glennj closed 3 months ago

glennj commented 5 months ago

@petelomax I'm interested in what you have to say about this: have I got it right?

Some of the wording is taken directly from the same topic I wrote about for the jq track

axtens commented 5 months ago

Could ask @ghaberek too

ghaberek commented 5 months ago

Unfortunately the manual still mentions 32-bit values in many places, both online and in the release bundles. As of 4.1 and later, Euphoria is now 64-bit capable so the maximum size of the integer type is now much larger on those platforms. Simply put, the integer type can store n-1 signed bits because the upper bit is used to flag object types, namely atom and sequence. See source/execute.h for the implementation details.

ARCH MININT MAXINT RANGE
32-bit 0xC0000000 0x3FFFFFFF -1,073,741,824 to 1,073,741,823
64-bit 0xC000000000000000 0x3FFFFFFFFFFFFFFF -4,611,686,018,427,387,904 to 4,611,686,018,427,387,903
petelomax commented 5 months ago

Oh, hang on: the max int sizes above are pretty irrelevant. You need to use atoms, or a bigint libary of some kind or maybe even strings! The docs should certainly make it clear you might struggle with native numbers: Under 32-bit atoms are 64-bit floats which means only 53 bits of precision and a maximum (accurate) integer value of 9,007,199,254,740,992 - which is not enough for the final part. Under 64-bit atoms are 80-bit floats with a precision of 64 bits and a maximum (accurate) integer value of 18,446,744,073,709,551,616 - just one more than the final answer. A "64-bit Euphoria integer" is a factor of 4 too small.

Now that I've thought of it, I think (obviously as well as atom answers) it would be pretty cool to accept submissions which gave string answers! I've posted a handful of rosettacode entries that perform string math, powers of 2 and -1 ain't exactly hard, plus Phix (at least) is blessed with a [s]printf() that produces impressively accurate exact powers of 2 (as per C++/boost). Lastly t_grains.e could avoid having (/refuse) to deal wth gmp/bigatom/hugeint and everything else, without banning their use. Of course it is just eg test_true("returns the total number of grains on the board", find(totalgrains(), {18446744073709551615, "18446744073709551615"}))

Sadly, most of what is in append.instructions.md is simply wrong for this task, and if at all possible it should link to some official docs for that level of detail anyway. For Phix I'd probably use http://phix.x10.mx/docs/html/atoms.htm but I cannot find anything like that in https://openeuphoria.org/docs/

glennj commented 4 months ago

It seems like there's something particular about the number power(2,63). I've been experimenting with the test suite and that number is the only one that needs to be expressed as a float.

test_equal("grains on square 62",2_305_843_009_213_693_952,square(62))
test_equal("grains on square 63",4_611_686_018_427_387_904,square(63))

--test_equal("grains on square 64",9_223_372_036_854_775_808,square(64))
procedure test_square64()
    atom actual = square(64)

    atom expected = 9_223_372_036_854_775_808
    print(1, {expected, sprintf("%21.0f", expected)})
    puts(1, "\n")
    test_equal("grains on square 64", expected, actual)   -- FAILS

    atom asfloat = 9_223_372_036_854_775_808.0
    print(1, {asfloat, sprintf("%21.0f", asfloat)})
    puts(1, "\n")
    test_equal("grains on square 64", asfloat, actual)   -- PASSES
end procedure
test_square64()

--test_equal("returns the total number of grains on the board", 18_446_744_073_709_551_615, totalgrains())
procedure test_total()
    atom actual = totalgrains()
    atom expected = 18_446_744_073_709_551_615
    print(1, {expected, sprintf("%21.0f", expected)})
    puts(1, "\n")
    test_equal("returns the total number of grains on the board", expected, actual)
end procedure
test_total()
  passed: grains on square 62
  passed: grains on square 63
  failed: grains on square 64, expected: -9.22337203685478e+18 but got: 9.22337203685478e+18
  passed: grains on square 64
  passed: returns the total number of grains on the board
...
{-9.223372037e+18,{32,45,57,50,50,51,51,55,50,48,51,54,56,53,52,55,55,53,56,48,56}}
{9.223372037e+18,{32,32,57,50,50,51,51,55,50,48,51,54,56,53,52,55,55,53,56,48,56}}
{1.844674407e+19,{32,49,56,52,52,54,55,52,52,48,55,51,55,48,57,53,53,49,54,49,53}}

observations:

glennj commented 4 months ago

Gents, I continue to believe that some documentation about euphoria numbers would be beneficial for students of this exercise. I am unqualified to write it. Does anyone else want to give it a try?

petelomax commented 4 months ago

Perhaps we should just be a bit more lenient. How does this sound:

Numbers in Euphoria

There are two data types to hold numbers, integer and atom. Note that integers are signed and one bit smaller than the machine word, hence too small for this task. Before complaining, recall that JavaScript has no integers at all and does everything in floats(/atoms). Atoms can hold larger integers, with the needed 64 bits of precision for this task under 64-bit, however under 32-bit (and if you're still running that, what kind of dinosaur are you?!) they are limited to 53 bits of precision (same as JavaScript, btw).

There is (spoiler alert) also an outstanding issue in the current release of Euphoria whereby (even on 64 bit) power(2,63) != 9223372036854775808. They are also not equal under 32-bit for a quite different reason: while there may be a way (one day) to improve power() on 64-bit, on 32-bit the parser rounds in decimal but the FPU rounds in binary, and since both sides on that platform are wrong anyway, absolutely not worth even trying to fix. Besides, asking for or expecting absolute perfection at, near, or past the limits of precision is and always will be a recipe for disaster.

To get round both these issues, the current test runner accepts and ignores small errors in the results, as long as they are accurate to ten decimal places - rather than compare the numbers themselves directly for equality, it compares their "%g"-printed representations.

Also note the official docs for Euphoria still state the 32-bit limits in several places.

glennj commented 4 months ago

To get round both these issues, the current test runner accepts and ignores small errors in the results, as long as they are accurate to ten decimal places - rather than compare the numbers themselves directly for equality, it compares their "%g"-printed representations.

Where does it do that?

glennj commented 4 months ago

A minor rewrite.

raw markdown (let's see if github renders it)

# Numbers in Euphoria

There are two data types to hold numbers, integer and atom.
Integers are signed and are _one bit smaller than the machine word_, hence too small for this task.
Atoms can hold larger integers, with the needed 64 bits of precision for this task.

<details><summary>
There is (spoiler alert) also an outstanding issue in the current release of Euphoria whereby 
<code>power(2,63)</code> is <strong>not equal</strong> to the <em>integer</em> <code>9223372036854775808</code>.
</summary>

It is, however, equal to the _float_ `9223372036854775808.0`.
</details>

## 32-bit machines

If you are using a 32-bit computer, then numbers are limited to 53 bits of precision (same as JavaScript).

Additionally, `power(2,63)` is not equal to `9223372036854775808` for a quite different reason: while there may be a way (one day) to improve `power()` on 64-bit, on 32-bit the parser rounds in decimal but the FPU rounds in binary. Since both sides on that platform are wrong anyway it will never be fixed.

The official docs for Euphoria still state, in several places, number limits for the 32-bit situation.

Rendered:


Numbers in Euphoria

There are two data types to hold numbers, integer and atom. Integers are signed and are one bit smaller than the machine word, hence too small for this task. Atoms can hold larger integers, with the needed 64 bits of precision for this task.

There is (spoiler alert) also an outstanding issue in the current release of Euphoria whereby power(2,63) is not equal to the integer 9223372036854775808. It is, however, equal to the _float_ `9223372036854775808.0`.

32-bit machines

If you are using a 32-bit computer, then numbers are limited to 53 bits of precision (same as JavaScript).

Additionally, power(2,63) is not equal to 9223372036854775808 for a quite different reason: while there may be a way (one day) to improve power() on 64-bit, on 32-bit the parser rounds in decimal but the FPU rounds in binary. Since both sides on that platform are wrong anyway it will never be fixed.

The official docs for Euphoria still state, in several places, number limits for the 32-bit situation.

petelomax commented 4 months ago

I was just thinking of

function g(atom a) return sprintf("%g",a) end function
--test_equal("grains on square 64",square(64),9223372036854775808)
test_equal("grains on square 64",g(square(64)),g(9223372036854775808))

But only if you really need to. The other approach (and also only when/where needed) is this:

atom r = square(64)
if r=9223372036854775808 then
    -- parser glitch, technically wrong but let it pass.
    test_equal("grains on square 64",r,9223372036854775808)
else
    test_equal("grains on square 64",r,9223372036854775808.0)
end if

Or some variant that both works and covers all the sticky edge cases we've run into (ie, that's completely untested).

I'm not seeing that details/summary adds any value, btw. Why hide(by default) a possible fix? Plus we should mention (assuming one of the above covers it) that the tests are being lenient about that.

It is in fact 32-bit versions of Euphoria, which still run happily on 64-bit machines, rather than 32-bit machines. Perhaps we could just say: Note that 32-bit versions of Euphoria are wholly inadequate and in no way supported for this specific task. (Such still exist only so that some legacy programs can still interface to 32-bit libs that still have no 64-bit version.)

axtens commented 4 months ago

So are two finished with this discussion? If so, what's the final form?

petelomax commented 4 months ago

I'll accept whatever someone else decides (Phix will be a little different - 32 bit is vital to that because of JavaScript/p2js). To my mind, there are two separate questions that need answering: 1) Should the power(2,63) glitch be stuffed in everyone's faces or should the tests quietly gloss over that? [I favour the latter] 2) What is the minimum maths-info needed on the exercise, that is currently missing from the official docs? [and is it right]

Of course neither of those really apply in the same way to Phix, so my head's not in the right place to progress this anyway.

FYI, I've just spotted this in Go (in which uint64s cope with this exercise anyway, so this probably amounts to nothing):

package main import ("fmt"; "math")

func main() { var x float64 = math.Pow(2, float64(63)) fmt.Printf("Type: %T, value: %d\n", x, uint64(x)) fmt.Printf("Type: %T, value: %d\n", x, uint64(x-512)) fmt.Printf("Type: %T, value: %d\n", x, uint64(x-513)) fmt.Printf("Type: %T, value: %d\n", x, uint64(x-1000)) } Type: float64, value: 9223372036854775808 Type: float64, value: 9223372036854775808 Type: float64, value: 9223372036854774784 Type: float64, value: 9223372036854774784

So it is using a) a float64 with only 53 bits of precision and b) some special magic sauce in Printf, the latter of which I've just added to 32-bit Phix (or more accurately tweaked some already working but 64-bit-biased code), but that's not going to help Euphoria one iota. So technically, and ignoring the uint64 sidestep, the same or at least a fairly similar issue is actually present in at least one other mainstream 64-bit programming language. Just so you know.

axtens commented 4 months ago

I'm for glossing over in the first instance. But I'm way out of my depth with the other stuff. I've written to @ghaberek for comment. His work has thrown him a curveball (I think that's the correct US idiom for it).

ghaberek commented 4 months ago

Glossing over this conversation a bit trying to catch up. (I also realized I can participate in these conversations via GitHub app on my phone, which helps.)

I would say the best approach in the context of teaching Euphoria is that there are only two data types: atom for numbers and sequence for arrays. The integer data type should be described as "specialized" and is typically used at intermediate and higher levels to further optimize one's code. The implementation details do not matter to the student, nor should they be bothered with having to consider when they might-or-might-not need to use integer versus atom. They might need to know that since atom can contain decimals, it may be necessary to floor() values to maintain whole numbers. But that is a math problem in general, not a specific Euphoria or data type problem.

petelomax commented 3 months ago

OK, its, time. Can we just pull whatever Glenn's got, and I'll throw a PR at it (which you'll be free to reject) and we'll go from there.

PS I ended up with something quite different for Phix, and made it all work on 32-bit.

glennj commented 3 months ago

Resurrected in #114