pietroppeter commented 4 years ago

This was a bit tricky. Fixes #27 and problem with self multiplication.

Essentially the issue was that for a generic operation op(a: var BigInt, b: BigInt, c: BigInt) when calling setXLen on a.limbs if b is the same as a then b.limbs gets reset to empty inside op (and c is only involved for computing resetting length). For a minimal example that shows this (playground):

import sugar

var
  a = @[0]

proc op(a: var seq[int], b: seq[int]) =
  dump a.repr
  dump b.repr
  a.setLen(2)
  dump a.repr
  dump b.repr

op(a, a)

output:

a.repr = 0x7ff35fdfd050@[0]

b.repr = 0x7ff35fdfd050@[0]

a.repr = 0x7ff35fdfd1a0@[0, 0]

b.repr = 0x7ff35fdfd050@[]

Note that the "reset" is not always the case, in particular issue #27 if added to the tester.nim file did not raise an error, while it raises an error when in a separate file (tissue27.nim, added to tests). This is possibly due to the fact that when resetting length, in the heap there is space for lengthening the sequence without copying (probably the sequence was allocated in a space previously occupied by a longer sequence).

The fix I propose is to change behaviour of SetXLen to always create a new sequence and copying the content according to new length.

unrelated change: added .exe to .gitignore.

pietroppeter commented 4 years ago

After review, open points:

[x] add switch for old behaviour
[x] benchmark old vs new behaviour
[ ] ~~other options for the fix?~~
[x] revise the version bump (minor increment?)

regarding other options I was thinking of having a check inside functions where SetXLen might cause issue. The check should check if we are using the same input and probably the best way would be to check if limbs have same addr. Need to think this through anyway.

pietroppeter commented 3 years ago

finally added the benchmarking against old setXLen. I am a bit disappointed that chainedAddition task almost doubles its time, so I am now thinking that I should consider other options for the fix as outlined in last comment.

For other comments regarding benchmark results: simple addition and multiplication are not impacted, which is kind of expected. I was surprised that chainedMultiplication is not impacted as much as chainedAddition: I guess this is because multiplication even with old setXLen needs to copy the sequence since the memory allocated is not sufficient to extend. In principle the optimal way to deal with var that grow is to preallocate a seq with enough capacity (if possible). I guess this is doable with current API (there is an init that can get as parameter the seq of limbs and this can be created with a preset capacity).

Below the automatic benchmarking report (available also here). A few notes on benchmark implementation:

the idea is to add full benchmarking to the repo (as a separate PR, I will clean up this one after it is completed)
benchmark is based on https://github.com/disruptek/criterion and is run again (including report generation) running nimble bench
regarding the strange labels for parameters (100d means using 2 100 digits numbers), this is partly due to a limitation of criterion in handling parameters. anyway looking at benchmark.nim should clarify what is being used.
idea is to implement all benchmark from https://peterolson.github.io/BigInteger.js/benchmark/
and also add other benchmarks such as those called chainedAddition and chainedMultiplication found in bigints source code (commented)
ideally the benchmarking in the master there should be between this library and bignum.

Benchmark results

addition

100d

	cpuTimes	CI(lower)	CI(upper)
benchmark.json	135.6	130.6	143.0
benchmark_oldSetXLen.json	139.4	133.6	147.9

chainedAddition

100d,3i

	cpuTimes	CI(lower)	CI(upper)
benchmark.json	907.7	883.0	933.5
benchmark_oldSetXLen.json	574.2	558.4	593.8

100d,33i

	cpuTimes	CI(lower)	CI(upper)
benchmark.json	7614.7	7535.1	7696.5
benchmark_oldSetXLen.json	4147.9	4098.1	4203.3

multiplication

100d

	cpuTimes	CI(lower)	CI(upper)
benchmark.json	460.8	449.5	479.0
benchmark_oldSetXLen.json	456.4	445.0	473.9

chainedMultiplication

10d,3i

	cpuTimes	CI(lower)	CI(upper)
benchmark.json	22098.0	21961.3	22266.7
benchmark_oldSetXLen.json	21685.0	21591.8	21807.4

pietroppeter commented 3 years ago

Sorry for leaving this open for such a long time... I think the easiest option would be actually merging the fix (bumping the version) and maybe opening an issue to track the possibility of further improvement. In the next days I plan to clean up this towards this goal.

pietroppeter commented 3 years ago

cleaned up (removed also the benchmarks, if this branch is not deleted the code should be still available in this commit) and ready to be merged.

also created the issue for tracking performance regression: https://github.com/def-/nim-bigints/issues/36

After merging remember to tag the release both of 0.5.0 (latest commit after merge) and of 0.4.4 (latest commit before merge), so that nimble can pick up requirements <= 0.4.4 in case someone wants to use it.

def- commented 3 years ago

Thanks, will tag them.

nim-lang / bigints

Fix operations with self #35

Benchmark results

addition

100d

chainedAddition

100d,3i

100d,33i

multiplication

100d

chainedMultiplication

10d,3i