leanprover / lean4

Lean 4 programming language and theorem prover
https://lean-lang.org
Apache License 2.0

Nat.repr slow for very large literals #5771

Open nomeata opened 1 week ago

nomeata commented 1 week ago

When experimenting with encoding sets as bitvectors, represented as Nats (which seems like it should be rather efficient), I noticed that processing the file in VSCode was quick enough, but lake build would take a long time, with lean busy writing the .c file.

I suspect that Nat.repr is just very slow on larger literals, given that it goes through List Char rather than allocating a String of the right size and then (linearly) filling it digit by digit (or chopping the Nat into USize-sized limbs, printing each limb with the C code, and concatenating efficiently).
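Roughly, the chopping idea I have in mind looks like the untested sketch below; limbDigits and reprLimbs are names I made up here, and each step still pays for a bignum divmod, so this only attacks the constant factor of building a List Char, not the asymptotics.

-- Untested sketch; `limbDigits`/`reprLimbs` are made-up names, not library functions.
-- Each recursion still performs a bignum divmod, so this only reduces the constant
-- factor of going through `List Char`.
def limbDigits : Nat := 9          -- 10^9 fits comfortably in a USize limb

partial def reprLimbs (n : Nat) : String :=
  let base := 10 ^ limbDigits
  if n < base then
    Nat.repr n                      -- small case: the existing code is fine
  else
    let lo := Nat.repr (n % base)
    -- left-pad the low chunk to exactly `limbDigits` digits
    reprLimbs (n / base) ++ "".pushn '0' (limbDigits - lo.length) ++ lo

#eval reprLimbs (10 ^ 20)           -- "100000000000000000000"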

It also seems to scale quadratically with the length of the number, as

#time #guard_msgs(drop all) in #eval (Nat.repr (10^50000)).length
#time #guard_msgs(drop all) in #eval (Nat.repr (10^100000)).length
#time #guard_msgs(drop all) in #eval (Nat.repr (10^200000)).length
#time #guard_msgs(drop all) in #eval (Nat.repr (10^400000)).length

shows. On live.lean-lang.org this gives the following timings; note that each doubling of the exponent roughly quadruples the time:

time: 622ms
time: 2374ms
time: 9302ms
time: 37342ms

Versions

"4.12.0-nightly-2024-10-18"

Impact

Add :+1: to issues you consider important. If others are impacted by this issue, please ask them to add :+1: to it.

digama0 commented 1 week ago

Base conversion is algorithmically quadratic. (Every digit requires doing what amounts to (n % 10, n / 10), and there are not many shortcuts.) I don't think the List Char thing (which is linear) is what you are witnessing. As long as you only do base conversion up to a bounded length (by breaking the number into limbs of some size), this should be much faster.
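To spell that out, a bare-bones conversion loop looks something like the sketch below (not necessarily the exact library code). Each step does a full-width divmod on the remaining bignum, which costs time proportional to its current length, so producing d digits costs on the order of d + (d-1) + ... + 1, roughly d²/2 limb operations.

-- Sketch of a naive base-10 conversion; not necessarily the exact library code.
-- Each recursive call does a full-width divmod by 10 on the remaining bignum.
partial def toDigitsNaive (n : Nat) (acc : List Char := []) : List Char :=
  let d := Char.ofNat (n % 10 + '0'.toNat)
  if n < 10 then d :: acc
  else toDigitsNaive (n / 10) (d :: acc)

#eval String.mk (toDigitsNaive 120034)  -- "120034"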

nomeata commented 1 week ago

Ok, but presumably there are still some large constant factors on the table here?

digama0 commented 1 week ago

Yes, possibly; I'm just saying that these numbers don't demonstrate that. If large number literals are transitioned to use a from_array function instead of translating a string, the calls to Nat.repr would disappear except for small numbers, where I think they are not a bottleneck.