Hi, I am submitting a patch that is the result of a memory problem I am investigating.
At the moment I cannot share the code that exhibits the problem, but I was able to isolate it in a separate program:
package main

import (
    "flag"
    "fmt"
    "math/big"
    "os"
    "runtime"
    "runtime/pprof"

    "go.starlark.net/starlark"
)

var testcase = flag.Int("testcase", 0, "testcase")

func main() {
    flag.Parse()

    f, _ := os.Create("profile")
    result := make([]starlark.Value, 0, 1024*1024*10)
    defer func() {
        // Force a collection so the heap profile reflects retained memory only.
        runtime.GC()
        pprof.Lookup("heap").WriteTo(f, 0)
        f.Close()
        runtime.KeepAlive(result)
    }()

    var integer starlark.Int
    fmt.Printf("Testcase %d\n", *testcase)
    switch *testcase {
    case 0:
        // Small integer (fits in Starlark's small-int representation).
        integer = starlark.MakeInt64(1 << 16)
    case 1:
        // Big integer (~100 bits): 2*3*...*30.
        integer = starlark.MakeBigInt(new(big.Int).MulRange(2, 30))
    default:
        // int64 above 32 bits.
        integer = starlark.MakeInt64(1 << 32)
    }

    for i := 0; i < 1024*1024*10; i++ {
        value, _ := starlark.Call(&starlark.Thread{}, starlark.Universe["float"], starlark.Tuple{integer}, nil)
        result = append(result, value)
    }
}
This code explores three use cases that are representative of the float builtin function.
In the first case it converts a small integer (less than 32 bits) to a float; in the second it converts a big one (~100 bits); the third uses a 64-bit number above 32 bits.
I run it and obtain the results through go tool pprof, like go run . -testcase <n> && go tool pprof -top profile && rm profile (from the cmd/starlark directory).
Below are the results showing the problem.
Case 0:
Type: inuse_space
Time: Nov 23, 2022 at 2:01pm (CET)
Showing nodes accounting for 247.50MB, 99.40% of 249MB total
Dropped 18 nodes (cum <= 1.25MB)
flat flat% sum% cum cum%
160MB 64.26% 64.26% 247.50MB 99.40% main.main
87.50MB 35.14% 99.40% 87.50MB 35.14% go.starlark.net/starlark.float
0 0% 99.40% 87.50MB 35.14% go.starlark.net/starlark.(*Builtin).CallInternal
0 0% 99.40% 87.50MB 35.14% go.starlark.net/starlark.Call
0 0% 99.40% 247.50MB 99.40% runtime.main
As expected, it uses some space in main.main for the slice and around 88 MB for the floats.
Case 1:
Type: inuse_space
Time: Nov 23, 2022 at 2:02pm (CET)
Showing nodes accounting for 242.50MB, 99.38% of 244MB total
Dropped 14 nodes (cum <= 1.22MB)
flat flat% sum% cum cum%
160MB 65.57% 65.57% 242.50MB 99.38% main.main
82.50MB 33.81% 99.38% 82.50MB 33.81% go.starlark.net/starlark.float
0 0% 99.38% 82.50MB 33.81% go.starlark.net/starlark.(*Builtin).CallInternal
0 0% 99.38% 82.50MB 33.81% go.starlark.net/starlark.Call
0 0% 99.38% 242.50MB 99.38% runtime.main
As expected this takes much more time to run (because of the big.Int arithmetic), but in the end the memory is pretty similar to case 0 (the difference can be attributed to my very rough way of measuring and varies between runs).
Case 2:
This case is the problem I'm facing: 80 MB are allocated for the transient big.Float values employed in the computation.
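For context, converting the big representation goes through a transient big.Float, roughly like the following (a simplified sketch of the conversion path with a hypothetical helper name, not the exact library code):

package main

import "math/big"

// bigIntToFloat64 sketches the conversion path: a transient big.Float is
// allocated just to round the big.Int to the nearest float64, then discarded.
func bigIntToFloat64(x *big.Int) float64 {
    f, _ := new(big.Float).SetInt(x).Float64()
    return f
}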
My explanation (hopefully right) of the problem is that the float builtin hits a very specific worst-case scenario for Go's tiny allocator (which has been there for quite a while) whenever the big.Int allocates at most 8 bytes; in Starlark terms, that is exactly the case where the number is in the [2^32, 2^64) range.
In this case, the float function first allocates 8 bytes in the tiny chunk (which is 16 bytes). Immediately afterwards it allocates another 8 bytes in the same chunk (not always, since it depends on how much memory is left in the chunk, but often enough to be noticeable). The first allocation could be collectable, but since the tiny allocator frees memory only when all the objects in a chunk become unreachable, the entire chunk gets effectively tied to the lifetime of the starlark.Float (as a starlark.Value), in practice doubling its "real" size (and keeping the nat part of the big.Int alive).
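The effect can be reproduced outside Starlark. Below is my own standalone illustration (not code from the patch); it assumes consecutive 8-byte pointer-free allocations from the same goroutine land in the same 16-byte tiny block, which is how the current runtime behaves:

package main

import (
    "fmt"
    "runtime"
)

// sink forces the transient allocations onto the heap (otherwise escape
// analysis would keep them on the stack and hide the effect).
var sink *int64

func main() {
    const n = 1 << 20
    kept := make([]*int64, 0, n)

    for i := 0; i < n; i++ {
        sink = new(int64)               // transient 8-byte object: may share a 16-byte tiny block...
        kept = append(kept, new(int64)) // ...with this retained 8-byte object
    }
    sink = nil // drop the last transient reference before measuring

    runtime.GC()
    var ms runtime.MemStats
    runtime.ReadMemStats(&ms)
    // kept accounts for ~8 MB of int64s plus ~8 MB of pointers; if each
    // transient shares a tiny block with a retained object, roughly another
    // 8 MB can stay resident even though nothing references it anymore.
    fmt.Printf("HeapInuse after GC: %d MB\n", ms.HeapInuse>>20)
    runtime.KeepAlive(kept)
}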
This patch adds a "fast" path for the case where the big.Int is small enough to fit in 64 bits.
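A sketch of the idea (a hypothetical helper, not the patch itself): when the value fits in 64 bits, convert directly; Go rounds integer-to-float64 conversions to the nearest representable value, so the result should match the big.Float path while skipping the transient allocation.

package main

import "math/big"

// floatOf sketches the fast path: direct conversion for 64-bit values,
// falling back to the transient big.Float only for larger ones.
func floatOf(x *big.Int) float64 {
    if x.IsInt64() {
        return float64(x.Int64()) // no transient allocation
    }
    if x.IsUint64() {
        return float64(x.Uint64())
    }
    f, _ := new(big.Float).SetInt(x).Float64() // slow path for >64-bit values
    return f
}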
As a side note, I couldn't think of a representative test case/benchmark to show the problem in a normal go test; if you could provide some guidance on that, it would be great.
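The closest I got is an allocation-reporting benchmark like the sketch below; it shows per-call allocation counts for the [2^32, 2^64) range, but not the retention effect, which only appears when results are kept alive across a GC:

package starlark_test

import (
    "testing"

    "go.starlark.net/starlark"
)

func BenchmarkFloatOfInt64Range(b *testing.B) {
    thread := new(starlark.Thread)
    args := starlark.Tuple{starlark.MakeInt64(1 << 32)} // in [2^32, 2^64): one-word big.Int
    b.ReportAllocs()
    for i := 0; i < b.N; i++ {
        if _, err := starlark.Call(thread, starlark.Universe["float"], args, nil); err != nil {
            b.Fatal(err)
        }
    }
}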