Open jkr0103 opened 1 year ago
Memory allocations are expected, especially during generation of synthetic datasets. I can't add anything else without knowing what exactly you are running in this screenshot. There is no variable or argument to control it.
Is it possible to split the "generation of synthetic datasets" and the "actual benchmark execution" between two processes? My use case: I am trying to run these benchmarking algorithms in SGX using Gramine, where we have memory constraints. Hence I would like to know whether the synthetic datasets can be generated separately, so that only the benchmark execution happens inside SGX.
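To illustrate the kind of split being asked about, here is a minimal sketch in plain NumPy. It is not scikit-learn_bench's API: the function names `generate_dataset` and `run_benchmark` and the file name `X.npy` are hypothetical, and the "benchmark" is just a placeholder computation. The idea is that one process writes the synthetic data to disk, and a second process (the one that would run inside SGX) only loads it.

```python
import os
import tempfile

import numpy as np


def generate_dataset(path, n_samples=1000, n_features=8, seed=0):
    """Step 1 (outside the enclave): create synthetic data and save it to disk."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n_samples, n_features))
    np.save(path, X)  # writes a .npy file at `path`


def run_benchmark(path):
    """Step 2 (memory-constrained process): load the saved data and use it.

    mmap_mode="r" maps the file read-only instead of reading it all into
    memory up front. Whether this helps under Gramine/SGX is an assumption
    and would need to be measured.
    """
    X = np.load(path, mmap_mode="r")
    # Placeholder for the real benchmarked algorithm.
    return X.shape, float(X.mean())


if __name__ == "__main__":
    out_dir = tempfile.mkdtemp()
    data_path = os.path.join(out_dir, "X.npy")  # hypothetical file name
    generate_dataset(data_path)
    print(run_benchmark(data_path))
```

In a real setup the two functions would live in separate scripts, with only the second one launched under Gramine.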
Sorry, closed it by mistake.
This would be addressed by the pre-fetch capability in this PR: https://github.com/IntelPython/scikit-learn_bench/pull/133
I captured perf data for most of the algorithms and see that a large number of memory allocations happen during the run, which becomes a bottleneck. Please refer to the attached screenshot.
Is there a way to fine-tune the memory allocations, such as an environment variable or command-line argument?