Closed PeakKS closed 10 months ago
This should likely go in a RAII container like dvander mentioned on the previous PR - that way the resources are properly released on unload.
I'm no C++ whiz but I think this is the right way to do that:
static std::unique_ptr<FILE, int(*)(FILE*)> pMasterFD(fopen("/proc/self/maps", "r"), fclose);
std::unique_ptr<FILE, int(*)(FILE*)> pF(fdopen(dup(fileno(pMasterFD.get())), "r"), fclose);
That works, though it is slightly slower.
Elapsed CPU Time (CURRENT) = '4.22251' sec.
Elapsed CPU Time per Iteration (CURRENT, 500000) = '8.445022e-06' sec.
Elapsed CPU Time (DUP) = '0.289283' sec.
Elapsed CPU Time per Iteration (DUP, 500000) = '5.785660e-07' sec.
Elapsed CPU Time (UNIQUE) = '0.32418' sec.
Elapsed CPU Time per Iteration (UNIQUE, 500000) = '6.483600e-07' sec.
If only the static one is made a unique_ptr, the time penalty is very small.
Elapsed CPU Time (CURRENT) = '4.26076' sec.
Elapsed CPU Time per Iteration (CURRENT, 500000) = '8.521512e-06' sec.
Elapsed CPU Time (DUP) = '0.28382' sec.
Elapsed CPU Time per Iteration (DUP, 500000) = '5.676400e-07' sec.
Elapsed CPU Time (UNIQUE) = '0.289645' sec.
Elapsed CPU Time per Iteration (UNIQUE, 500000) = '5.792900e-07' sec.
I'm thinking leave the working pointer as normal but change the static to unique_ptr so it is sure to be cleaned properly.
I almost wonder if we want to wait and see if the C++17 change gets merged, and then use std::filesystem?
Plus, I think it'd probably also be better to do this in a ctor instead of statically in SH.
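For illustration, the constructor-based idea could look roughly like the sketch below. The names `FileCloser` and `LineReader` are hypothetical and not from the PR, and the line counting is only a stand-in for whatever parsing GetPageBits actually does. It uses a small deleter struct rather than the function-pointer deleter shown earlier in the thread; either form closes the file when the owner is destroyed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <memory>

// Deleter for std::unique_ptr: closes the stream when the owner goes away.
struct FileCloser {
    void operator()(std::FILE* f) const {
        if (f) std::fclose(f);
    }
};

// Hypothetical wrapper: opens the file once in the constructor and releases
// it automatically in the (implicit) destructor.
class LineReader {
public:
    explicit LineReader(const char* path) : file_(std::fopen(path, "r")) {}

    // True if the file was opened successfully.
    bool ok() const { return file_ != nullptr; }

    // Rewinds to the start and counts newline characters.
    // Precondition: ok(). Not safe to call concurrently on one object,
    // because the stream position is shared state.
    std::size_t count_lines() {
        assert(file_ && "check ok() before reading");
        std::rewind(file_.get());
        std::size_t lines = 0;
        int c;
        while ((c = std::fgetc(file_.get())) != EOF) {
            if (c == '\n') ++lines;
        }
        return lines;
    }

private:
    std::unique_ptr<std::FILE, FileCloser> file_;
};
```

An instance held as a member of a longer-lived object would then be cleaned up together with that object, which is the property the RAII suggestion is after.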
Did a bit more testing but it seems the best impl is still making the static master a unique_ptr and duping it. Should be all good now.
Never mind. With further testing I have found out that the file position is shared between descriptors created with dup (they refer to the same open file description), so dup does not solve this. A thread_local unique_ptr with fseek is a little faster at least...
Elapsed CPU Time (CURRENT) = '4.53125' sec.
Elapsed CPU Time per Iteration (CURRENT, 500000) = '9.062500e-06' sec.
Elapsed CPU Time (THREAD) = '3.86515' sec.
Elapsed CPU Time per Iteration (THREAD, 500000) = '7.730302e-06' sec.
But I'd like to find something better.
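The shared file position mentioned above can be checked with a small POSIX program. This is a minimal sketch; the function name `dup_shares_offset` is made up for the example. It reads through the duplicate and then asks the original descriptor for its current offset.

```cpp
#include <cassert>
#include <fcntl.h>
#include <unistd.h>

// Returns true if reading through a dup()ed descriptor also advances the
// offset observed through the original descriptor.
bool dup_shares_offset(const char* path) {
    assert(path != nullptr);

    int fd = open(path, O_RDONLY);
    if (fd < 0) return false;

    int copy = dup(fd);
    if (copy < 0) {
        close(fd);
        return false;
    }

    char buf[16];
    ssize_t n = read(copy, buf, sizeof buf);  // advance via the duplicate
    off_t pos = lseek(fd, 0, SEEK_CUR);       // query via the original

    close(copy);
    close(fd);

    // If the offset were per-descriptor, pos would still be 0 here.
    return n > 0 && pos == static_cast<off_t>(n);
}
```

On a POSIX system this should return true for any readable, seekable file with at least one byte, which is why two threads working on dup-ed copies of one descriptor still interfere with each other's position.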
Closing due to the initial premise being broken. There are some other ideas for optimization but they should be their own PRs.
This is another significant speedup to GetPageBits. By only opening the file once statically, and just dup-ing it to a local working file pointer it should stay safe while removing most fopen overhead. Benchmarks: