alliedmodders / metamod-source

Metamod:Source - C++ Plugin Environment and Detour Library for the Source Engine
http://www.metamodsource.net/
Other
386 stars 88 forks source link

Duplicate procfs file pointer instead of reopening #130

Closed PeakKS closed 10 months ago

PeakKS commented 1 year ago

This is another significant speedup to GetPageBits. By only opening the file once statically, and just dup-ing it to a local working file pointer it should stay safe while removing most fopen overhead. Benchmarks:

Elapsed CPU Time (CURRENT) = '4.29449' sec.
Elapsed CPU Time per Iteration (CURRENT, 500000) = '8.588978e-06' sec.
Elapsed CPU Time (DUP) = '0.291668' sec.
Elapsed CPU Time per Iteration (DUP, 500000) = '5.833360e-07' sec.
KyleSanderson commented 1 year ago

This should likely go in a RAII container like dvander mentioned on the previous PR - that way the resource are properly released on unload.

PeakKS commented 1 year ago

This should likely go in a RAII container like dvander mentioned on the previous PR - that way the resource are properly released on unload.

I'm no C++ whiz but I think this is the right way to do that:

static std::unique_ptr<FILE, int(*)(FILE*)> pMasterFD(fopen("/proc/self/maps", "r"), fclose);
std::unique_ptr<FILE, int(*)(FILE*)> pF(fdopen(dup(fileno(pMasterFD.get())), "r"), fclose);

That works though it is slightly slower.

Elapsed CPU Time (CURRENT) = '4.22251' sec.
Elapsed CPU Time per Iteration (CURRENT, 500000) = '8.445022e-06' sec.
Elapsed CPU Time (DUP) = '0.289283' sec.
Elapsed CPU Time per Iteration (DUP, 500000) = '5.785660e-07' sec.
Elapsed CPU Time (UNIQUE) = '0.32418' sec.
Elapsed CPU Time per Iteration (UNIQUE, 500000) = '6.483600e-07' sec.

If only the static one is made unique_ptr the time penalty is very small.

Elapsed CPU Time (CURRENT) = '4.26076' sec.
Elapsed CPU Time per Iteration (CURRENT, 500000) = '8.521512e-06' sec.
Elapsed CPU Time (DUP) = '0.28382' sec.
Elapsed CPU Time per Iteration (DUP, 500000) = '5.676400e-07' sec.
Elapsed CPU Time (UNIQUE) = '0.289645' sec.
Elapsed CPU Time per Iteration (UNIQUE, 500000) = '5.792900e-07' sec.

I'm thinking leave the working pointer as normal but change the static to unique_ptr so it is sure to be cleaned properly.

sapphonie commented 1 year ago

I almost wonder if we want to wait and see if the c++17 change gets merged, and then use std::filesystem ?

sapphonie commented 1 year ago

I almost wonder if we want to wait and see if the c++17 change gets merged, and then use std::filesystem ?

Plus, i think it'd probably also be better to do this in a ctor instead of statically in SH

PeakKS commented 1 year ago

Did a bit more testing but it seems the best impl is still making the static master a unique_ptr and duping it. Should be all good now.

PeakKS commented 1 year ago

Nevermind with further testing I have found out that the position is inherent to the file descriptor and so dup does not solve this. A thread_local unique_ptr with fseek is a little faster at least...

Elapsed CPU Time (CURRENT) = '4.53125' sec.
Elapsed CPU Time per Iteration (CURRENT, 500000) = '9.062500e-06' sec.
Elapsed CPU Time (THREAD) = '3.86515' sec.
Elapsed CPU Time per Iteration (THREAD, 500000) = '7.730302e-06' sec.

But I'd like to find something better.

PeakKS commented 10 months ago

Closing due to the initial premise being broken. There’s some other ideas for optimization but they should be their own PRs.