zhblue / hustoj

Popular Open Source Online Judge based on PHP/C++/MySQL/Linux for ACM/ICPC and NOIP training, with easy installation. Open-source OJ system.
http://www.hustoj.com/?cat=2
GNU General Public License v2.0

initializing `topmemory` #254

Closed amir-s closed 6 years ago

amir-s commented 6 years ago

Hello!

I'm in the process of developing a custom sandboxing library with JavaScript bindings, and I'm trying to borrow some of the brilliant ideas from HUSTOJ.

I don't really understand why topmemory gets initialized to the "VmRSS" value here: https://github.com/zhblue/hustoj/blob/afbc318f7475b2a518603e7b4e523e2932dc3094/trunk/core/judge_client/judge_client.cc#L2027-L2028

If the judge process itself consumes a lot of memory, then that initial value could be higher than the memory limit and result in "Memory Limit Exceeded". Wouldn't it be safer to leave it at 0 initially and use "VmPeak" to get the memory usage?
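For readers unfamiliar with these fields: "VmRSS", "VmSize" and "VmPeak" are lines in `/proc/<pid>/status`, each reported in kB. The following is a minimal sketch of how such a field could be read, not HUSTOJ's actual code; the helper names `status_field_kb` and `proc_status_kb` are hypothetical and introduced only for illustration.

```cpp
#include <fstream>
#include <istream>
#include <sstream>
#include <string>
#include <sys/types.h>

// Hypothetical helper (not HUSTOJ's code): scan text in /proc/<pid>/status
// format and return the value of `key` in kB, or -1 if the field is absent.
long status_field_kb(std::istream& in, const std::string& key) {
    const std::string prefix = key + ":";
    std::string line;
    while (std::getline(in, line)) {
        if (line.compare(0, prefix.size(), prefix) == 0) {
            std::istringstream rest(line.substr(prefix.size()));
            long kb = 0;
            if (rest >> kb) return kb;  // the trailing "kB" unit is ignored
            return -1;
        }
    }
    return -1;
}

// Hypothetical wrapper: read one field for a given process, -1 on failure.
long proc_status_kb(pid_t pid, const std::string& key) {
    std::ifstream f("/proc/" + std::to_string(pid) + "/status");
    return f ? status_field_kb(f, key) : -1;
}
```

With this, `proc_status_kb(pid, "VmRSS")` and `proc_status_kb(pid, "VmPeak")` would give the two values being compared in this issue.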

Thanks a lot.

zhblue commented 6 years ago

That code was written a long time ago, so I'm not sure about the situation you mention. As I recall, VmPeak gives the peak of VmSize, which can be much larger than VmRSS. Because Linux has the COW [https://en.wikipedia.org/wiki/Copy-on-write] mechanism, code like `char *buf = malloc(100*1024*1024*sizeof(char));` can make VmSize and VmPeak go high while VmRSS stays low. Many ACM/ICPC participants like to malloc a large VmSize while using only part of it. It's more convenient to use VmRSS than to explain why their code gets MLE [ :P ]
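The effect described above can be observed from inside a process. The sketch below is an illustration under assumptions, not code from HUSTOJ: it assumes Linux with `/proc` mounted, and `self_status_kb`, `Growth`, `reserve_and_touch` and `g_keep` are hypothetical names. It reserves a large buffer, writes to only part of it, and reports how much VmSize and VmRSS grew.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <fstream>
#include <sstream>
#include <string>

// Read one field (in kB) from /proc/self/status; -1 if it cannot be read.
static long self_status_kb(const std::string& key) {
    std::ifstream f("/proc/self/status");
    const std::string prefix = key + ":";
    std::string line;
    while (std::getline(f, line)) {
        if (line.compare(0, prefix.size(), prefix) == 0) {
            std::istringstream rest(line.substr(prefix.size()));
            long kb = 0;
            if (rest >> kb) return kb;
        }
    }
    return -1;
}

struct Growth {
    bool ok;         // false if /proc was unavailable or malloc failed
    long vmsize_kb;  // growth of VmSize (reserved address space)
    long vmrss_kb;   // growth of VmRSS (pages actually resident)
};

// Storing the pointer in a volatile global keeps the compiler from
// optimizing the allocation away.
static char* volatile g_keep = nullptr;

// Reserve `total` bytes but write to only the first `touched` bytes.
Growth reserve_and_touch(std::size_t total, std::size_t touched) {
    const long size0 = self_status_kb("VmSize");
    const long rss0 = self_status_kb("VmRSS");
    char* buf = static_cast<char*>(std::malloc(total));
    if (buf == nullptr || size0 < 0 || rss0 < 0 || touched > total) {
        std::free(buf);
        return {false, 0, 0};
    }
    g_keep = buf;
    std::memset(buf, 1, touched);  // only these pages become resident
    const long size1 = self_status_kb("VmSize");
    const long rss1 = self_status_kb("VmRSS");
    g_keep = nullptr;
    std::free(buf);
    return {true, size1 - size0, rss1 - rss0};
}
```

Calling `reserve_and_touch(100u * 1024 * 1024, 1u * 1024 * 1024)` would typically show VmSize growing by roughly the full reservation while VmRSS grows by roughly the touched part, although the exact numbers depend on the allocator and kernel settings.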

amir-s commented 6 years ago

> can make VmSize and VmPeak go high while VmRSS stays low. Many ACM/ICPC participants like to malloc a large VmSize while using only part of it. It's more convenient to use VmRSS than to explain why their code gets MLE [ :P ]

But the code does both: it initializes `topmemory` with VmRSS, but then takes the max value from VmPeak. If VmPeak is the max value, shouldn't we always stick to that instead of VmRSS? Or, if we want to be smart and count only memory that is actually used as "consumed", shouldn't we always stick to VmRSS and forget about VmPeak?

However, the real problem could be something else. When we fork() the process to run the solution in the child, by the time the parent reads the "VmRSS" value, the child could still be preparing for the exec, so it is still running with the same memory as the parent process. I could be wrong, but I think the parent process should read "VmRSS" only after we know for sure that the child has exec'ed the solution.

What could go wrong if we moved this `if` (https://github.com/zhblue/hustoj/blob/afbc318f7475b2a518603e7b4e523e2932dc3094/trunk/core/judge_client/judge_client.cc#L2027-L2028) to after the `wait4` in the `while` loop (https://github.com/zhblue/hustoj/blob/afbc318f7475b2a518603e7b4e523e2932dc3094/trunk/core/judge_client/judge_client.cc#L2029-L2032)?

That way we can make sure that when we read VmRSS, it belongs to the solution.
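One way to know for sure that the child has exec'ed before sampling its memory is sketched below. It assumes the child requests tracing with `ptrace(PTRACE_TRACEME, ...)` before calling exec, so that a successful exec stops it with SIGTRAP and the parent's first wait returns at that point. This is an illustration of the idea, not the code linked above; `status_kb` and `rss_after_exec_kb` are hypothetical names, and the sketch kills the child right after sampling instead of letting it run.

```cpp
#include <csignal>
#include <fstream>
#include <sstream>
#include <string>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Read one field (in kB) from /proc/<pid>/status; -1 if it cannot be read.
static long status_kb(pid_t pid, const std::string& key) {
    std::ifstream f("/proc/" + std::to_string(pid) + "/status");
    const std::string prefix = key + ":";
    std::string line;
    while (std::getline(f, line)) {
        if (line.compare(0, prefix.size(), prefix) == 0) {
            std::istringstream rest(line.substr(prefix.size()));
            long kb = 0;
            if (rest >> kb) return kb;
        }
    }
    return -1;
}

// Start `path` under ptrace and sample VmRSS at the exec stop, i.e. after the
// child's memory image has been replaced by the new program.
// Returns the value in kB, or -1 on any failure.
long rss_after_exec_kb(const char* path) {
    const pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        // Child: at this point it is still a copy of the parent. Ask to be
        // traced so that a successful exec stops us with SIGTRAP.
        if (ptrace(PTRACE_TRACEME, 0, nullptr, nullptr) == -1) _exit(127);
        execl(path, path, static_cast<char*>(nullptr));
        _exit(127);  // exec failed
    }
    int status = 0;
    if (waitpid(pid, &status, 0) != pid) return -1;
    long rss = -1;
    if (WIFSTOPPED(status)) {
        // Only a SIGTRAP stop tells us the exec has succeeded.
        if (WSTOPSIG(status) == SIGTRAP) rss = status_kb(pid, "VmRSS");
        kill(pid, SIGKILL);  // the sketch ends here; a judge would continue the child
        waitpid(pid, &status, 0);
    }
    return rss;
}
```

A value read this way belongs to the exec'ed program rather than to the forked copy of the parent. A real judge would resume the child (for example with `PTRACE_CONT`) and keep sampling in its wait loop.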

zhblue commented 6 years ago

You are right. I'll make this adjustment after a few tests on my test server.

zhblue commented 6 years ago

It appears that these two lines are not needed at all, so I removed them. Thank you!

amir-s commented 6 years ago

Great! Yeah I think VmPeak would do fine and there's no need to read the VmRSS!

I'm still doing some testing on my part as well. I'll let you know if I find anything interesting!

Thanks a lot :)