[Server] Scheduler: estimate GPU app needs better

RichardHaselgrove commented 5 years ago

Describe the problem

It's now 2019, and that marks 10 years since the introduction of GPU computing in BOINC. (Technically, the launch was at the end of 2008, but the first fully-working GPU application - v6.08 for SETI@Home - wasn't available until 15 January 2009.) Since then, we've grown to accept different makes of GPUs and different programming languages, and the highest speed GPUs are orders of magnitude faster than the early devices. But I think we're still using some of the early assumptions about their capabilities.

In particular, different GPU types, and different programming languages, place different demands on the host computer. A visual example:

[Image: cpu efficiency]

NumberFields is a CPU-only application, running at typically 97% CPU usage
Einstein is an OpenCL application for Intel_gpu, requiring less than 3% CPU support
GPUGrid is a CUDA application for NVidia GPU, requiring over 30% CPU
SETI is an OpenCL application for NVidia, requiring over 97% CPU - as much as the pure CPU apps

But we don't even attempt to track and measure those CPU support requirements. Code like https://github.com/BOINC/boinc/blob/0cc45a13950aad6181320b870d4e257f96fb4aec/sched/sched_customize.cpp#L505 still relies on assumptions about the relative speeds of CPU and GPU devices, and the proportion of the work to be done on each device. As speeds have diverged, these assumptions have become less and less realistic. The medium-use GPUGrid case above emerges from the server as

    <plan_class>cuda80</plan_class>
    <avg_ncpus>0.975476</avg_ncpus>
    <max_ncpus>0.975476</max_ncpus>
    <flops>131171048789.023773</flops>
    <coproc>
        <type>CUDA</type>
        <count>1.000000</count>
    </coproc>

demanding as much CPU reservation as the worst-case OpenCL application.

Describe the solution you'd like

Track and use actual measurements of application performance on individual computers.

We already track equivalent metrics in the database tables app_version and host_app_version. And we receive reports of elapsed time and CPU time with every completed task. All we need to do is to store, update, and use the running average ratio of those values when issuing new work. The existing system needs to be kept as a fallback for edge cases like newly attached hosts, new app versions, and non-standard clients which don't report all current fields.
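To make the idea concrete, here is a minimal sketch in Python rather than the server's own C++, and not taken from the BOINC code. It keeps a running average of the CPU time / elapsed time ratio for one host and app version, and returns the existing server estimate until enough results have come in. The class name, the smoothing weight `alpha` and the `min_samples` cut-off are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class CpuFractionTracker:
    """Running average of cpu_time / elapsed_time for one (host, app version) pair.

    Hypothetical helper for illustration; none of these names exist in BOINC.
    """
    alpha: float = 0.1      # assumed weight given to each new sample
    min_samples: int = 10   # assumed number of results needed before trusting the average
    average: float = 0.0
    samples: int = 0

    def update(self, cpu_time: float, elapsed_time: float) -> None:
        """Fold one completed task's reported times into the running average."""
        if elapsed_time <= 0 or cpu_time < 0:
            return  # skip results that don't report usable times
        ratio = cpu_time / elapsed_time
        if self.samples == 0:
            self.average = ratio  # first sample seeds the average
        else:
            self.average += self.alpha * (ratio - self.average)
        self.samples += 1

    def avg_ncpus(self, fallback: float) -> float:
        """Return the measured fraction, or the fallback estimate for new hosts/apps."""
        if self.samples < self.min_samples:
            return fallback
        return self.average
```

With a tracker like this, a GPUGrid host reporting roughly 30 s of CPU time per 100 s elapsed would settle near 0.3 after enough results, instead of the 0.975476 shown above.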

Additional context

This would be particularly useful at SETI@Home, where separate CUDA and OpenCL app_versions are available, each capable of processing the same tasks on the same hardware. At the moment, all app_versions are allocated the same value, although their needs are very different.

In particular, the OpenCL app for NVidia performs poorly unless a full CPU core is kept clear of CPU apps while it is running: the client doesn't do this even for fractional values as high as 97% - and nor should it. Instead, if the actual measured average recorded on the server is over, say, 95% (precise value tbc), it should be rounded up to 1.00 to give the appropriate direction to current clients not to schedule that last CPU core.
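A rough sketch of that rounding rule, in Python for brevity and not taken from the BOINC code. The function name is hypothetical, and 0.95 is only the placeholder value mentioned above, since the precise threshold is still to be confirmed.

```python
FULL_CORE_THRESHOLD = 0.95  # assumed placeholder; the precise value is tbc


def reserved_ncpus(measured_fraction: float,
                   threshold: float = FULL_CORE_THRESHOLD) -> float:
    """Round a measured CPU fraction up to a full core when it is over the threshold.

    Hypothetical helper for illustration only.
    """
    if threshold < measured_fraction < 1.0:
        return 1.0  # tell the client to keep a whole core free
    return measured_fraction  # lighter (or already >= 1.0) values pass through unchanged
```

So a measured 0.97 for the OpenCL app would be sent out as 1.00, while a measured 0.30 for the CUDA app would be sent as it stands.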

KeithMyers commented 5 years ago

I also wanted to point out that for GPUGrid, CPU usage is 100% for GPU tasks if the user has set SWAN_SYNC in the environment. Most of the GPUGrid heavy hitters run with that parameter set to speed up computation. So any estimate would have to cover two kinds of GPUGrid users: the stock, set-nothing type and the max-performance enthusiasts.

RichardHaselgrove commented 5 years ago

This issue is intended to benefit the majority of users who simply accept BOINC work as it comes, and don't get involved in manual tuning. If a user chooses to set an environment variable and succeeds (many don't, or need help understanding the exact format required), then they can also set up an app_config to match - and any volunteer who suggests one should suggest both.

Mind you, if setting SWAN_SYNC results in a recorded CPU time of 95% or more, then this proposal would automatically result in a full CPU reservation for that host only, provided the server update was backported to GPUGrid. They lack experienced BOINC support in their server ops area, and are reluctant to upgrade beyond their current server v6.13.

[Server] Scheduler: estimate GPU app needs better #2949