FoldingAtHome / fah-issues


Add an option to disable a GPU or the CPU for FAH. #1605

Open bb30994 opened 3 years ago

bb30994 commented 3 years ago


Is your feature request related to a problem?

Low-end systems lose CPU threads unnecessarily. Every GPU demands a CPU thread, even if I have no desire to fold with that GPU. This happens with Intel iGPs and with low-end GPUs, which add unnecessary heat to some laptops but also to other systems. For example:

1) The iGP reduces the CPU thread count by 1, even if the results produced by the GPU are only slightly higher.
2) Some systems may be ideal for CPU folding and may overheat with a GT 730m, which produces negative creditable results if the system folds <10 hrs per day.
3) A Core2Duo with two low-end GPUs (e.g. CUDA-supported GT710s) cannot be configured because the CPU:1 slot is configured first and there's no thread to support the second GPU.

In each case, I would rather manually create a configuration for my system WITHOUT the GPU or without the CPU slot than have it forced upon me.

----

## Describe the Feature

Create a DISABLE setting which can remove a GPU or a CPU from consideration by the FAHClient set-up process. With an assortment of weak features, I should be able to reallocate both the weakest and the stronger features. Recently a slot for the Intel IGP has been created even if the GPU was not supported. FoldingForum suggested that it could be configured to PAUSE-ON-START as a method of disabling it, but it still reduces the CPU count by one thread. A system with two GT710s (CUDA supported) and a Core2Duo would be more useful if I could configure it with 2 GPU slots.

----

## Context
Sandman192 commented 2 years ago

Removing a CPU or GPU slot used to work. But if you do, you can't restart your client, or it will add that slot right back in.

They say it's a feature. I call it resetting your settings without ever letting you know until you check your client.

xenek commented 1 year ago

I'd also like this feature. When using a laptop, often you don't use any GPUs it may have. Yet the combined cooling features - heatpipes that share fans or are connected to both pieces of silicon mean that CPU and GPU folding significantly add to heat production.

Background: Gaming laptops are far more common today, and many business computers come with dedicated GPUs that are underutilized. Also, when using mobile computers, and no longer owning desktops, people are not likely to even try using any software that can overload a laptop or notebook.

It's very useful to me to be able to stop CPU folding but leave GPU folding running on the two GPUs, so that the CPU is entirely free for other tasks - or even other distributed computing projects.

So, it's really handy if that feature is a 'user toggle' because it means I can manage the laptop usage state trivially, making it far more dual-purpose, and meaning I can self-manage the folding which is essential when I'm often sleeping it, hibernating it, traveling with it, etc.

Once set up somewhere for a while on AC power, you can turn on folding on the GPUs, and when the machine is not in use overnight you can turn on folding on both CPU and GPUs. But when it is in use during the day, or being transported and run on battery as you travel with it, you want both CPU and GPU folding paused, without any automatic re-enabling if you happen to have to restart the notebook or laptop, or shut it down for safety reasons before putting it in a laptop bag. I see this as a safety risk matter as well: the client is risky if it boots straight into CPU and GPU folding after a crash, or if the machine is accidentally powered on while in a backpack or briefcase.

You could even have the FAH client ask during installation: "Is this a portable computer (a notebook, laptop or ultrabook)?" If the user answers 'yes', the FAH client defaults to 'never automatically start on reboot', for legal, safety and fire hazard reasons.

Laptops or notebooks often have two GPUs, one dedicated and one integrated, as is common on consumer gaming laptops or notebooks, or even higher-powered business laptops, notebooks and ultrabooks today.

So having a way to enable technicians, salespeople and anyone else who talks about using computing equipment for distributed compute activity to be completely confident that a user is guided by solid safety and functionality features is, I think, of high value to the FAH program and to any other computing that uses the silicon to its maximum potential, generating heat and consuming electricity rapidly. Defaulting to off, being able to toggle only GPU folding on, and being able to toggle CPU folding on separately are all great features.

Lastly, because there is a risk of dropping project compute submissions when the submission deadline is exceeded, I think that being able to see the effect of a schedule is important. E.g. I might run CPU folding three nights a week (Wed/Thu/Fri), only when the sound from the fans isn't disturbing people or something. In that imaginary scenario, if I'm only allowing a total of 6 hours a night, the sum of 18 hours of compute each week might not be enough to get the work unit completed. So having a small statistic appear for work units that are 'already queued' will help me make decisions about how to structure activity so that a work unit can be submitted in time.

E.g. I'm part way through a work unit. I configure CPU folding to Off, and configure a reminder schedule for a popup to turn CPU folding on, where I select Wed/Thu/Fri and 11 pm to 5 am. But a warning number or exclamation mark appears in the GUI or a popup, showing that the 'notification schedule' I set doesn't allow the FAH client time to complete the work unit that is 'in queue', that 'has begun', or where 'the unit data has been downloaded'.
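
To make that check concrete, here is a rough sketch in Python of the arithmetic such a warning could be based on. Everything in it is an assumption for illustration: the function names are made up, the deadline and the remaining compute hours are typed in by hand rather than read from the FAH client, and weekdays use Python's numbering (Monday = 0).

```python
from datetime import datetime, timedelta

def scheduled_hours(now, deadline, nights, start_hour=23, duration_hours=6):
    """Count the folding hours a weekly schedule provides between now and deadline.

    nights: set of weekday numbers (Monday=0) on which a folding window opens.
    A window that opens at 11 pm runs past midnight into the next day.
    """
    total = timedelta()
    # Start one day early so a window that opened last night is not missed.
    day = now.replace(hour=0, minute=0, second=0, microsecond=0) - timedelta(days=1)
    while day < deadline:
        if day.weekday() in nights:
            window_start = day + timedelta(hours=start_hour)
            window_end = window_start + timedelta(hours=duration_hours)
            # Clip the window to the interval between now and the deadline.
            overlap = min(window_end, deadline) - max(window_start, now)
            if overlap > timedelta():
                total += overlap
        day += timedelta(days=1)
    return total.total_seconds() / 3600

def schedule_is_sufficient(remaining_hours, now, deadline, nights, **window):
    """True if the schedule offers at least the hours the work unit still needs."""
    return scheduled_hours(now, deadline, nights, **window) >= remaining_hours

if __name__ == "__main__":
    now = datetime(2024, 1, 1)        # a Monday, chosen only as an example
    deadline = datetime(2024, 1, 8)   # hypothetical work unit deadline
    wed_thu_fri = {2, 3, 4}
    hours = scheduled_hours(now, deadline, wed_thu_fri)
    print(f"Schedule provides {hours} h before the deadline")
    if not schedule_is_sufficient(20, now, deadline, wed_thu_fri):
        print("Warning: schedule does not leave enough time for this work unit")
```

With the 11 pm to 5 am window on three nights, one week yields the 18 hours mentioned above, so a work unit that still needs 20 hours of CPU time would trigger the warning.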

This is a very difficult feature to implement, substantially increasing the complexity of the client. It might be that the client isn't really ideal for such coding. So perhaps a wrapper or independent scheduling app that does the reminders could be suggested: one that uses an API connection to read the FAH client data but never writes to the FAH client, and instead simply pops up to make useful suggestions on how to manually manage the work units, so that they don't expire, creating delays to projects or wasting compute time that isn't contributed because it came too late.

If anyone has any ideas on the type of auxiliary software or timer client that could read the FAH data to augment the existing client, and that is found to work, that would be great, as it means the FAH developers could perhaps get away with only adding a simple 'disable CPU folding' option.

Lastly - one of the GPUs is on-socket or on-die using the same heatpipe and fan as the CPU, so sometimes I might want to run only the integrated GPU, or only the dedicated GPU, so that whole 'full control over the individual silicon chips' is really important to me, to help balance the situation I have at any time. I may want to use the dedicated GPU for a work-related task, along with the CPU, but be happy to let the FAH client run folding on the integrated GPU, as it's essentially idle or underutilized.

All this is important as it relates to usability and heat production on what are devices that require early servicing of blocked air vents and that are portable and often are heavily used as they are not racked or left at home while someone is at work. Etc.