FoldingAtHome / fah-client-bastet

Folding@home client, code named Bastet
GNU General Public License v3.0

Add option to control checkpoint rate #247

Open Wazzzzuzp opened 1 month ago

Wazzzzuzp commented 1 month ago

Hi Devs,

Any chance of getting a GPU checkpoint option added to the cores, either to turn checkpointing off or to extend the interval to x frames?

With my 4090, folding time per frame is normally in the 9-30 second range. Having the core stop to perform a checkpoint that takes as long as a frame to complete seems wasteful.

Thanks

[image]

muziqaz commented 1 month ago

This is set by project owners, not the client. Users with a 4090 or other ultra-high-end cards are an extreme minority, and it is not very helpful to the projects to cater to that minority :) Reducing the frequency of checkpoints would hinder scientific progress, because those with slower cards would lose a lot of progress if they paused or switched off folding just before a checkpoint is written. Remember, your 9-second TPF is equal to minutes or tens of minutes on cards in other performance tiers. Checkpoints are also not time-based: 5% for you is 45 s, but for others it might be half an hour. If they start losing half an hour of science, we are going backwards.
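To make the arithmetic above concrete, here is a minimal sketch that converts a percentage-based checkpoint interval into wall-clock time. It assumes one frame is 1% of a work unit, which matches the 9 s TPF and 45 s figures quoted above. The function name and the 360 s TPF for a slower card are hypothetical, chosen only to illustrate the "half an hour" case.

```python
def checkpoint_interval_seconds(tpf_seconds, checkpoint_percent):
    """Wall-clock time between checkpoints.

    Assumes one frame equals 1% of the work unit, so a checkpoint every
    `checkpoint_percent` percent happens every `checkpoint_percent` frames.
    """
    frames_between_checkpoints = checkpoint_percent
    return tpf_seconds * frames_between_checkpoints


# Fast card from the thread: 9 s per frame, checkpoint every 5%.
fast_card = checkpoint_interval_seconds(9, 5)

# Hypothetical slower card: 6 minutes (360 s) per frame, same 5% setting.
slow_card = checkpoint_interval_seconds(360, 5)

print(f"fast card: {fast_card} s between checkpoints")
print(f"slow card: {slow_card / 60:.0f} min between checkpoints")
```

The same project-wide percentage therefore means very different amounts of work at risk depending on the card, which is the point being made about slower hardware.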

Wazzzzuzp commented 1 month ago

So no different than folding on a CPU. However, the CPU units can set how often they checkpoint; I was after the same for the GPUs.

muziqaz commented 1 month ago

CPU checkpointing is different and is very fine-grained. Regardless of what you set for yourself, CPU folding will always restart very close to where you paused or stopped. OpenMM (GPU), I believe, does not have such functionality. Whatever the project owner (not a FAH dev) sets in their project is universal and cannot be changed on a per-client basis, at least with current client functionality. I believe that if this were possible, it would have been implemented in v7 a long time ago.

Wazzzzuzp commented 1 month ago

Would the better course of action be to add a feature request to OpenMM?

muziqaz commented 1 month ago

No, the better course of action is to be patient ;) It will be interesting to hear what Joe has to say, and I have also put the question to the GPU fahcore devs. We will see what comes out of it ;)

jcoffland commented 1 month ago

I don't think there's a problem here. The checkpointing is taking less than one second. This is apparent from the log.
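As a rough back-of-the-envelope check, the sketch below estimates what fraction of a work unit's wall-clock time goes to checkpointing. It is an illustration under stated assumptions, not a measurement: one frame is taken as 1% of the work unit, the 5% checkpoint interval is the example figure used earlier in the thread, and "less than one second" is treated as an upper bound of 1 s. The function name is hypothetical.

```python
def checkpoint_overhead(tpf_seconds, checkpoint_percent, checkpoint_seconds):
    """Fraction of total wall-clock time spent writing checkpoints.

    Assumes 100 frames per work unit (one frame = 1%) and one checkpoint
    every `checkpoint_percent` percent of the work unit.
    """
    compute_time = tpf_seconds * 100
    checkpoints = 100 / checkpoint_percent
    checkpoint_time = checkpoints * checkpoint_seconds
    return checkpoint_time / (compute_time + checkpoint_time)


# Checkpoint bounded at 1 s, 9 s TPF, checkpoint every 5%.
measured = checkpoint_overhead(9, 5, 1)

# Scenario from the original request: a checkpoint as long as a 9 s frame.
feared = checkpoint_overhead(9, 5, 9)

print(f"1 s checkpoints: {measured:.1%} of wall-clock time")
print(f"9 s checkpoints: {feared:.1%} of wall-clock time")
```

Under these assumptions the gap between the two scenarios is large, so the actual checkpoint duration in the log is what decides whether there is anything to gain.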

muziqaz commented 1 month ago

We had a little chat internally: the OpenMM checkpoint frequency is set by the researcher at the beginning of the project and is written into the core.xml file. I am not sure it would be ideal to have an option for users to alter those settings.

jcoffland commented 1 month ago

The benefit to cost, in terms of performance gained vs development effort, is not good.

Wazzzzuzp commented 1 month ago

> I don't think there's a problem here. The checkpointing is taking less than one second. This is apparent from the log.

Not the best example as this machine is running FAHClient from RAM.

Thanks all for entertaining the idea.

muziqaz commented 1 month ago

That would still need to be exposed within fahclient for you to do the change

Wazzzzuzp commented 1 month ago

> That would still need to be exposed within fahclient for you to do the change

Sorry, I don't follow.

muziqaz commented 1 month ago

Fahclient is the UI for users to adjust the settings. If fahclient does not have an actual option to change checkpoint frequency, you won't be able to change it ;)