FoldingAtHome / fah-issues


Better Management of GPUs.txt File #1504

Open PantherX opened 4 years ago

PantherX commented 4 years ago

Is your feature request related to a problem?

The problem is that the GPUs.txt file is used to assign work units (WUs) to GPUs. It was a good system about 10 years ago, when GPUs were easier to manage. Today, GPUs are harder to classify, and the variation within a single generation (GPU architecture) is significant, which makes effective WU allocation difficult. This is not ideal for researchers, who have varying needs for their projects, or for donors, who have varying needs for their hardware.


Describe the Feature

Rather than the servers using the GPUs.txt file, we would instead use a GPUs.JSON file. It can carry additional key/value pairs to support smarter assignment decisions, and it would be easier to update, maintain, and classify GPUs as new models are released. A potential layout of the JSON file would be:

Vendor: Nvidia
VendorID: 10DE
Name: GTX 1080 Ti
DeviceID: 1B06
Status: Supported
Architecture: Pascal
Chip: GP102
VRAM: 11
OpenCL: 1.2
CUDA: 6.1
FP32: 11,340
FP64: 354
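
As a rough sketch, the layout above could be serialized as one JSON object per GPU. The key names come from the list above; the value types are assumptions (IDs and version numbers kept as strings, VRAM and FP32/FP64 as plain numbers), and `to_gpus_json` is a hypothetical helper name.

```python
import json

# One entry using the field names from the proposed layout.
# Value types are assumptions: hex IDs and versions as strings,
# VRAM / FP32 / FP64 as plain numbers.
entry = {
    "Vendor": "Nvidia",
    "VendorID": "10DE",
    "Name": "GTX 1080 Ti",
    "DeviceID": "1B06",
    "Status": "Supported",
    "Architecture": "Pascal",
    "Chip": "GP102",
    "VRAM": 11,
    "OpenCL": "1.2",
    "CUDA": "6.1",
    "FP32": 11340,
    "FP64": 354,
}


def to_gpus_json(entries):
    """Serialize entries with sorted keys so that identical content
    always produces identical text (useful for checksumming)."""
    return json.dumps(entries, indent=2, sort_keys=True)


text = to_gpus_json([entry])
```

Keeping the version fields as strings avoids the floating-point pitfall where a version such as 1.10 would compare as smaller than 1.2.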

The client can be modified to handle the GPUs.JSON file and to verify that its SHA256 checksum matches the one the server has. This ensures that:

  1. Any local changes will be overridden by the single source of truth. No local changes will persist, which prevents unexpected behavior on the client side.
  2. Rather than downloading a GPUs.txt file every 30 days or manually forcing a download, the client will only download the file when it has changed.
  3. Clear presentation of information to make informed decisions without compromises.
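
The checksum comparison could look roughly like the sketch below. It assumes the server publishes the SHA256 digest as a hex string; `needs_refresh` is a hypothetical helper name, and how the client obtains the server's digest is left out.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA256 digest of data as a lowercase hex string."""
    return hashlib.sha256(data).hexdigest()


def needs_refresh(local_bytes, server_checksum: str) -> bool:
    """Decide whether the client should download GPUs.JSON again.

    local_bytes is the content of the local copy, or None if no copy
    exists. Any mismatch (including a locally edited file) triggers a
    download, so the server copy stays the single source of truth.
    """
    if local_bytes is None:
        return True
    return sha256_hex(local_bytes) != server_checksum.strip().lower()
```

With this approach an unchanged file costs only one small checksum exchange, and a locally modified file is replaced on the next check.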

Context

The GPUs.JSON file will allow a much better experience for the donor and the researcher, as it allows fine-grained control over which GPUs to target using the actual data that matters to researchers. For example: Assign Project X to GPUs where VendorID == 10DE OR VendorID == 1002 && (VRAM > 5 && OpenCL == 1.2 && CUDA >= 5.0 && FP64 >= 128)
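
The example rule above could be evaluated along these lines. This is only a sketch: it assumes the rule is meant as "either vendor, and all of the capability checks" (the original expression does not say how OR and && group), that versions are stored as strings, and that every entry carries all fields. `version` and `eligible_for_project_x` are hypothetical names.

```python
def version(text: str) -> tuple:
    """Parse a dotted version such as "6.1" into a comparable tuple (6, 1)."""
    return tuple(int(part) for part in text.split("."))


def eligible_for_project_x(gpu: dict) -> bool:
    """Apply the example rule to one GPUs.JSON entry.

    Assumed grouping: (VendorID is 10DE or 1002) AND (capability checks).
    """
    vendor_ok = gpu["VendorID"] in ("10DE", "1002")
    capable = (
        gpu["VRAM"] > 5
        and version(gpu["OpenCL"]) == (1, 2)
        and version(gpu["CUDA"]) >= (5, 0)
        and gpu["FP64"] >= 128
    )
    return vendor_ok and capable


# Sample entry built from the GTX 1080 Ti values in the proposed layout.
gtx_1080_ti = {
    "VendorID": "10DE", "VRAM": 11, "OpenCL": "1.2",
    "CUDA": "6.1", "FP64": 354,
}
```

Here `eligible_for_project_x(gtx_1080_ti)` evaluates to `True`, since the card meets every threshold in the rule.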

This will unlock a lot of potential with existing and new donors and allow researchers to bring bleeding-edge science without any negative impact or worries.

The server-side code doesn't have to remember all the details of how to process the GPUs, just the logic expansion, the handling of the GPUs.JSON file, and the generation of its checksum.

Client code can be updated to only add a GPU slot if Status == Supported and OS != macOS. If the donor attempts to add a GPU where Status == Unsupported, a friendly message appears saying that the GPU is unsupported and to contact the Forum for additional information.
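
A minimal sketch of that client-side check might look like this. The function name, the way the OS name is passed in, and the message wording are all placeholders, not existing client behavior.

```python
def check_gpu_slot(gpu: dict, os_name: str):
    """Return (allowed, message) for a donor's request to add a GPU slot.

    The message text is placeholder wording for illustration only.
    """
    if gpu.get("Status") != "Supported":
        return (
            False,
            "This GPU is unsupported. Please contact the Forum "
            "for additional information.",
        )
    if os_name == "macOS":
        return (False, "GPU slots cannot be added on macOS.")
    return (True, "")
```

Returning a message alongside the decision lets the client show the donor a clear reason instead of silently refusing the slot.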

Donors get a much better experience when configuring their systems.

This solution has the potential to solve all these issues in a much more meaningful and efficient manner:

https://github.com/FoldingAtHome/fah-issues/issues/1479
https://github.com/FoldingAtHome/fah-issues/issues/1452
https://github.com/FoldingAtHome/fah-issues/issues/1390
https://github.com/FoldingAtHome/fah-issues/issues/1361
https://github.com/FoldingAtHome/fah-issues/issues/1271
https://github.com/FoldingAtHome/fah-issues/issues/1258
https://github.com/FoldingAtHome/fah-issues/issues/1220


bb30994 commented 4 years ago

Enhancement request #2: Please add a message in GPUs.txt saying this file is a local cache version and cannot be modified.