microsoft / nni

An open source AutoML toolkit for automating the machine learning lifecycle, including feature engineering, neural architecture search, model compression and hyper-parameter tuning.
https://nni.readthedocs.io
MIT License

Local GPU Allocation #5639

Open busFred opened 1 year ago

busFred commented 1 year ago

Describe the issue: Currently, in local mode, users can only constrain GPU resources through useActiveGpu, maxTrialNumberPerGpu, trialGpuNumber, and gpuIndices. However, modern GPUs have very large memory, and many workstations have multiple GPUs. It would be beneficial to allow fractional GPU resource allocation, e.g. running 10 tasks with the first 5 using only the first GPU and the other 5 using only the second GPU. This feature would be similar to Ray Tune's fractional resources.
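
For reference, the closest approximation with the existing options named above is probably something like the following. This is only a sketch, not a tested configuration: the values are illustrative, required fields such as the trial command and search space are omitted, and it assumes the local scheduler spreads trials across the listed GPUs, which may not produce an exact 5/5 split. It caps how many trials share a GPU but gives no control over the fraction of GPU memory or compute each trial gets.

```yaml
# Sketch only: illustrative values, not a verified setup.
trialConcurrency: 10        # run up to 10 trials at once
trialGpuNumber: 1           # each trial asks for one GPU

trainingService:
  platform: local
  useActiveGpu: true        # allow GPUs that already have running processes
  gpuIndices: [0, 1]        # restrict trials to the first two GPUs
  maxTrialNumberPerGpu: 5   # let up to 5 trials share each GPU
```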

Environment:

Configuration:

trial_concurrency: 20
max_trial_number: 30

training_service:
  platform: local
  useActiveGpu: True
  gpuIndices: 0

Log message: Doesn't matter here.

How to reproduce it?: