ray-project / ray

Ray is a unified framework for scaling AI and Python applications. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
https://ray.io
Apache License 2.0

[RLlib] GPU cannot be enabled #42388

Open Blanchard-Zhu opened 8 months ago

Blanchard-Zhu commented 8 months ago

Training is slow, and the trial appears to stay pending, possibly because the GPU is not being used. Could this be caused by excessive memory requirements?

Here is the code.

    # Assumes Legalization, train_benchmarks, and test_benchmarks are defined elsewhere in the project.
    from ray import train, tune
    from ray.rllib.algorithms.ppo import PPOConfig

    config = (
        PPOConfig()
        .environment(Legalization, env_config={"designs": train_benchmarks})
        .framework("torch")
        .rollouts(num_rollout_workers=1)
        .resources(num_gpus=1)
        .training(model={"conv_filters": [[32, [50, 5], 2], [64, [50, 5], 3], [128, [50, 5], 3], [1024, [250, 25], 1]]})
        .evaluation(
            evaluation_interval=1,
            evaluation_config=PPOConfig.overrides(
                env_config={"designs": test_benchmarks},
            ),
        )
    )

    stop_config = {
        "training_iteration": 1000,
        "episode_reward_mean": -1,
    }

    tuner = tune.Tuner(
        "PPO",
        param_space=config,
        run_config=train.RunConfig(stop=stop_config),
    )
    results = tuner.fit()
    best_result = results.get_best_result(metric="episode_reward_mean", mode="max")
    best_checkpoint = best_result.checkpoint

The following is the log.

    Trial status: 1 PENDING
    Current time: 2024-01-14 08:33:15. Total running time: 42s
    Logical resource usage: 2.0/16 CPUs, 1.0/1 GPUs (0.0/1.0 accelerator_type:G)

Versions / Dependencies

ray: 2.9.0
torch: 2.1.2

Reproduction script

    # Assumes Legalization, train_benchmarks, and test_benchmarks are defined elsewhere in the project.
    import ray
    from ray import train, tune
    from ray.rllib.algorithms.ppo import PPOConfig

    ray.init(num_gpus=1)

    config = (
        PPOConfig()
        .environment(Legalization, env_config={"designs": train_benchmarks})
        .framework("torch")
        .rollouts(num_rollout_workers=1)
        .resources(num_gpus=1)
        .training(model={"conv_filters": [[32, [50, 5], 2], [64, [50, 5], 3], [128, [50, 5], 3], [1024, [250, 25], 1]]})
        .evaluation(
            evaluation_interval=1,
            evaluation_config=PPOConfig.overrides(
                env_config={"designs": test_benchmarks},
            ),
        )
    )

    stop_config = {
        "training_iteration": 1000,
        "episode_reward_mean": -1,
    }

    tuner = tune.Tuner(
        "PPO",
        param_space=config,
        run_config=train.RunConfig(stop=stop_config),
    )

    results = tuner.fit()
    best_result = results.get_best_result(metric="episode_reward_mean", mode="max")
    best_checkpoint = best_result.checkpoint

Issue Severity

High: It blocks me from completing my task.

sven1977 commented 7 months ago

Hey @zhubenchao, thanks for raising this issue. In your output, does the "PENDING" ever turn into "RUNNING"? If not, the job has not even started and is possibly blocked by something else; maybe the GPU is not available or is already held by another process. If there were no GPU on your machine at all, though, you would see a Ray error, so that does not seem to be the case.
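
A minimal sketch (not from the original thread) of how one could check, from the same environment as the reproduction script, whether the GPU is visible to both PyTorch and Ray before launching the trial:

    import ray
    import torch

    # Check whether PyTorch can see a CUDA device at all.
    print("torch.cuda.is_available():", torch.cuda.is_available())

    # Check what resources Ray registers for the cluster and what is still free.
    ray.init()
    print("Cluster resources:  ", ray.cluster_resources())
    print("Available resources:", ray.available_resources())
    ray.shutdown()

If the GPU shows up in ray.cluster_resources() but not in ray.available_resources(), something else in the cluster is already holding it; nvidia-smi can show which process is using the device.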