python3 benchmark.py --suite gpt.perf_test_auto --shard-only
2023-10-18 02:28:19,917 INFO worker.py:1342 -- Connecting to existing Ray cluster at address: 172.17.0.2:6379...
2023-10-18 02:28:19,934 INFO worker.py:1528 -- Connected to Ray cluster.
Working on case: BenchmarkCase(batch_size=1024, model_config=GPTModelConfig(seq_len=1024, hidden_size=2048, num_layers=24, num_heads=32, vocab_size=51200), num_micro_batches=128, parallel_mode='load_solution', parallel_args=LoadSolutionParallelArgs(prefer_reduce_scatter=True, use_remat=True, num_auto_layers=6, forward_stage_layer_ids=[[0, 1, 2], [3, 4, 5]], submesh_physical_shapes=[(1, 2), (1, 2)], submesh_logical_shapes=[(2, 1), (2, 1)], submesh_autosharding_option_dicts=[{'force_batch_dim_to_mesh_dim': 0}, {'force_batch_dim_to_mesh_dim': 0}]))
2023-10-18 02:28:24,970 INFO worker.py:1342 -- Connecting to existing Ray cluster at address: 172.17.0.2:6379...
2023-10-18 02:28:24,983 INFO worker.py:1528 -- Connected to Ray cluster.
Process SpawnProcess-2:
Traceback (most recent call last):
File "/usr/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
self.run()
File "/usr/lib/python3.8/multiprocessing/process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "/workspace/v-leiwang3/alpa_workspace/alpa/benchmark/alpa/benchmark_one_case.py", line 144, in benchmark_and_write_to_namespace
result = benchmark_one_case_internal(*args, **kwargs)
File "/workspace/v-leiwang3/alpa_workspace/alpa/benchmark/alpa/benchmark_one_case.py", line 53, in benchmark_one_case_internal
result = benchmark_gpt_bert_2d_internal(
File "/workspace/v-leiwang3/alpa_workspace/alpa/benchmark/alpa/benchmark_one_case_gpt_bert.py", line 268, in benchmark_gpt_bert_2d_internal
method, grad_func = get_shard_parallel_method(benchmark_case, physical_mesh)
File "/workspace/v-leiwang3/alpa_workspace/alpa/benchmark/alpa/benchmark_parallel_utils.py", line 182, in get_shard_parallel_method
raise ValueError(f"Unsupported parallel mode: {parallel_mode}")
ValueError: Unsupported parallel mode: load_solution
It looks like the shard-only benchmark path only supports ShardParallelArgs and UniformParallelArgs. However, the GPT suite's parallel-case generation only provides get_search_cases and get_solution_case, so gpt.perf_test_auto yields load_solution cases that get_shard_parallel_method rejects.
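For context, here is a minimal sketch of the dispatch that appears to fail, reconstructed from the traceback above; it is not the actual Alpa source, and the mode strings other than "load_solution" are my assumptions:

```python
# Sketch only, reconstructed from the traceback (benchmark_parallel_utils.py,
# get_shard_parallel_method): the function seems to dispatch on
# benchmark_case.parallel_mode and to accept only the shard/uniform modes,
# so a case built by get_solution_case (parallel_mode="load_solution")
# falls through to the ValueError seen above.
def get_shard_parallel_method(benchmark_case, physical_mesh):
    parallel_mode = benchmark_case.parallel_mode
    if parallel_mode == "search":     # assumed mode string for the ShardParallelArgs path
        ...                           # build the shard-parallel method via auto-sharding search
    elif parallel_mode == "uniform":  # assumed mode string for the UniformParallelArgs path
        ...                           # build the shard-parallel method from a fixed mapping
    else:
        raise ValueError(f"Unsupported parallel mode: {parallel_mode}")
```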
Any solutions or concerns about this issue?
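One direction I have been considering, in case it helps the discussion, is hand-building a uniform case so the shard-only path accepts it instead of the load_solution case from get_solution_case. This is only a sketch: the import paths, the "uniform" mode string, and the UniformParallelArgs field names below are my assumptions from reading the suite format printed above, not verified against the current code.

```python
# Hypothetical workaround sketch, NOT verified against the Alpa source.
# BenchmarkCase/GPTModelConfig fields are taken from the case printed in the
# log above; UniformParallelArgs field names and the "uniform" mode string are
# assumptions.
from benchmark_parallel_utils import BenchmarkCase, UniformParallelArgs  # assumed module
from suite_manual_gpt import GPTModelConfig                              # assumed module

uniform_case = BenchmarkCase(
    batch_size=1024,
    model_config=GPTModelConfig(seq_len=1024, hidden_size=2048,
                                num_layers=24, num_heads=32, vocab_size=51200),
    num_micro_batches=128,
    parallel_mode="uniform",                    # assumed mode string
    parallel_args=UniformParallelArgs(          # assumed field names
        prefer_reduce_scatter=True, use_remat=True,
        dp=2, op=1, pp=1, force_batch_dim_mapping=True),
)
```

If a suite like this already exists for GPT (e.g. a uniform/2D variant that works with --shard-only), a pointer to it would be equally helpful.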