Prerequisite
OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMA2, Qwen, GLM, Claude, etc.) on 100+ datasets.
Type
I'm evaluating with the officially supported tasks/models/datasets.
Environment
These are my CPU and GPU; I used the following machine for the test, with the worker limit set to 32 (max_num_workers=32).
CPU Info: 255 AMD EPYC 7713 64-Core Processor
GPU Info: NVIDIA H800-SXM4-80GB x 8
Reproduces the problem - code/configuration sample
Official code (no modifications).
Reproduces the problem - command or script
I used a ~2B model (for example Qwen1.5-1.8B) to test 13 datasets; the model was loaded via HuggingFace, along the lines of the config sketched below.
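For reference, the model config follows the stock HuggingFace examples shipped with OpenCompass. This is a sketch rather than my exact config; the hub path, abbr, and kwargs are illustrative:

from opencompass.models import HuggingFaceCausalLM

models = [
    dict(
        type=HuggingFaceCausalLM,
        abbr='qwen1.5-1.8b-hf',             # illustrative abbreviation
        path='Qwen/Qwen1.5-1.8B',           # assumed HF hub path
        tokenizer_path='Qwen/Qwen1.5-1.8B',
        max_out_len=100,
        max_seq_len=2048,
        batch_size=8,
        run_cfg=dict(num_gpus=1),
    )
]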
I recorded the time taken for each stage, and found that the infer task took about 20 minutes and the evaluation task about 12 minutes.
The files used to compute the ppl and gen scores (the predictions dir) total only about 500 MB, so why does the evaluation take 12 minutes even with multiprocessing? Shouldn't scoring 500 MB of predictions finish in about a minute?
I read the code that runs the evaluation task (opencompass/runners/local.py, lines 61~210) and found that the most time-consuming part of the evaluation is the serialization and deserialization of the task config file (writing it to disk and loading it back). The code looks like this:
# opencompass/runners/local.py, lines 180~188
# Dump task config to file
mmengine.mkdir_or_exist('tmp/')
param_file = f'tmp/{os.getpid()}_{index}_params.py'
try:
    task.cfg.dump(param_file)  # ************** the most time-consuming
    tmpl = get_command_template(gpu_ids)
    get_cmd = partial(task.get_command,
                      cfg_path=param_file,
                      template=tmpl)
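To confirm where the time goes, a minimal timing sketch (my own helper, not OpenCompass code) can wrap the same dump/load round trip; Config.fromfile stands in for what the spawned subprocess does when it re-reads the param file:

import os
import time

import mmengine
from mmengine.config import Config

def time_cfg_roundtrip(cfg, index=0):
    # Same pattern as local.py: dump the task config, then re-load it.
    mmengine.mkdir_or_exist('tmp/')
    param_file = f'tmp/{os.getpid()}_{index}_params.py'

    t0 = time.perf_counter()
    cfg.dump(param_file)         # serialization (disk landing)
    t1 = time.perf_counter()
    Config.fromfile(param_file)  # deserialization (loading)
    t2 = time.perf_counter()

    print(f'dump: {t1 - t0:.2f}s, load: {t2 - t1:.2f}s')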
When launching tasks, I split evaluation tasks from inference tasks. The inference path is unchanged, while the evaluation path no longer goes through the per-task dump-and-subprocess step:
Step 1: Modify the submit function (opencompass/runners/local.py, line 133)
def submit(task, index):
    # ...
    if num_gpus > 0:
        tqdm.write(f'launch {task.name} on GPU ' +
                   ','.join(map(str, gpu_ids)))
    else:
        tqdm.write(f'launch {task.name} on CPU ')

    # >>> modified part: dispatch eval tasks and infer tasks separately
    if "OpenICLEvalTask" in self.task_cfg['type']:
        res = self._launch_eval(task, gpu_ids, index)
    else:
        res = self._launch_infer(task, gpu_ids, index)  # old self._launch

    pbar.update()
    with lock:
        gpus[gpu_ids] += 1
    return res
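Step 2 is the _launch_eval helper, which runs the eval task in the current process instead of dumping its config and spawning a subprocess. A rough sketch of the idea (the task construction and return value follow my reading of local.py and may differ from the real internals):

def _launch_eval(self, task, gpu_ids, index):
    # Skip the cfg.dump()/param-file/subprocess round trip entirely:
    # the config object already lives in this process, so run the
    # eval task directly.
    from opencompass.tasks import OpenICLEvalTask
    inner_task = OpenICLEvalTask(task.cfg)
    inner_task.run()
    return task.name, 0  # mimic the (task name, returncode) of _launch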
With the revised code, evaluating the same 13 datasets took only about 40 seconds (down from ~12 minutes).
I hope the maintainers can fix this performance issue. With my current modification, logs are no longer written to the per-dataset evaluation log files, so I have not created a PR.
Reproduces the problem - error message
No error message; this issue reports an evaluation-time improvement.
Other information
No response