OpenDriveLab / OpenLane-V2

[NeurIPS 2023 Track Datasets and Benchmarks] OpenLane-V2: The First Perception and Reasoning Benchmark for Road Driving
https://proceedings.neurips.cc/paper_files/paper/2023/hash/3c0a4c8c236144f1b99b7e1531debe9c-Abstract-Datasets_and_Benchmarks.html
Apache License 2.0

FileNotFoundError: OpenLaneV2SubsetADataset: [Errno 2] No such file or directory: './data/data_dict_subset_A_train.pkl' #93

Closed Benson722 closed 8 months ago

Benson722 commented 8 months ago

Thank you for your great work! When I run the code, the following error occurs:

Command:

./tools/dist_train.sh projects/configs/topomlp_setA_r50_wo_yolov8.py 8 --work-dir=./work_dirs/topomlp_setA_r50_wo_yolov8

Error:

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "./tools/train.py", line 270, in <module>
    main()
  File "./tools/train.py", line 233, in main
    datasets = [build_dataset(cfg.data.train)]
  File "/home/zhangyiqing/mmdetection3d/mmdet3d/datasets/builder.py", line 46, in build_dataset
    dataset = build_from_cfg(cfg, MMDET_DATASETS, default_args)
  File "/home/zhangyiqing/miniconda3/envs/openlanev2/lib/python3.8/site-packages/mmcv/utils/registry.py", line 69, in build_from_cfg
    raise type(e)(f'{obj_cls.__name__}: {e}')
FileNotFoundError: OpenLaneV2SubsetADataset: [Errno 2] No such file or directory: './data/data_dict_subset_A_train.pkl'
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 3 (pid: 2472564) of binary: /home/zhangyiqing/miniconda3/envs/openlanev2/bin/python
/home/zhangyiqing/miniconda3/envs/openlanev2/lib/python3.8/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py:367: UserWarning:

**********************************************************************
               CHILD PROCESS FAILED WITH NO ERROR_FILE
**********************************************************************
CHILD PROCESS FAILED WITH NO ERROR_FILE
Child process 2472564 (local_rank 3) FAILED (exitcode 1)
Error msg: Process failed with exitcode 1
Without writing an error file to <N/A>.
While this DOES NOT affect the correctness of your application,
no trace information about the error will be available for inspection.
Consider decorating your top level entrypoint function with
torch.distributed.elastic.multiprocessing.errors.record. Example:

  from torch.distributed.elastic.multiprocessing.errors import record

  @record
  def trainer_main(args):
     # do train
**********************************************************************
  warnings.warn(_no_error_file_warning_msg(rank, failure))
Traceback (most recent call last):
  File "/home/zhangyiqing/miniconda3/envs/openlanev2/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/zhangyiqing/miniconda3/envs/openlanev2/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/zhangyiqing/miniconda3/envs/openlanev2/lib/python3.8/site-packages/torch/distributed/run.py", line 702, in <module>
    main()
  File "/home/zhangyiqing/miniconda3/envs/openlanev2/lib/python3.8/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 361, in wrapper
    return f(*args, **kwargs)
  File "/home/zhangyiqing/miniconda3/envs/openlanev2/lib/python3.8/site-packages/torch/distributed/run.py", line 698, in main
    run(args)
  File "/home/zhangyiqing/miniconda3/envs/openlanev2/lib/python3.8/site-packages/torch/distributed/run.py", line 689, in run
    elastic_launch(
  File "/home/zhangyiqing/miniconda3/envs/openlanev2/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 116, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/home/zhangyiqing/miniconda3/envs/openlanev2/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 244, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
***************************************
        ./tools/train.py FAILED

It seems that a file named "data_dict_subset_A_train.pkl" is missing. I look forward to your reply. Thank you!

sephyli commented 8 months ago

It looks like you are using the TopoMLP code. Please run the preprocessing step in your OpenLane-V2 repo to generate data_dict_subset_A_train.pkl, and then link that data folder into the TopoMLP repo so that it is found at ./data.
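
The linking step might look roughly like the sketch below. It assumes the preprocessing has already written the .pkl into the OpenLane-V2 data folder; it does not run the preprocessing itself. The helper names `link_data_dir` and `missing_files` and the location `../OpenLane-V2/data` are hypothetical and should be adjusted to the local checkout. Only `./data` and `data_dict_subset_A_train.pkl` come from the error message above.

```python
import os
import tempfile
from pathlib import Path


def missing_files(data_dir, names):
    """Return the entries of `names` that are not regular files in `data_dir`."""
    return [n for n in names if not (Path(data_dir) / n).is_file()]


def link_data_dir(source_dir, target_link):
    """Create a symlink `target_link` pointing at `source_dir`.

    Both arguments are illustrative: `source_dir` would be the data folder
    of the OpenLane-V2 checkout, `target_link` the ./data path that the
    training script reads from.
    """
    source = Path(source_dir).resolve()
    target = Path(target_link)
    if not source.is_dir():
        raise FileNotFoundError(f"source data folder not found: {source}")
    if target.is_symlink() or target.exists():
        raise FileExistsError(f"refusing to overwrite existing path: {target}")
    target.parent.mkdir(parents=True, exist_ok=True)
    os.symlink(source, target, target_is_directory=True)
    return target


if __name__ == "__main__":
    # Hypothetical layout: OpenLane-V2 checked out next to the TopoMLP repo.
    required = ["data_dict_subset_A_train.pkl"]
    source = "../OpenLane-V2/data"
    absent = missing_files(source, required)
    if absent:
        print("Run the OpenLane-V2 preprocessing first; missing:", absent)
    else:
        print("Linked:", link_data_dir(source, "./data"))
```

Checking for the file before linking keeps the failure close to its cause: if the .pkl is absent, the problem is the preprocessing step, not the link.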