microsoft / DeepSpeedExamples

Example models using DeepSpeed
Apache License 2.0

Question about loading the Dahoas dataset from a local path. #796

Open Zhutianyi7230 opened 8 months ago

Zhutianyi7230 commented 8 months ago

I have put the Dahoas/rm-static dataset and the facebook/opt-1.3b model under the path

DeepSpeedExamples/applications/DeepSpeed-Chat/training/step1_supervised_finetuning

When I run the command

bash training_scripts/opt/single_gpu/run_1.3b.sh

the local dataset fails to load:

[2023-11-01 14:53:49,928] [INFO] [real_accelerator.py:158:get_accelerator] Setting ds_accelerator to cuda (auto detect)
[2023-11-01 14:53:57,162] [WARNING] [runner.py:203:fetch_hostfile] Unable to find hostfile, will proceed with training with local resources only.
[2023-11-01 14:53:57,219] [INFO] [runner.py:570:main] cmd = /home/l00841998/anaconda3/envs/deepspeed/bin/python -u -m deepspeed.launcher.launch --world_info=eyJsb2NhbGhvc3QiOiBbMF19 --master_addr=127.0.0.1 --master_port=29500 --enable_each_rank_log=None main.py --model_name_or_path facebook/opt-1.3b --gradient_accumulation_steps 8 --lora_dim 128 --zero_stage 0 --enable_tensorboard --tensorboard_path ./output --deepspeed --output_dir ./output
[2023-11-01 14:53:58,330] [INFO] [real_accelerator.py:158:get_accelerator] Setting ds_accelerator to cuda (auto detect)
[2023-11-01 14:53:59,506] [INFO] [launch.py:145:main] WORLD INFO DICT: {'localhost': [0]}
[2023-11-01 14:53:59,506] [INFO] [launch.py:151:main] nnodes=1, num_local_procs=1, node_rank=0
[2023-11-01 14:53:59,506] [INFO] [launch.py:162:main] global_rank_mapping=defaultdict(<class 'list'>, {'localhost': [0]})
[2023-11-01 14:53:59,506] [INFO] [launch.py:163:main] dist_world_size=1
[2023-11-01 14:53:59,506] [INFO] [launch.py:165:main] Setting CUDA_VISIBLE_DEVICES=0
[2023-11-01 14:54:01,378] [INFO] [real_accelerator.py:158:get_accelerator] Setting ds_accelerator to cuda (auto detect)
/home/l00841998/anaconda3/envs/deepspeed/lib/python3.8/site-packages/transformers/deepspeed.py:23: FutureWarning: transformers.deepspeed module is deprecated and will be removed in a future version. Please import deepspeed modules directly from transformers.integrations
  warnings.warn(
[2023-11-01 14:54:03,464] [INFO] [comm.py:637:init_distributed] cdb=None
[2023-11-01 14:54:03,465] [INFO] [comm.py:668:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl
Traceback (most recent call last):
  File "main.py", line 398, in <module>
    main()
  File "main.py", line 268, in main
    train_dataset, eval_dataset = create_prompt_dataset(
  File "/home/l00841998/DeepSpeedExamples/applications/DeepSpeed-Chat/training/utils/data/data_utils.py", line 291, in create_prompt_dataset
    train_dataset, eval_dataset = create_dataset(
  File "/home/l00841998/DeepSpeedExamples/applications/DeepSpeed-Chat/training/utils/data/data_utils.py", line 233, in create_dataset
    raw_dataset = get_raw_dataset(dataset_name, output_path, seed, local_rank)
  File "/home/l00841998/DeepSpeedExamples/applications/DeepSpeed-Chat/training/utils/data/data_utils.py", line 24, in get_raw_dataset
    return raw_datasets.DahoasRmstaticDataset(output_path, seed,
  File "/home/l00841998/DeepSpeedExamples/applications/DeepSpeed-Chat/training/utils/data/raw_datasets.py", line 54, in __init__
    super().__init__(output_path, seed, local_rank, dataset_name)
  File "/home/l00841998/DeepSpeedExamples/applications/DeepSpeed-Chat/training/utils/data/raw_datasets.py", line 20, in __init__
    self.raw_datasets = load_from_disk(dataset_name)
  File "/home/l00841998/anaconda3/envs/deepspeed/lib/python3.8/site-packages/datasets/load.py", line 2252, in load_from_disk
    raise FileNotFoundError(
FileNotFoundError: Directory Dahoas/rm-static is neither a `Dataset` directory nor a `DatasetDict` directory.

linchen111 commented 7 months ago

+1

CerrieJ commented 7 months ago

+10086

EeyoreLee commented 6 months ago

@CerrieJ @linchen111 @Zhutianyi7230 - hi all

  File "/home/l00841998/DeepSpeedExamples/applications/DeepSpeed-Chat/training/utils/data/raw_datasets.py", line 20, in __init__
    self.raw_datasets = load_from_disk(dataset_name)

Changing load_from_disk to load_dataset fixes it. load_from_disk is only intended for directories created with Dataset.save_to_disk or DatasetDict.save_to_disk, that is, for reloading a dataset that was downloaded and then saved through code.
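
To illustrate the difference, here is a minimal sketch of a loader that checks the directory layout before choosing between the two functions. The helper names (`is_saved_to_disk_dir`, `load_local_dataset`) are hypothetical, and the marker file names are my understanding of what save_to_disk writes, so treat them as assumptions rather than a documented contract.

```python
import os

# Marker files that save_to_disk is assumed to write (my understanding of the
# `datasets` on-disk layout; verify against your installed version).
DATASET_MARKERS = ("dataset_info.json", "state.json")
DATASET_DICT_MARKER = "dataset_dict.json"


def is_saved_to_disk_dir(path):
    """Return True if `path` looks like the output of save_to_disk."""
    is_dataset = all(
        os.path.isfile(os.path.join(path, marker)) for marker in DATASET_MARKERS
    )
    is_dataset_dict = os.path.isfile(os.path.join(path, DATASET_DICT_MARKER))
    return is_dataset or is_dataset_dict


def load_local_dataset(path):
    """Hypothetical helper: pick the loader that matches the directory layout."""
    # Imported lazily so the layout check above works without `datasets` installed.
    from datasets import load_dataset, load_from_disk

    if is_saved_to_disk_dir(path):
        # Directory was produced by Dataset.save_to_disk / DatasetDict.save_to_disk.
        return load_from_disk(path)
    # Otherwise treat it as a directory of raw data files, e.g. a cloned Hub repo.
    return load_dataset(path)
```

If `is_saved_to_disk_dir("Dahoas/rm-static")` returns False for your local copy, that would be consistent with the FileNotFoundError in the traceback above.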

MrRace commented 6 months ago

That does not work for me.

EeyoreLee commented 6 months ago

@MrRace - the dataset_name argument should be a path to the local data directory, for example /media/datasets/full-hh-rlhf/data.
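
If pointing dataset_name at the data directory still does not load, one alternative is to pass the parquet files to load_dataset explicitly. The sketch below is only an illustration: `find_parquet_splits` and `load_parquet_dir` are hypothetical helpers, and it assumes Hub-style file names such as train-00000-of-00001-<hash>.parquet, where the split name is the text before the first dash.

```python
import os
from collections import defaultdict


def find_parquet_splits(dataset_dir):
    """Map split name -> sorted list of parquet files found under `dataset_dir`.

    Assumption: the split name is the part of the file name before the first dash.
    """
    splits = defaultdict(list)
    for root, _dirs, files in os.walk(dataset_dir):
        for name in files:
            if name.endswith(".parquet"):
                stem = os.path.splitext(name)[0]
                split = stem.split("-", 1)[0]
                splits[split].append(os.path.join(root, name))
    return {split: sorted(paths) for split, paths in splits.items()}


def load_parquet_dir(dataset_dir):
    """Hypothetical helper: load every parquet split found under `dataset_dir`."""
    # Imported lazily so the file discovery above works without `datasets` installed.
    from datasets import load_dataset

    data_files = find_parquet_splits(dataset_dir)
    if not data_files:
        raise FileNotFoundError(f"No .parquet files found under {dataset_dir}")
    # The generic "parquet" builder accepts a split -> files mapping.
    return load_dataset("parquet", data_files=data_files)
```

Printing the result of `find_parquet_splits` on your dataset directory is a quick way to confirm which files would be picked up before any loading is attempted.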

MrRace commented 6 months ago

Thanks a lot!

ouleiwa commented 2 months ago

Changing load_from_disk to load_dataset works for me. Thanks!

chaoshunh commented 4 days ago

Can you tell me which directory you put Dahoas/rm-static in? Did you download the files directly, or save them with save_to_disk('directory')?