Open Zhutianyi7230 opened 1 year ago
+1
+10086
@CerrieJ @linchen111 @Zhutianyi7230 - hi all
File "/home/l00841998/DeepSpeedExamples/applications/DeepSpeed-Chat/training/utils/data/raw_datasets.py", line 20, in __init__
self.raw_datasets = load_from_disk(dataset_name)
Changing load_from_disk to load_dataset can address it. load_from_disk is intended to be used on directories created with Dataset.save_to_disk or DatasetDict.save_to_disk, that is, to reload a dataset once it has been downloaded and saved through code.
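A minimal sketch of that distinction, in case it helps. The helper names `is_save_to_disk_dir` and `load_raw_dataset` are hypothetical and not part of DeepSpeed-Chat, and the marker file names (`state.json`, `dataset_info.json`, `dataset_dict.json`) are an assumption about what `save_to_disk` writes in current versions of `datasets`; check your installed version before relying on them.

```python
from pathlib import Path


def is_save_to_disk_dir(path):
    """Return True if `path` looks like the output of save_to_disk."""
    p = Path(path)
    # Assumption: Dataset.save_to_disk writes state.json and dataset_info.json.
    is_dataset = (p / "state.json").is_file() and (p / "dataset_info.json").is_file()
    # Assumption: DatasetDict.save_to_disk writes dataset_dict.json
    # next to one sub-directory per split.
    is_dataset_dict = (p / "dataset_dict.json").is_file()
    return is_dataset or is_dataset_dict


def load_raw_dataset(dataset_name):
    """Hypothetical helper: pick the loader that matches `dataset_name`."""
    # Imported lazily so the directory check above runs without `datasets` installed.
    from datasets import load_dataset, load_from_disk

    if is_save_to_disk_dir(dataset_name):
        # Directory was produced by save_to_disk, so load_from_disk can read it.
        return load_from_disk(dataset_name)
    # Otherwise treat it as a Hub id or a local folder of raw data files.
    return load_dataset(dataset_name)
```

With something like this in place of the bare load_from_disk call, the same dataset_name argument can point either at a save_to_disk directory or at a downloaded copy of the dataset repository.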
That does not work for me.
@MrRace - the arg dataset_name should be like /media/datasets/full-hh-rlhf/data.
Thanks a lot!
it works for me. thanks!
Can you let me know where (in which directory) you put Dahoas/rm-static? Did you download it directly, or did you use save_to_disk('directory')?
Use:
from datasets import load_dataset
ds = load_dataset("Hello-SimpleAI/HC3-Chinese")
ds.save_to_disk('Hello-SimpleAI/HC3-Chinese')
Then use the saved directory; it works for me.
I have put the Dahoas/rm-static dataset as well as the model facebook/opt-1.3b under the path DeepSpeedExamples/applications/DeepSpeed-Chat/training/step1_supervised_finetuning. When running the command
bash training_scripts/opt/single_gpu/run_1.3b.sh
it seems there is some trouble loading the local dataset: